blob: f7c3f0cf3d6c11c742e64efa1be1290e2df595f6 [file] [log] [blame]
Welcome to Apache Crunch!
=========================
Apache Crunch is a Java library for writing, testing, and running Hadoop
MapReduce pipelines, based on Google's FlumeJava. Its goal is to make
pipelines that are composed of many user-defined functions simple to write,
easy to test, and efficient to run.
For more information please see the website:
http://crunch.apache.org/
Building the Source Code
------------------------
We recommend Maven 3 and JDK 6 for building Crunch. To build the project
run the following Maven command:
mvn package
To run the integration test suite and to install the created JARs in your
local Maven cache:
mvn install
Crunch has experimental support for Hadoop 2 through the "hadoop-2" build
profile (add -Phadoop-2 to enable it). If you want to use HBase support on
Hadoop 2, please note that you have to build HBase 0.94.3 from source using
the following command:
mvn clean install -Dhadoop.profile=2.0