| Build instructions for Tez |
| |
| ---------------------------------------------------------------------------------- |
| Requirements: |
| |
| * JDK 1.6+ |
| * Maven 3.0 or later |
| * Findbugs 2.0.2 or later (if running findbugs) |
| * ProtocolBuffer 2.5.0 |
| * Internet connection for first build (to fetch all dependencies) |
| |
| ---------------------------------------------------------------------------------- |
| Maven main modules: |
| |
| tez................................(Main Tez project) |
| - tez-api .....................(Tez api) |
| - tez-common ..................(Tez common) |
| - tez-runtime-internals .......(Tez runtime internals) |
| - tez-runtime-library .........(Tez runtime library) |
| - tez-mapreduce ...............(Tez mapreduce) |
| - tez-dag .....................(Tez dag) |
| - tez-mapreduce-examples ......(Tez mapreduce examples) |
| - tez-tests ...................(Tez tests) |
| - tez-dist ....................(Tez dist) |
| |
| ---------------------------------------------------------------------------------- |
| Maven build goals: |
| |
| * Clean : mvn clean |
| * Compile : mvn compile |
| * Run tests : mvn test |
| * Create JAR : mvn package |
| * Run findbugs : mvn compile findbugs:findbugs |
| * Run checkstyle : mvn compile checkstyle:checkstyle |
| * Install JAR in M2 cache : mvn install |
| * Deploy JAR to Maven repo : mvn deploy |
| * Run clover : mvn test -Pclover [-Dclover.license=${user.home}/clover.license] |
| * Run Rat : mvn apache-rat:check |
| * Build javadocs : mvn javadoc:javadoc |
| * Build distribution : mvn package [-Dtar] [-Dhadoop.version=2.2.0] |
| * Visualize state machines : mvn compile -Pvisualize -DskipTests=true |
| |
| Build options: |
| |
| * Use -Dtar to create a TAR with the distribution (the tar.gz is created under tez-dist/target) |
| * Use -Dclover.license to specify the path to the clover license file |
| * Use -Dhadoop.version to specify the version of hadoop to build tez against |
| * Use -Dprotoc.path to specify the path to protoc |
| |
| Test options: |
| |
| * Use -DskipTests to skip tests when running the following Maven goals: |
| 'package', 'install', 'deploy' or 'verify' |
| * Use -Dtest=<TESTCLASSNAME>,<TESTCLASSNAME#METHODNAME>,.... to run only the listed tests |
| * Use -Dtest.exclude=<TESTCLASSNAME> to exclude a test class |
| * Use -Dtest.exclude.pattern=**/<TESTCLASSNAME1>.java,**/<TESTCLASSNAME2>.java to exclude by pattern |
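| |
| As an illustration of the -Dtest forms above, the following sketch only prints |
| the Maven command for one class or one method; it does not run Maven. |
| TestExample, testSomething and the tez_test_cmd helper are hypothetical names. |

```shell
#!/bin/sh
# Print (not run) the Maven command that limits a test run to one class or
# one method. TestExample and testSomething are placeholder names.
tez_test_cmd() {
  class="$1"
  method="${2:-}"
  if [ -n "$method" ]; then
    # CLASS#METHOD selects a single test method
    echo "mvn test -Dtest=${class}#${method}"
  else
    # CLASS alone selects every test in that class
    echo "mvn test -Dtest=${class}"
  fi
}

tez_test_cmd TestExample
tez_test_cmd TestExample testSomething
```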
| |
| ---------------------------------------------------------------------------------- |
| Building against a specific version of hadoop: |
| |
| Tez runs on top of Apache Hadoop YARN and requires hadoop version 2.2.0 or higher. |
| For example, to build tez against hadoop 3.0.0-SNAPSHOT: |
| |
| $ mvn package -Dtar -Dhadoop.version=3.0.0-SNAPSHOT |
| |
| To skip tests and javadocs: |
| |
| $ mvn package -Dtar -Dhadoop.version=3.0.0-SNAPSHOT -DskipTests -Dmaven.javadoc.skip=true |
| |
| ---------------------------------------------------------------------------------- |
| Protocol Buffer compiler: |
| |
| The version of Protocol Buffer compiler, protoc, must be 2.5.0 and match the |
| version of the protobuf JAR. |
| |
| If you have multiple versions of protoc on your system, you can set the |
| PROTOC_PATH environment variable in your build shell to point to the one you |
| want to use for the Tez build. If you don't define this environment variable, |
| protoc is looked up in the PATH. |
| |
| You can also specify the path to protoc while building using -Dprotoc.path: |
| |
| $ mvn package -DskipTests -Dtar -Dprotoc.path=/usr/local/bin/protoc |
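| |
| Before a build, the protoc version can be checked with a small script such as |
| the sketch below. It assumes that PROTOC_PATH, when set, names the protoc |
| binary itself and that protoc 2.5.0 reports itself as "libprotoc 2.5.0"; |
| the is_protoc_2_5_0 helper is a hypothetical name, not part of Tez. |

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given `protoc --version` output
# reports the 2.5.0 release that the Tez build expects.
is_protoc_2_5_0() {
  # Assumption: protoc 2.5.0 prints exactly "libprotoc 2.5.0"
  [ "$1" = "libprotoc 2.5.0" ]
}

# Prefer PROTOC_PATH when set, otherwise fall back to protoc on the PATH.
protoc_bin="${PROTOC_PATH:-protoc}"

if command -v "$protoc_bin" >/dev/null 2>&1; then
  found_version="$("$protoc_bin" --version)"
  if is_protoc_2_5_0 "$found_version"; then
    echo "protoc OK: $protoc_bin"
  else
    echo "protoc version mismatch: $found_version" >&2
  fi
else
  echo "protoc not found: $protoc_bin" >&2
fi
```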
| |
| |
| ---------------------------------------------------------------------------------- |
| Building the docs: |
| |
| The following commands build a local copy of the Apache Tez website under docs: |
| $ cd docs; mvn site |
| |
| ---------------------------------------------------------------------------------- |
| Building components separately: |
| |
| If you are building a submodule directory, all the Tez dependencies this |
| submodule has will be resolved like any other 3rd party dependency: that is, |
| from the Maven cache or from a Maven repository (if not available in the cache |
| or if the SNAPSHOT has 'timed out'). |
| An alternative is to run 'mvn install -DskipTests' from the Tez source top |
| level once, and then work from the submodule. Keep in mind that SNAPSHOTs |
| time out after a while; using the Maven '-nsu' option will stop Maven from |
| trying to update SNAPSHOTs from external repos. |
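| |
| The install-once-then-work-in-a-submodule flow described above could look like |
| the sketch below. It is a dry run by default and only prints the commands; |
| the DRY_RUN variable and the run helper are hypothetical, and tez-dag is |
| used purely as an example submodule. |

```shell
#!/bin/sh
# Sketch of the "install once, then work in a submodule" flow.
# DRY_RUN defaults to 1 so the commands are only printed; set DRY_RUN=0 to
# execute them from the top level of a Tez source tree.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Install every Tez module into the local Maven cache once.
run mvn install -DskipTests

# 2. Build a single submodule (tez-dag as an example) and pass -nsu so
#    Maven does not try to refresh SNAPSHOTs from external repos.
run mvn -nsu -f tez-dag/pom.xml test
```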
| |
| ---------------------------------------------------------------------------------- |
| Visualize the State Machines used in Tez internals: |
| |
| Use -Pvisualize to generate a graphviz file named Tez.gv, which can then be |
| converted into a diagram that represents the state transitions of the state |
| machines for the classes provided. |
| |
| Optional parameters: |
| * -Dtez.dag.state.classes=<comma-separated list of classes> |
| - By default, all 4 state machines (DAG, Vertex, Task and TaskAttempt) are generated. |
| * -Dtez.graphviz.title |
| - Title for the graph (default is Tez) |
| * -Dtez.graphviz.output.file |
| - Output file to be generated with the state machines (default is Tez.gv) |
| |
| For example, to generate the state machine graphviz file for DAGImpl, run: |
| |
| $ mvn compile -Pvisualize -Dtez.dag.state.classes=org.apache.tez.dag.app.dag.impl.DAGImpl -DskipTests=true |
| |
| To generate the diagram, you can use a Graphviz application or something like: |
| |
| $ dot -Tpng -o Tez.png Tez.gv |
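| |
| The optional parameters can be combined. The sketch below only prints the |
| two commands for a single class, using a custom title and output file; the |
| DAGImpl.gv and DAGImpl.png file names are arbitrary choices for this example, |
| and where Maven writes a relative output file is not covered here. |

```shell
#!/bin/sh
# Print the two commands that produce a diagram for one state machine class.
# The title and file names are arbitrary choices for this example.
class="org.apache.tez.dag.app.dag.impl.DAGImpl"
gv_file="DAGImpl.gv"
png_file="${gv_file%.gv}.png"   # strip the .gv suffix, then append .png

echo "mvn compile -Pvisualize -DskipTests=true" \
     "-Dtez.dag.state.classes=${class}" \
     "-Dtez.graphviz.title=DAGImpl" \
     "-Dtez.graphviz.output.file=${gv_file}"
echo "dot -Tpng -o ${png_file} ${gv_file}"
```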