INSTALL.txt - tez - Git at Google

 How to run MR on TEZ
 =======================

 Tez provides an ApplicationMaster that can run MR or MRR jobs. There is a
 translation layer implemented ( may have bugs so please file JIRAs if you
 come across any issues ) that allows a user to run an MR job against the TEZ
 DAG ApplicationMaster.

 NOTE: Due to some quirks, all jobs will show an IOException and a failure at
 the end. Please look at the YARN ResourceManager UI to check the correct result
 of the job.

 Install/Deploy Instructions
 ===========================

 1) Deploy Apache Hadoop using the 3.0.0-SNAPSHOT from trunk.
    - You can also use the 2.1.0 branch. For this, the only change will be to build
      tez with the -Dhadoop.version set to the correct version matching the hadoop
      branch being used.
 2) Copy the tez jars and their dependencies into HDFS.
 3) Configure tez-site.xml to set tez.lib.uris to point to the paths in HDFS containing
    the jars. Please note that the paths are not searched recursively so for <basedir>
    and <basedir>/lib/, you will need to configure the 2 paths as a comma-separated list.
 3) Modify mapred-site.xml to change "mapreduce.framework.name" property from its
    default value of "yarn" to "yarn-tez"
 4) set HADOOP_CLASSPATH to have the following paths in it:
    - TEZ_CONF_DIR - location of tez-site.xml
    - TEZ_JARS and TEZ_JARS/libs - location of the tez jars and dependencies.
 5) Submit a MR job as you normally would using something like:

 $HADOOP_PREFIX/bin/hadoop jar hadoop-mapreduce-client-jobclient-3.0.0-SNAPSHOT-tests.jar sleep -mt 1 -rt 1 -m 1 -r 1

 This will use the TEZ DAG ApplicationMaster to run the MR job. This can be
 verified by looking at the AM's logs from the YARN ResourceManager UI.

 6) There is a basic example of using an MRR job in the tez-mapreduce-examples.jar. Refer to OrderedWordCount.java
 in the source code. To run this example:

 $HADOOP_PREFIX/bin/hadoop jar tez-mapreduce-examples.jar orderedwordcount <input> <output>

 This will use the TEZ DAG ApplicationMaster to run the ordered word count job. This job is similar
 to the word count example except that it also orders all words based on the frequency of
 occurrence.
	How to run MR on TEZ
	=======================

	Tez provides an ApplicationMaster that can run MR or MRR jobs. There is a
	translation layer implemented ( may have bugs so please file JIRAs if you
	come across any issues ) that allows a user to run an MR job against the TEZ
	DAG ApplicationMaster.

	NOTE: Due to some quirks, all jobs will show an IOException and a failure at
	the end. Please look at the YARN ResourceManager UI to check the correct result
	of the job.

	Install/Deploy Instructions
	===========================

	1) Deploy Apache Hadoop using the 3.0.0-SNAPSHOT from trunk.
	- You can also use the 2.1.0 branch. For this, the only change will be to build
	tez with the -Dhadoop.version set to the correct version matching the hadoop
	branch being used.
	2) Copy the tez jars and their dependencies into HDFS.
	3) Configure tez-site.xml to set tez.lib.uris to point to the paths in HDFS containing
	the jars. Please note that the paths are not searched recursively so for <basedir>
	and <basedir>/lib/, you will need to configure the 2 paths as a comma-separated list.
	3) Modify mapred-site.xml to change "mapreduce.framework.name" property from its
	default value of "yarn" to "yarn-tez"
	4) set HADOOP_CLASSPATH to have the following paths in it:
	- TEZ_CONF_DIR - location of tez-site.xml
	- TEZ_JARS and TEZ_JARS/libs - location of the tez jars and dependencies.
	5) Submit a MR job as you normally would using something like:

	$HADOOP_PREFIX/bin/hadoop jar hadoop-mapreduce-client-jobclient-3.0.0-SNAPSHOT-tests.jar sleep -mt 1 -rt 1 -m 1 -r 1

	This will use the TEZ DAG ApplicationMaster to run the MR job. This can be
	verified by looking at the AM's logs from the YARN ResourceManager UI.

	6) There is a basic example of using an MRR job in the tez-mapreduce-examples.jar. Refer to OrderedWordCount.java
	in the source code. To run this example:

	$HADOOP_PREFIX/bin/hadoop jar tez-mapreduce-examples.jar orderedwordcount <input> <output>

	This will use the TEZ DAG ApplicationMaster to run the ordered word count job. This job is similar
	to the word count example except that it also orders all words based on the frequency of
	occurrence.