| To compile Hadoop Mapreduce next following, do the following: |
| |
| Step 1) Install dependencies for yarn |
| |
| See http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-porject/hadoop-yarn/README |
| Make sure protbuf library is in your library path or set: export LD_LIBRARY_PATH=/usr/local/lib |
| |
| Step 2) Checkout |
| |
| svn checkout http://svn.apache.org/repos/asf/hadoop/common/trunk |
| |
| Step 3) Build |
| |
| Go to common directory - choose your regular common build command. For example: |
| |
| export MAVEN_OPTS=-Xmx512m |
| mvn clean package -Pdist -Dtar -DskipTests -Pnative |
| |
| You can omit -Pnative it you don't want to build native packages. |
| |
| Step 4) Untar the tarball from hadoop-dist/target/ into a clean and different |
| directory, say HADOOP_YARN_HOME. |
| |
| Step 5) |
| Start hdfs |
| |
| To run Hadoop Mapreduce next applications: |
| |
| Step 6) export the following variables to where you have things installed: |
| You probably want to export these in hadoop-env.sh and yarn-env.sh also. |
| |
| export HADOOP_MAPRED_HOME=<mapred loc> |
| export HADOOP_COMMON_HOME=<common loc> |
| export HADOOP_HDFS_HOME=<hdfs loc> |
| export HADOOP_YARN_HOME=directory where you untarred yarn |
| export HADOOP_CONF_DIR=<conf loc> |
| export YARN_CONF_DIR=$HADOOP_CONF_DIR |
| |
| Step 7) Setup config: for running mapreduce applications, which now are in user land, you need to setup nodemanager with the following configuration in your yarn-site.xml before you start the nodemanager. |
| <property> |
| <name>yarn.nodemanager.aux-services</name> |
| <value>mapreduce.shuffle</value> |
| </property> |
| |
| <property> |
| <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name> |
| <value>org.apache.hadoop.mapred.ShuffleHandler</value> |
| </property> |
| |
| Step 8) Modify mapred-site.xml to use yarn framework |
| <property> |
| <name> mapreduce.framework.name</name> |
| <value>yarn</value> |
| </property> |
| |
| Step 9) cd $HADOOP_YARN_HOME |
| |
| Step 10) sbin/yarn-daemon.sh start resourcemanager |
| |
| Step 11) sbin/yarn-daemon.sh start nodemanager |
| |
| Step 12) sbin/mr-jobhistory-daemon.sh start historyserver |
| |
| Step 13) You are all set, an example on how to run a mapreduce job is: |
| cd $HADOOP_MAPRED_HOME |
| ant examples -Dresolvers=internal |
| $HADOOP_COMMON_HOME/bin/hadoop jar $HADOOP_MAPRED_HOME/build/hadoop-mapreduce-examples-*.jar randomwriter -Dmapreduce.job.user.name=$USER -Dmapreduce.randomwriter.bytespermap=10000 -Ddfs.blocksize=536870912 -Ddfs.block.size=536870912 -libjars $HADOOP_YARN_HOME/modules/hadoop-mapreduce-client-jobclient-*.jar output |
| |
| The output on the command line should be almost similar to what you see in the JT/TT setup (Hadoop 0.20/0.21) |
| |