We have ported Hadoop 0.20.205.0 to run on Mesos. Most of the port is implemented as a pluggable Hadoop scheduler that communicates with Mesos to obtain nodes on which to launch tasks. A few small additions to Hadoop's internal APIs are also required.

You can build the ported version of Hadoop by running make hadoop. The result is placed in the hadoop/hadoop-0.20.205.0 directory. If you would rather patch your own version of Hadoop to add Mesos support, use the .patch files located in <Mesos directory>/hadoop. These patches are likely to work on other Hadoop versions derived from 0.20. For example, GitHub user patelh has already created a Mesos-compatible version of CDH3u3, a release of Cloudera's Distribution.

To run Hadoop on Mesos, follow these steps:

Note that when you run on a cluster, Hadoop (and Mesos) should be located on the same path on all nodes.

If you wish to run multiple JobTrackers, the easiest way is to give each one a different port. Create a separate Hadoop conf directory for each JobTracker and pass the --config flag to bin/hadoop to select which directory to use. To create one, copy Hadoop's existing conf directory to a new location and modify the copy.
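The copy-and-modify step can be scripted. The following is a minimal sketch, not part of the Mesos port itself: it assumes the JobTracker address is set through the mapred.job.tracker property in mapred-site.xml and the web UI address through mapred.job.tracker.http.address, as in stock Hadoop 0.20. The function names clone_conf and set_property, and the ports and paths in the usage comment, are hypothetical and chosen only for illustration.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def set_property(site_file, name, value):
    """Set (or add) one <property> entry in a Hadoop *-site.xml file.

    Note: ElementTree drops XML comments and the stylesheet processing
    instruction when it rewrites the file; Hadoop does not need them.
    """
    tree = ET.parse(site_file)
    root = tree.getroot()  # the <configuration> element
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # Property not present yet: append a new <property> element.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    tree.write(site_file, encoding="utf-8", xml_declaration=True)


def clone_conf(src_conf, dst_conf, host, rpc_port, http_port):
    """Copy a Hadoop conf directory and give the copy its own JobTracker ports.

    dst_conf must not exist yet (shutil.copytree refuses to overwrite).
    Returns the path of the modified mapred-site.xml as a string.
    """
    shutil.copytree(src_conf, dst_conf)
    site = str(Path(dst_conf) / "mapred-site.xml")
    # Assumption: these are the properties that control the JobTracker's
    # RPC address and web UI address in Hadoop 0.20.
    set_property(site, "mapred.job.tracker", f"{host}:{rpc_port}")
    set_property(site, "mapred.job.tracker.http.address", f"0.0.0.0:{http_port}")
    return site


# Hypothetical usage (paths and ports are examples only):
#   clone_conf("conf", "conf-jt2", "localhost", 9002, 50031)
# followed by:
#   bin/hadoop --config conf-jt2 jobtracker
```

The web UI port is changed along with the RPC port because two JobTrackers on the same host would otherwise both try to bind the default web UI port. If your JobTrackers run on different hosts, changing only mapred.job.tracker may be enough.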

Hadoop Versions with Mesos Support Available