| h1. Installation |
| |
| The Python cloud scripts enable you to run Hadoop on cloud providers. |
| A single command starts a working cluster, which makes the scripts well suited |
| to running temporary Hadoop clusters for a proof of concept or for a few |
| one-time jobs. Currently, the scripts support only Amazon EC2; other cloud |
| providers may be supported in the future. |
| |
| Amazon Machine Images (AMIs) and associated launch scripts are provided to |
| make it easy to run Hadoop on EC2. Note that the AMIs contain only base packages |
| (such as Java) and no particular version of Hadoop, because Hadoop is |
| installed at launch time. |
| |
| *In this section, command lines that start with {{#}} are executed on a cloud |
| instance, and command lines that start with {{%}} are executed on your |
| workstation.* |
| |
| h2. Installing the Python Cloud Scripts |
| |
| The following prerequisites apply to using the Python cloud scripts: |
| * Python 2.5 |
| * boto 1.8d |
| * simplejson 2.0.9 |
| |
| You can install boto and simplejson by using |
| [easy\_install|http://pypi.python.org/pypi/setuptools]: |
| {code} |
| % easy_install "simplejson==2.0.9" |
| % easy_install "boto==1.8d" |
| {code} |
| |
| *NOTE: If you have both Python 2.5 and 2.6 on your system (e.g. OS X Snow Leopard), then you should use {{easy_install-2.5}}.* |
| |
| Alternatively, you can install the python-boto and python-simplejson packages, |
| which are available as RPM and Debian packages. |
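
Whichever installation method you use, it can help to confirm that both libraries are importable before running the cloud scripts. The following is a minimal sketch, not part of the cloud scripts themselves: the helper name {{check_modules}} is hypothetical, and it assumes each library exposes its version as {{__version__}} or {{Version}}, reporting "unknown" otherwise.

```python
import sys

# Versions listed in the prerequisites above.
REQUIRED = {"boto": "1.8d", "simplejson": "2.0.9"}


def check_modules(required):
    """Return {name: (installed_version_or_None, wanted_version)}.

    Hypothetical helper for illustration; not part of the cloud scripts.
    """
    report = {}
    for name, wanted in sorted(required.items()):
        try:
            module = __import__(name)
        except ImportError:
            # The module is not installed for this interpreter.
            report[name] = (None, wanted)
            continue
        # Assumption: the version is exposed as __version__ or Version.
        found = (getattr(module, "__version__", None)
                 or getattr(module, "Version", "unknown"))
        report[name] = (found, wanted)
    return report


if __name__ == "__main__":
    for name, (found, wanted) in sorted(check_modules(REQUIRED).items()):
        if found is None:
            sys.stdout.write("%s: not installed (want %s)\n" % (name, wanted))
        else:
            sys.stdout.write("%s: found %s (want %s)\n" % (name, found, wanted))
```

Run the sketch with the same interpreter that will run the cloud scripts (for example, the Python 2.5 binary if several versions are installed), because each interpreter has its own set of installed packages.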
| |
| The Python cloud scripts are packaged in the source tarball. Unpack the tarball |
| on your system. The CDH cloud scripts are in _contrib/python/src/py_. |
| For convenience, you can add this directory to your {{PATH}}. |
| |