h1. Installation
The Python cloud scripts enable you to run Hadoop on cloud providers.
A single command starts a working cluster, which makes the scripts well suited
to running temporary Hadoop clusters for a proof of concept or for a few
one-time jobs. Currently, the scripts support Amazon EC2 only; other cloud
providers may be supported in the future.
Amazon Machine Images (AMIs) and associated launch scripts are provided to
make it easy to run Hadoop on EC2. Note that the AMIs contain only base packages
(such as Java) and not a particular version of Hadoop, because Hadoop is
installed at launch time.
*In this section, command lines that start with {{#}} are executed on a cloud
instance, and command lines that start with {{%}} are executed on your
workstation.*
h2. Installing the Python Cloud Scripts
The following prerequisites apply to using the Python cloud scripts:
* Python 2.5
* boto 1.8d
* simplejson 2.0.9
You can install boto and simplejson by using
[easy\_install|http://pypi.python.org/pypi/setuptools]:
{code}
% easy_install "simplejson==2.0.9"
% easy_install "boto==1.8d"
{code}
*NOTE: If you have both Python 2.5 and 2.6 on your system (e.g. OS X Snow Leopard), then you should use {{easy_install-2.5}}.*
Alternatively, you can install the python-boto and python-simplejson packages,
which are available as RPM and Debian packages.
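Whichever installation method you choose, it can be worth confirming that the interpreter you plan to use can import both libraries. The following is a minimal sketch, not part of the cloud scripts: the {{check_module}} helper and the {{PYTHON}} variable are hypothetical names introduced here, and the sketch assumes a Bourne-style shell on your workstation with the interpreter available as {{python}}.

```shell
# Hypothetical helper, not part of the cloud scripts.
# PYTHON is an assumed variable naming the interpreter to test;
# set it to python2.5 if several Python versions are installed.
PYTHON="${PYTHON:-python}"

check_module() {
  # Try to import the named module and report whether the import succeeded.
  if "$PYTHON" -c "import $1" 2>/dev/null; then
    echo "$1: ok"
  else
    echo "$1: missing"
  fi
}

check_module boto
check_module simplejson
```

A module reported as missing was not installed for that interpreter, which is the usual symptom of running {{easy_install}} against a different Python version than the one on your path.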
The Python cloud scripts are packaged in the source tarball. Unpack the tarball
on your system; the scripts are in _contrib/python/src/py_.
For convenience, you can add this directory to your {{PATH}}.
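As an illustration, the directory could be appended to the search path as follows. This is a sketch that assumes a Bourne-style shell (sh, bash, zsh); {{HADOOP_SRC}} is a hypothetical variable and its value is a placeholder for wherever you unpacked the tarball. Users of csh-style shells would use {{setenv}} instead.

```shell
# HADOOP_SRC is a hypothetical variable; point it at the directory
# where you unpacked the source tarball.
HADOOP_SRC="$HOME/hadoop-src"

# Append the cloud scripts directory to the search path for this session.
PATH="$PATH:$HADOOP_SRC/contrib/python/src/py"
export PATH
```

The change lasts only for the current shell session. To make it permanent, the same lines can be added to your shell startup file.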