These notes are for the Pig 0.17.0 release.
The highlight of this release is Pig on Spark.
System Requirements
1. Java 1.7.x or newer, preferably from Sun. Set JAVA_HOME to the root of your
Java installation.
2. Ant build tool, needed only to build from source.
3. Pig runs under Unix and Windows.
4. This release is compatible with Hadoop 2.5+ releases. Note that Hadoop 1.x
is no longer supported.
5. To use the Spark execution engine, Spark 1.6.x is required. (Spark 2 support
is likely to come in the next release.)
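The requirements above can be checked with a short script before installing. This is only a sketch: check_java_home and check_ant are hypothetical helper names, not part of the Pig distribution, and the script does not verify the Java, Hadoop, or Spark version numbers.

```shell
#!/bin/sh
# Sketch: sanity-check the prerequisites listed above.
# check_java_home and check_ant are hypothetical helpers, not shipped with Pig.

check_java_home() {
    # JAVA_HOME must be set and must contain an executable bin/java.
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "no executable found at $JAVA_HOME/bin/java"
        return 1
    fi
    echo "JAVA_HOME looks usable: $JAVA_HOME"
}

check_ant() {
    # Ant is needed only when building Pig from source.
    if command -v ant >/dev/null 2>&1; then
        echo "ant found"
    else
        echo "ant not found (only needed to build from source)"
    fi
}

# Report both results; a missing JAVA_HOME does not abort the script.
check_java_home || true
check_ant
```

If JAVA_HOME is reported as unset, export it before running bin/pig.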
Trying the Release
1. Download pig-0.17.0.tar.gz
2. Unpack the file: tar -xzvf pig-0.17.0.tar.gz
3. Move into the installation directory: cd pig-0.17.0
4. To run Pig without a Hadoop cluster, execute the command below. This will
take you into an interactive shell called grunt that allows you to navigate
the local file system and execute Pig commands against local files:
bin/pig -x local
5. To run on your Hadoop cluster, set the PIG_CLASSPATH environment variable
to point to the directory containing your hadoop-site.xml file and then run
pig. The commands below will take you into an interactive shell called grunt
that allows you to navigate Hadoop DFS and execute Pig commands against it:
export PIG_CLASSPATH=/hadoop/conf
bin/pig
6. To build your own version of pig.jar, run:
ant
7. To run the unit tests, run:
ant test
8. To build a jar file with the available user-defined functions, run the
commands below:
cd contrib/piggybank/java
ant
9. To build the tutorial:
cd tutorial
ant
10. To run the tutorial, follow the instructions in the Pig Tutorial (see
Relevant Documentation).
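As an alternative to typing commands into grunt, local mode (step 4) can also run a Pig Latin script file directly. The sketch below writes a small sample input and script and then runs them with bin/pig -x local. The directory name pig-local-demo, the file names visits.tsv and demo.pig, and the sample data are all made up for illustration, and the script assumes it is started from the pig-0.17.0 installation directory.

```shell
#!/bin/sh
# Sketch: run a tiny Pig Latin script in local mode, non-interactively.
# Directory, file names, and sample data are hypothetical.

WORK_DIR=${WORK_DIR:-./pig-local-demo}

write_demo_files() {
    mkdir -p "$1"
    # Tab-separated sample input: one name and one hit count per line.
    printf 'alice\t3\nbob\t1\ncarol\t5\n' > "$1/visits.tsv"
    # Pig Latin: load the file, keep rows with more than one hit, print them.
    cat > "$1/demo.pig" <<'EOF'
visits = LOAD 'visits.tsv' USING PigStorage('\t') AS (name:chararray, hits:int);
frequent = FILTER visits BY hits > 1;
DUMP frequent;
EOF
}

write_demo_files "$WORK_DIR"

# Assumption: the current directory is the pig-0.17.0 installation directory.
if [ -x bin/pig ]; then
    PIG_DIR=$(pwd)
    # Run from WORK_DIR so the relative path 'visits.tsv' resolves locally.
    (cd "$WORK_DIR" && "$PIG_DIR/bin/pig" -x local demo.pig)
else
    echo "bin/pig not found; wrote demo.pig and visits.tsv to $WORK_DIR only"
fi
```

The same demo.pig could be run on a cluster by dropping -x local, provided visits.tsv has first been copied to Hadoop DFS.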
Relevant Documentation
Pig Documentation:
Pig Wiki:
Pig Tutorial: