| These notes are for Pig 0.11.0 release. |
| |
| Highlights |
| ========== |
| |
| This release include several new features and improvements, some of which are |
| highlighted below. For a more complete list of changes see CHANGES.txt. |
| |
| - New RANK, CUBE and ROLLUP operators |
| - New DateType data type |
| - Support for Groovy UDFs |
| - Support for loading macros from jars |
| - Support for custom PigReducerEstimators |
| - Suoport for custom PigProgressNotificatonListeners |
| - Support for schema-based Tuples for reduced memory footprint |
| - Support for passing environment variables to streaming jobs |
| - Support for invoking HCatalog DDL commands from Pig |
| - Support for .pigbootup file for defaults |
| - Improved support for working with Maps in Pig scripts |
| - Grunt improvements: history and clear |
| - New cleanupOnSuccess method in StoreFunc interface |
| - UDF timing utilities |
| - UDF lifecycle improvements |
| - UDFs for DateType support |
| - Performance improvements to merge join |
| - Performance improvements to local mode |
| - Performance improvements to in memory aggregation |
| - Performance improvements to Spillable management |
| - Improvements to HBaseStorage and AvroStorage |
| - Penny has been removed |
| - 300+ bug fixes |
| |
| |
| System Requirements |
| =================== |
| |
| 1. Java 1.6.x or newer, preferably from Sun. Set JAVA_HOME to the root of your |
| Java installation |
| 2. Ant build tool: http://ant.apache.org - to build source only |
| 3. Cygwin: http://www.cygwin.com/ - to run under Windows |
| 4. This release is compatible with all Hadoop 0.20.X, 1.X, 0.23.X and 2.X releases |
| |
| Trying the Release |
| ================== |
| |
| 1. Download pig-0.11.0.tar.gz |
| 2. Unpack the file: tar -xzvf pig-0.11.0.tar.gz |
| 3. Move into the installation directory: cd pig-0.11.0 |
| 4. To run pig without Hadoop cluster, execute the command below. This will |
| take you into an interactive shell called grunt that allows you to navigate |
| the local file system and execute Pig commands against the local files |
| bin/pig -x local |
| 5. To run on your Hadoop cluster, you need to set PIG_CLASSPATH environment |
| variable to point to the directory with your hadoop-site.xml file and then run |
| pig. The commands below will take you into an interactive shell called grunt |
| that allows you to navigate Hadoop DFS and execute Pig commands against it |
| export PIG_CLASSPATH=/hadoop/conf |
| bin/pig |
| 6. To build your own version of pig.jar run |
| ant |
| 7. To run unit tests run |
| ant test |
| 8. To build jar file with available user defined functions run commands below. |
| cd contrib/piggybank/java |
| ant |
| 9. To build the tutorial: |
| cd tutorial |
| ant |
| 10. To run tutorial follow instructions in http://wiki.apache.org/pig/PigTutorial |
| |
| Relevant Documentation |
| ====================== |
| |
| Pig Language Manual(including Grunt commands): |
| http://wiki.apache.org/pig-data/attachments/FrontPage/attachments/plrm.htm |
| UDF Manual: http://wiki.apache.org/pig/UDFManual |
| Piggy Bank: http://wiki.apache.org/pig/PiggyBank |
| Pig Tutorial: http://wiki.apache.org/pig/PigTutorial |
| Pig Eclipse Plugin (PigPen): http://wiki.apache.org/pig/PigPen |