| Pig Change Log |
| |
| Trunk (unreleased changes) |
| |
| INCOMPATIBLE CHANGES |
| |
| IMPROVEMENTS |
| |
| OPTIMIZATIONS |
| |
| BUG FIXES |
| |
| Release 0.1.1 - Unreleased |
| |
| INCOMPATIBLE CHANGES |
| |
| NEW FEATURES |
| |
| IMPROVEMENTS |
| |
| PIG-253: integration with hadoop-18 |
| |
| BUG FIXES |
| |
| PIG-342: Fix DistinctDataBag to recalculate size after it has spilled. (bdimcheff via gates) |
| |
| Release 0.1.0 - - 2008-09-11 |
| |
| INCOMPATIBLE CHANGES |
| |
| PIG-123: requires escape of '\' in chars and string |
| |
| NEW FEATURES |
| |
| PIG-20 Added custom comparator functions for order by (phunt via gates) |
| PIG-94: Streaming implementation |
| PIG-58: parameter substitution |
| PIG-55: added custom splitter (groves via olgan) |
| PIG-59: Add a new ILLUSTRATE command (shubhamc via gates). |
| PIG-256: Added variable argument support for UDFs (pi_song) |
| |
| IMPROVEMENTS |
| |
| PIG-8 added binary comparator (olgan) |
| |
| PIG-11 Add capability to search for jar file to register. (antmagna via |
| olgan) |
| |
| PIG-7: Added use of combiner in some restricted cases. (gates) |
| |
| PIG-47: Added methods to DataMap to provide access to its content |
| |
| PIG-30: Rewrote DataBags to better handle decisions of when to spill to |
| disk and to spill more intelligently. (gates) |
| |
| PIG-12: Added time stamps to log4j messages (phunt via gates). |
| |
| PIG-44: Added adaptive decision of the number of records to hold in memory |
| before spilling (utkarsh) |
| |
| PIG-56: Made DataBag implement Iterable. (groves via gates) |
| |
| PIG-39: created more efficient version of read (spullara via olgan) |
| |
| PIG-32: ABstraction layer (olgan) |
| |
| PIG-83: Change everything except grunt and Main (PigServer on down) to use |
| common logging abstraction instead of log4j. By default in grunt, log4j |
| still used as logging layer. Also converted all System.out/err.println |
| statements to use logging instead. (francisoud via gates) |
| |
| PIG-13: adding version to the system (joa23 via olgan) |
| |
| PIG-113: Make explain output more understandable (pi_song via gates) |
| |
| PIG-120: Support map reduce in local mode. To do this user needs to |
| specify execution type as mapreduce and cluster name as local (joa23 via |
| gates). |
| |
| PIG-106: Change StringBuffer and String '+' to StringBuilder (francisoud |
| via gates). |
| |
| PIG-111: Reworked configuration to be setable via properties. |
| (Contributions from joa23, pi_song, and oae via gates). |
| |
| BUG FIXES |
| |
| PIG-24 Files that were incorrectly placed under test/reports have been |
| removed. ant clean now cleans test/reports. (milindb via gates) |
| |
| PIG-25 com.yahoo.pig dir left under pig/test by mistake. removed it (olgan@) |
| |
| PIG-23 Made pig work with java 1.5. (milindb via gates) |
| |
| PIG-17 integrated with Hadoop 0.15 (olgan@) |
| |
| PIG-33 Help was commented out - uncommented (olgan) |
| |
| PIG-31: second half of concurrent mode problem addressed (olgan) |
| |
| PIG-14: added heartbeat functionality (olgan) |
| |
| PIG-17: updated hadoop15.jar to match hadoop 0.15.1 release |
| |
| PIG-29: fixed bag factory to be properly initialized (utkarsh) |
| |
| PIG-43: fixed problem where using the combiner prevented a pig alias |
| from being evaluated more than once. (gates) |
| |
| PIG-45: Fixed pig.pl to not assume hodrc file is named the same as |
| cluster name (gates). |
| |
| PIG-7 (more): Fixed bug in PigCombiner where it was writing IndexedTuples |
| instead of Tuples, causing Reducer to crash in some cases. |
| |
| PIG-41: Added patterns to svn:ignore |
| |
| PIG-51: Fixed combiner in the presence of flattening |
| |
| PIG-61: Fixed MapreducePlanCompiler to use PigContext to load up the |
| comparator function instead of Class.forName. (gates) |
| |
| PIG-63: Fix for non-ascii UTF-8 data (breed@ and olgan@) |
| |
| PIG-77: Added eclipse specific files to svn:ignore |
| |
| PIG-57: Fixed NPE in PigContext.fixUpDomain (francisoud via gates) |
| |
| PIG-69: NPE in PigContext.setJobtrackerLocation (francisoud via gates) |
| |
| PIG-78: src/org/apache/pig/builtin/PigStorage.java doesn't compile (arun |
| via olgan) |
| |
| PIG-87: Fix pig.pl to find java via JAVA_HOME instead of hardcoded default |
| path. Also fix it to not die if pigclient.conf is missing. (craigm via |
| gates). |
| |
| PIG-89: Fix DefaultDataBag, DistinctDataBag, SortedDataBag to close spill |
| files when they are done spilling (contributions by craigm, breed, and |
| gates, committed by gates). |
| |
| PIG-95: Remove System.exit() statements from inside pig (joa23 via gates). |
| |
| PIG-65: convert tabs to spaces (groves via olgan) |
| |
| PIG-97: Turn off combiner in the case of Cogroup, as it doesn't work when |
| more than one bag is involved (gates). |
| |
| PIG-92: Fix NullPointerException in PIgContext due to uninitialized conf |
| reference. (francisoud via gates) |
| |
| PIG-80: In a number of places stack trace information was being lost by an |
| exception being caught, and a different exception then thrown. All those |
| locations have been changed so that the new exception now wraps the old. |
| (francisoud via gates). |
| |
| PIG-84: Converted printStackTrace calls to calls to the logger. |
| (francisoud via gates). |
| |
| PIG-88: Remove unused HadoopExe import from Main. (pi_song via gates). |
| |
| PIG-99: Fix to make unit tests not run out of memory. (francisoud via |
| gates). |
| |
| PIG-107: enabled several tests. (francisoud via olgan) |
| |
| PIG-46: abort processing on error for non-interactive mode (olston via |
| olgan) |
| |
| PIG-109: improved exception handling (oae via olgan) |
| |
| PIG-72: Move unit tests to use MiniDFS and MiniMR so that unit tests can |
| be run w/o access to a hadoop cluster. (xuzh via gates) |
| |
| PIG-68: improvements to build.xml (joa23 via olgan) |
| |
| PIG-110: Replaced code accidently merged out in PIG-32 fix that handled |
| flattening the combiner case. (gates and oae) |
| |
| PIG-68 broke the build process by hardwiring hadoop15 jar for the purpose |
| of compile. Fixed that (olgan) |
| |
| PIG-124: only run one test (ant test -Dtestcase=TestMapReduce) not the |
| complete test suite (xuzh vi olgan) |
| |
| PIG-127: changes to build.xml to have description for each target |
| (francisoud via olgan) |
| |
| PIG-101: changes in tests to use enum type (francisoud via olgan) |
| |
| PIG-125: Improve exception handling in cases when an attempt is made to |
| access a field as a tuple, and it turns out not to be a tuple (oae via |
| gates). |
| |
| PIG-13: make the code use svn only if available (joa23 via olgan) |
| |
| PIG-118: make sure union/join/cross takes 2 params (pi_song vi olgan) |
| |
| PIG-94: M1 for streaming: maps and reduce side support with default |
| (de)serializer (acmurthy via olgan) |
| |
| PIG-129: making sure that temp files are stored in task's home dir and |
| cleaned up |
| |
| PIG-115: Removed Yahoo specific scripts/pig.pl, replaced with generic |
| bash script bin/pig. Moved startHOD.expect to bin (joa23 via gates). |
| |
| PIG-18: changes to make pig work with Hadoop 0.16 and HOD 0.4 (olgan) |
| |
| PIG-164: Fix memory issue in SpillableMemoryManager to partially clean the list of |
| bags each time a new bag is added rather than waiting until the garbage |
| collector tells us we are out of memory (gates). |
| |
| PIG-154: moving parsing for DEFINE and STORE into QueryParser |
| |
| PIG-100: improved error handling |
| |
| PIG-94: changes for M2 of streaming: input/ouptut/ ship/cache error |
| handling |
| |
| PIG-108: Fixed PigCombine to not do initialization on every call to |
| reduce, but instead only do it once in the call to configure. (joa23 via |
| gates). |
| |
| PIG-172: dealing with NULL error messages in exceptions (olgan) |
| |
| PIG-170: sort bags so that largest ones are released first |
| |
| PIG-122: Added build and src-gen to the list of ignore files in |
| the top level directory (joa23 via gates). |
| |
| PIG-94: M3 code update for streaming (arunc via olgan) |
| |
| PIG-179: Changed PigRecordReader to be a static singleton rather than |
| thread local. (gates). |
| |
| PIG-174,180: bug fixes in streaming (arunc via olgan) |
| |
| PIG-181: streaming bug fixing (arunc via olgan) |
| |
| PIG-182: streaming bug fix (arunc via olgan) |
| |
| PIG-184: streaming bug fixes |
| |
| PIG-153: Incorrect result caused by dump in between statements (pi_song |
| via gates). |
| |
| PIG-178: Use of schema on secondary output of SPLIT throws |
| IndexOutOfBoundsException (kali via gates). |
| |
| PIG-203: Fix bug in parameter substitution code where any pig script over |
| 1k caused pig to freeze. (kali via gates) |
| |
| PIG-204: Repair broken input splits (acmurthy via gates). |
| |
| PIG-188: Fix mismatches between pig slicer changes and new streaming |
| feature (acmurthy via gates). |
| |
| PIG-149, PIG-150: Fix doc target so that ant can generate docs (xuzh via |
| gates). |
| |
| PIG-183: Catch when a UDF has been compiled with the wrong version of |
| java and give a RuntimeException (pi_song via gates). |
| |
| PIG-114: store one alias/logicalPlan twice leads to instantiation of |
| StoreFunc as LoadFunc (pi_song via gates). |
| |
| PIG-213: Remove non-static references to logger from data bags and tuples, |
| as it causes significant overhead (vgeschel via gates). |
| |
| PIG-216: Fix streaming to work with commands that use unix pipes (acmurthy |
| via gates). |
| |
| PIG-207: Fix illustrate command to work in mapreduce mode (shubhamc via |
| gates). |
| |
| PIG-218: Fixed param generation to work with arbitrary commands |
| |
| PIG-220: Fixed definition of parameter name for param substitution |
| |
| PIG-151: fixes to code that handles bzip files |
| |
| PIG-222: fix build break |
| |
| PIG-226: fix for streaming optimization bug (acmurthy via olgan) |
| |
| PIG-228: make multiple streaming outputs adhere to spec (acmurthy via olgan) |
| |
| PIG-224: fix to error handling code to produce correct error code |
| |
| PIG-176: Change bag spilling so that bags below a certain threshold are |
| not spilled, thus avoiding proliferation of small files (pi_song via |
| gates). |
| |
| PIG-227: making load/store function optional in stream input/output spec |
| (acmurthy via olgan) |
| |
| PIG-215: Cleanup a few dangling ends left by PIG-111 (pi_song via gates). |
| |
| PIG-229: Proper error handling in case of deserializer failure |
| |
| PIG-230: Handling shipment for multiple ship/cache commands (acmurthy via |
| olgan) |
| |
| PIG-219: Change unit tests to run both local and map reduce modes |
| (kali via gates). |
| |
| PIG-202: Fix Order by so that user provided comparator func is used for |
| quantile determination (kali via gates). |
| |
| PIG-231: validation for ship, cache, and skippath (acmurthy via olgan) |
| |
| PIG-232: fix for number of output records when BinaryStirage is used |
| (acmurthy via olgan) |
| |
| PIG-232: fix for number of input records when BinaryStirage is used |
| (acmurthy via olgan) |
| |
| PIG-232: let valid cache specifications through (acmurthy via olgan) |
| |
| PIG-237: validation of the output directory (pi_song via olgan) |
| |
| PIG-236: Fix properties so that values specified via the |
| command line (-D) are not ignored (pkamath via gates). |
| |
| PIG-198: integration with hadoop 17 (acmurthy via olgan) |
| |
| PIG-85: allowing control characters as delimiters for PigStorage (pi_song |
| via olgan) |
| |
| PIG-250: disabling speculative execution (olgan) |
| |
| PIG-250: re-enabling speculative execution and fixing the failure |
| (acmurthy via olgan) |
| |
| PIG-85: memory optimization (pi_song via olgan) |
| |
| PIG-243: Fixing unit tests on windows (daijy via olgan) |
| |
| PIG-198: Fixed pig script to pick up hadoop 17 instead of 15 (pi_song via gates). |
| |
| PIG-266: fix warnings caused by HOD (olgan) |
| |
| PIG-245: added math functions to the piggybank (ajaygarg via olgan) |
| |
| PIG-255: make non-default constructors work for algebraic functions |
| (ajaygarg via olgan) |
| |
| PIG-272: problem with streaming and intermediate store (acmurthy via olgan) |
| |
| PIG-243: make unit tests work on windows (daijy via olgan) |
| |
| PIG-271: added tutorial to SVN (olgan) |
| |
| PIG-235: memory management improvements (pkamath via olgan) |
| |
| PIG-284: target for building source jar (oae via olgan) |
| |
| PIG-34: added missing licenses |
| |
| PIG-34: added LICENSE, NOTICE and README file |
| |
| PIG-291: hod.param parameters not passed properly (thatha via olgan) |
| |
| PIG-34: changes to build process to create distribution tar file |
| |
| PIG-34: updated CHANGES.txt |
| |
| PIG-472: Added RegExLoader to piggybank, an abstract loader class to parse |
| text files via regular espressions (spackest via gates) |
| |
| PIG-473: Added CommonLogLoader, a subclass of RegExLoader to piggybank (spackest via gates) |
| |
| PIG-474: Added MyRegexLoader, a subclass of RegExLoader, to piggybank (spackest via gates) |
| |