Title: Pig Incubation Status
Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turns enables them to handle very large data sets.
Pig has graduated as a Hadoop subproject , see Message-ID: <f767f0600810171144w3aaff127p65766a0a2a1a9662@mail.gmail.com> on general@incubator.apache.org
The Pig project graduated on 2008-10-22
Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turns enables them to handle very large data sets.
At the present time, Pig‘s infrastructure layer consists of a compiler that produces sequences of Map-Reduce programs, for which large-scale parallel implementations already exist (e.g., the Hadoop subproject). Pig’s language layer currently consists of a textual language called Pig Latin, which has the following key properties:
Ease of programming. It is trivial to achieve parallel execution of simple, “embarrassingly parallel” data analysis tasks. Complex tasks comprised of multiple interrelated data transformations are explicitly encoded as data flow sequences, making them easy to write, understand, and maintain.
Optimization opportunities. The way in which tasks are encoded permits the system to optimize their execution automatically, allowing the user to focus on semantics rather than efficiency.
Extensibility. Users can create their own functions to do special-purpose processing.
October 22, 2008: Pig has graduated as a Hadoop subproject.
September 2008: Daniel Dai voted in as a Pig committer
September 2008: Pig 0.1.0 released
June 2008: Pi Song voted in as a Pig committer
October 2007: Pig formally accepted into Apache Incubator
This is the first phase on incubation, needed to start the project at Apache.
Item assignment is shown by the Apache id. Completed tasks are shown by the completion date (YYYY-MM-dd).
date | item |
---|---|
October 2007 | Make sure that the requested project name does not already exist and check www.nameprotect.com to be sure that the name is not already trademarked for an existing software product. |
Existing project (Lucene) pre-approved Fall 2007 | If request from outside Apache to enter an existing Apache project, then post a message to that project for them to decide on acceptance. |
DONE
date | item |
---|---|
October 2007 | Identify all the Mentors for the incubation, by asking all that can be Mentors. |
November 2007 | Subscribe all Mentors on the pmc and general lists. |
November 2007 | Give all Mentors access to the incubator SVN repository. (to be done by the Incubator PMC chair or an Incubator PMC Member wih karma for the authorizations file) |
November 2007 (re-done June 2008) | Tell Mentors to track progress in the file ‘incubator/projects/{project.name}.xml’ |
DONE
date | item |
---|---|
Not applicable (I think) | Check and make sure that the papers that transfer rights to the ASF been received. It is only necessary to transfer rights for the package, the core code, and any new code produced by the project. |
Not applicble (new project) | Check and make sure that the files that have been donated have been updated to reflect the new ASF copyright. |
date | item |
---|---|
August 2008 | Check and make sure that for all code included with the distribution that is not under the Apache license, e have the right to combine with Apache-licensed code and redistribute. |
August 2008 | Check and make sure that all source code distributed by the project is covered by one or more of the following approved licenses: Apache, BSD, Artistic, MIT/X, MIT/W3C, MPL 1.1, or something with essentially the same terms. |
DONE
date | item |
---|---|
October 2007 (and ongoing) | Check that all active committers have submitted a contributors agreement. |
September 2008 | Add all active committers in the STATUS file. |
September 2008 | Ask root for the creation of committers' accounts on people.apache.org. |
DONE
Active Committers Are
Committer | Email Address |
---|---|
Daniel Dai | daijyc@gmail.com |
Nigel Daley | nigel@apache.org |
Alan Gates | gates@yahoo-inc.com |
Olga Natkovich | olgan@yahoo-inc.com |
Owen O'Malley | omalley@apache.org |
Chris Olston | olston@yahoo-inc.com |
Ben Reed | breed@yahoo-inc.com |
Pi Song | pi.songs@gmail.com |
Utkarsh Srivastava | utkarsh@yahoo-inc.com |
date | item |
---|---|
October 2007 | Ask infrastructure to create source repository modules and grant the committers karma. |
October 2007 | Ask infrastructure to set up and archive Mailing lists. |
October 2007 | Decide about and then ask infrastructure to setup an issuetracking system (Bugzilla, Scarab, Jira). |
Not applicable (project starts in Apache) | Migrate the project to our infrastructure. |
DONE
Add project specific tasks here.
These action items have to be checked for during the whole incubation process.
These items are not to be signed as done during incubation, as they may change during incubation. They are to be looked into and described in the status reports and completed in the request for incubation signoff.
Have all of the active long-term volunteers been identified and acknowledged as committers on the project? Yes
Are there three or more independent committers? (The legal definition of independent is long and boring, but basically it means that there is no binding relationship between the individuals, such as a shared employer, that is capable of overriding their free will as individuals, directly or indirectly.) Yes
Are project decisions being made in public by the committers? Yes
Are the decision-making guidelines published and agreed to by all of the committers? Yes
Add project specific tasks here.
Things to check for before voting the project out.
If graduating to an existing PMC, has the PMC voted to accept it?
If graduating to a new PMC, has the board voted to accept it?