Posted to commits@pig.apache.org by Apache Wiki <wi...@apache.org> on 2010/03/31 19:56:38 UTC
[Pig Wiki] Update of "FrontPage" by AlanGates
The "FrontPage" page has been changed by AlanGates.
http://wiki.apache.org/pig/FrontPage?action=diff&rev1=146&rev2=147
--------------------------------------------------
= Apache Pig Wiki =
- [[http://incubator.apache.org/pig/|Apache Pig]] is a platform for analyzing large data sets. Pig's language, Pig Latin, lets you specify a sequence of data transformations such as merging data sets, filtering them, and applying functions to records or groups of records. Pig comes with many built-in functions but you can also create your own user-defined functions to do special-purpose processing.
+ [[http://hadoop.apache.org/pig/|Apache Pig]] is a platform for analyzing large data sets. Pig's language, Pig Latin, lets you specify a sequence of data transformations such as merging data sets, filtering them, and applying functions to records or groups of records. Pig comes with many built-in functions but you can also create your own user-defined functions to do special-purpose processing.
Pig Latin programs run in a distributed fashion on a cluster (programs are compiled into Map/Reduce jobs and executed using Hadoop). For quick prototyping, Pig Latin programs can also run in "local mode" without a cluster (all processing takes place in a single local JVM).
@@ -20, +20 @@
'''Why Pig Latin instead of SQL?''' [[http://www.cs.cmu.edu/~olston/publications/sigmod08.pdf|Pig Latin: A Not-So-Foreign Language ...]]
- '''Pig Has Grown Up!'''. On 10/22/08 Pig graduated from the [[http://incubator.apache.org/|Incubator]] and joined [[http://hadoop.apache.org/|Apache Hadoop]] as a subproject.
-
- '''Pig is Getting Faster!''' 2-6 times faster, for many queries. We've created a set of benchmarks and run them against the pig 0.1.0 release (modified to run on hadoop 0.18) and against the current trunk (previously `types` branch.) Joins and order bys in particular made large performance gains. For complete details see PigMix.
-
- '''Interested in Pig Guts?''' We are completely redesigning the Pig execution and optimization framework. For design details see PigOptimizationWishList and PigExecutionModel.
-
- '''Want to contribute but don't know where to kick in?''' Here is a [[http://wiki.apache.org/pig/ProposedProjects|list of project]] that we desired. We need new blood!
+ '''Want to contribute but don't know where to kick in?''' Here is our [[http://wiki.apache.org/pig/PigJournal|journal]] of projects we have worked on, are working on,
+ and hope to work on. Find a project that interests you and jump on in.
'''Pig available as part of Amazon's Elastic !MapReduce''', as of August 2009.
@@ -40, +35 @@
* [[http://hadoop.apache.org/pig/|User Documentation]]
* [[http://www.cloudera.com/hadoop-training-pig-introduction|Online Pig Training]] - Complete with video lectures, exercises, and a pre-configured virtual machine. Developed by Cloudera and Yahoo!
* PiggyBank - User-defined functions (UDFs) contributed by Pig users!
+ * PigTools - Tools Pig users have built around and on top of Pig.
+ * PigInteroperability - How to make Pig work with other platforms you may be using, such as HBase and Cassandra.
== Developer Documentation ==
* How tos