Posted to commits@pig.apache.org by Apache Wiki <wi...@apache.org> on 2010/03/31 19:56:38 UTC

[Pig Wiki] Update of "FrontPage" by AlanGates

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The "FrontPage" page has been changed by AlanGates.
http://wiki.apache.org/pig/FrontPage?action=diff&rev1=146&rev2=147

--------------------------------------------------

  
  = Apache Pig Wiki =
  
- [[http://incubator.apache.org/pig/|Apache Pig]] is a platform for analyzing large data sets. Pig's language, Pig Latin, lets you specify a sequence of data transformations such as merging data sets, filtering them, and applying functions to records or groups of records. Pig comes with many built-in functions but you can also create your own user-defined functions to do special-purpose processing. 
+ [[http://hadoop.apache.org/pig/|Apache Pig]] is a platform for analyzing large data sets. Pig's language, Pig Latin, lets you specify a sequence of data transformations such as merging data sets, filtering them, and applying functions to records or groups of records. Pig comes with many built-in functions but you can also create your own user-defined functions to do special-purpose processing. 
  
  Pig Latin programs run in a distributed fashion on a cluster (programs are compiled into Map/Reduce jobs and executed using Hadoop). For quick prototyping, Pig Latin programs can also run in "local mode" without a cluster (all processing takes place in a single local JVM).
  
@@ -20, +20 @@

  
  '''Why Pig Latin instead of SQL?'''  [[http://www.cs.cmu.edu/~olston/publications/sigmod08.pdf|Pig Latin: A Not-So-Foreign Language ...]]
  
- '''Pig Has Grown Up!'''. On 10/22/08 Pig graduated from the [[http://incubator.apache.org/|Incubator]] and joined [[http://hadoop.apache.org/|Apache Hadoop]] as a subproject.
- 
- '''Pig is Getting Faster!'''  2-6 times faster, for many queries.  We've created a set of benchmarks and run them against the pig 0.1.0 release (modified to run on hadoop 0.18) and against the current trunk (previously `types` branch.) Joins and order bys in particular made large performance gains. For complete details see PigMix.
- 
- '''Interested in Pig Guts?''' We are completely redesigning the Pig execution and optimization framework. For design details see PigOptimizationWishList and PigExecutionModel. 
- 
- '''Want to contribute but don't know where to kick in?''' Here is a [[http://wiki.apache.org/pig/ProposedProjects|list of project]] that we desired. We need new blood! 
+ '''Want to contribute but don't know where to kick in?''' Here is our [[http://wiki.apache.org/pig/PigJournal|journal]] of projects we have worked on, are working on,
+ and hope to work on.  Find a project that interests you and jump on in.
  
  '''Pig available as part of Amazon's Elastic !MapReduce''', as of August 2009.
  
@@ -40, +35 @@

   * [[http://hadoop.apache.org/pig/|User Documentation]]
   * [[http://www.cloudera.com/hadoop-training-pig-introduction|Online Pig Training]] - Complete with video lectures, exercises, and a pre-configured virtual machine. Developed by Cloudera and Yahoo!
   * PiggyBank - User-defined functions (UDFs) contributed by Pig users!
+  * PigTools - Tools Pig users have built around and on top of Pig.
+  * PigInteroperability - How to make Pig work with other platforms you may be using, such as HBase and Cassandra.
  
  == Developer Documentation ==
   * How tos