Posted to commits@pig.apache.org by Apache Wiki <wi...@apache.org> on 2009/11/18 04:30:11 UTC

[Pig Wiki] Trivial Update of "PigStreamingFunctionalSpec" by MarcioSilva

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The "PigStreamingFunctionalSpec" page has been changed by MarcioSilva.
The comment on this change is: correcting what appears to be a typo..
http://wiki.apache.org/pig/PigStreamingFunctionalSpec?action=diff&rev1=47&rev2=48

--------------------------------------------------

  Streaming can have three separate meanings in the context of the Pig project:
  
   1. A specific way of submitting jobs to Hadoop: Hadoop Streaming
-  2. A form of processing in which the entire portion of the dataset that corresponds to a task in sent to the task and output streams out. There is no temporal or causal correspondence between an input record and specific output records.
+  2. A form of processing in which the entire portion of the dataset that corresponds to a task is sent to the task and output streams out. There is no temporal or causal correspondence between an input record and specific output records.
   3. The use of non-Java functions with Pig.
  
  The goal of Pig with respect to streaming is to support #2 for (a) Java UDFs, (b) non-Java UDFs, and (c) user-specified binaries/scripts. We will start with (c) since it would be most beneficial to users. It is not our goal to be feature-by-feature compatible with Hadoop Streaming, as it is too open-ended and might force us to implement features that we don't necessarily want in Pig.