Posted to user@pig.apache.org by deqiang sun <de...@bcm.edu> on 2010/03/01 15:25:48 UTC

How is the data passed to external script via STREAM

How is the data passed to an external script via STREAM in Pig?

A = LOAD 'a.log';
B = STREAM A THROUGH `myscript.pl script_parameters`;

Is A converted to an array and attached to @ARGV?

Thanks,


Deqiang


Re: How is the data passed to external script via STREAM

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
It's streamed in using an implementation of a StoreFunc (tab-delimited text,
by default) and read out using a LoadFunc (same, by default). You can
specify your own Load and Store funcs for serialization. In the 0.7
redesign, you implement the PigToStream and StreamToPig interfaces instead of
using a StoreFunc and LoadFunc.
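
In other words, A is not attached to @ARGV: only script_parameters arrive as
command-line arguments, while the records of A are written to the script's
standard input and its standard output is read back as B. A minimal sketch of
such a script follows, written in Python rather than Perl purely for
illustration. The name transform and the "append the parameters to each
record" behaviour are assumptions made up for this example, and it assumes
the default tab-delimited serialization with one record per line.

```python
#!/usr/bin/env python
# Hypothetical streaming script, assuming Pig's default serialization:
# one record per line on stdin, fields separated by tabs.
import sys


def transform(lines, params):
    """Yield one tab-delimited output line per input line.

    Illustrative behaviour only: each record is echoed back with the
    command-line parameters appended as extra fields.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        yield "\t".join(fields + list(params)) + "\n"


if __name__ == "__main__":
    # sys.argv[1:] holds only the script parameters; the records
    # themselves arrive on standard input, not in the argument list.
    sys.stdout.writelines(transform(sys.stdin, sys.argv[1:]))
```

The script can be checked outside Pig by piping a tab-delimited file into it,
since it only depends on standard input, standard output and its arguments.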

If you check out the LoadStoreRedesign proposal, it discusses both how
Streaming is done in Pig through 0.6, and how it will be done in 0.7:
http://wiki.apache.org/pig/LoadStoreRedesignProposal

-D
