Posted to user@pig.apache.org by Alan Gates <ga...@hortonworks.com> on 2011/09/30 15:17:58 UTC

Re: Do we need to load the file multiple times.

No, Pig can split the data stream.  Operations like:

A = load 'foo';
B = filter A by ...
C = filter A by ...
D = group A by ...

are supported.
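
For illustration, here is a minimal sketch of that pattern filled out into a complete script. The schema, the filter conditions, and the output paths are made up for the example and are not from the original question; only the single load of 'foo' feeding several operators is the point.

```pig
-- Hypothetical schema and paths, for illustration only.
A = LOAD 'foo' AS (user:chararray, age:int, country:chararray);

-- Three independent branches, all reading the same relation A.
B = FILTER A BY age >= 18;
C = FILTER A BY country == 'IN';
D = GROUP A BY country;
E = FOREACH D GENERATE group AS country, COUNT(A) AS n;

-- One STORE per branch; 'foo' is loaded only once in the script.
STORE B INTO 'out/adults';
STORE C INTO 'out/by_country_in';
STORE E INTO 'out/counts_by_country';
```

When this is run as a script, Pig's multi-query execution should be able to share the load across the three stores rather than reading 'foo' once per branch.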

Alan.

On Sep 30, 2011, at 5:35 AM, kiranprasad wrote:

> Do we need to load the file multiple times if we want to perform several operations on the same file?
> 
> Regards
> Kiran.G


Re: Do we need to load the file multiple times.

Posted by Jonathan Coveney <jc...@gmail.com>.
As far as I know, SPLIT is just syntactic sugar on top of FILTER, so
either should work. Either way, Pig is optimized under the hood so that
multiple filters on the same input do not require multiple passes, iirc.
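
To make the comparison concrete, here is a rough sketch of the two forms side by side. The field names and conditions are assumptions chosen for the example, not something from this thread.

```pig
-- Hypothetical schema, for illustration only.
A = LOAD 'foo' AS (user:chararray, age:int);

-- SPLIT form: one statement, several output relations.
SPLIT A INTO minors IF age < 18, adults IF age >= 18;

-- FILTER form: one statement per output relation, same conditions.
minors_f = FILTER A BY age < 18;
adults_f = FILTER A BY age >= 18;
```

With either form, a record can end up in more than one output relation, or in none, depending on how the conditions overlap (here, a record with a null age matches neither condition).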

2011/9/30 Mahantesh Mahalmani <ma...@cbsinteractive.com>

> I would recommend using the SPLIT command to filter the same dataset
> based on multiple criteria.
>
> Thanks
> Monty

Re: Do we need to load the file multiple times.

Posted by Mahantesh Mahalmani <ma...@cbsinteractive.com>.
I would recommend using the SPLIT command to filter the same dataset
based on multiple criteria.

Thanks
Monty
