Posted to user@pig.apache.org by Dmitriy Ryaboy <dv...@gmail.com> on 2013/01/11 04:37:43 UTC

Re: Sequence File processing

Please see the list of editor plugins in
https://cwiki.apache.org/confluence/display/PIG/PigTools

D


On Mon, Dec 24, 2012 at 9:42 PM, Kshiva Kps <ks...@gmail.com> wrote:

> Hi,
>
> Are there any Pig editors in which we can write 100 to 150 Pig scripts?
> I believe that is not practical in CLI mode.
> Something like an IDE for Java or TOAD for SQL. Please advise, many thanks.
>
>
> Thanks
>
>
> On Tue, Dec 25, 2012 at 3:09 AM, Mohammad Tariq <do...@gmail.com>
> wrote:
>
> > +1
> >
> > Best Regards,
> > Tariq
> > +91-9741563634
> > https://mtariq.jux.com/
> >
> >
> > On Tue, Dec 25, 2012 at 3:07 AM, Cheolsoo Park <cheolsoo@cloudera.com>
> > wrote:
> >
> > > Hi Srini,
> > >
> > > You can use STRSPLIT to split your "value" chararray and define the
> > > schema in a FOREACH. For example, if the "value" consists of 3 integers
> > > (e.g. "1|2|3"):
> > >
> > > A = LOAD 'part-m-0000' USING SequenceFileLoader() AS
> > >     (key:long, value:chararray);
> > > B = FOREACH A GENERATE key, FLATTEN(STRSPLIT(value, '\\|')) AS
> > >     (i:int, j:int, k:int);
> > > DESCRIBE B;
> > > DUMP B;
> > >
> > > This will return:
> > >
> > > B: {key: chararray,i: int,j: int,k: int}
> > > (k,1,2,3)
> > >
> > > Thanks,
> > > Cheolsoo
> > >
> > >
> > > On Sun, Dec 23, 2012 at 9:24 PM, Srini <pi...@gmail.com> wrote:
> > >
> > > > Hi ,
> > > >
> > > > > I have used SequenceFileLoader to load a sequence file.
> > > >
> > > > > A = load 'part-m-0000' using SequenceFileLoader() as
> > > > >     (key:long, value:chararray);
> > > >
> > > > > "value" is a chararray consisting of 10 fields separated by a
> > > > > delimiter ("|" here). How do I create a schema so that I can do
> > > > > further analysis on these fields (such as filter, group)?
> > > >
> > > > Any help is appreciated.
> > > >
> > > > Thanks,
> > > > Srini
> > > >
> > >
> >
>
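
For the 10-field case in the original question, the STRSPLIT approach quoted
above might be extended roughly as follows. This is an untested sketch: the
field names f1..f10 and their types are hypothetical placeholders (the thread
never says what the fields are), and it assumes SequenceFileLoader is already
available to the script, as in the quoted examples.

```pig
-- Sketch only: f1..f10 and their types are placeholders, not from the thread.
-- Assumes SequenceFileLoader is already registered, as in the quoted examples.
A = LOAD 'part-m-0000' USING SequenceFileLoader() AS
    (key:long, value:chararray);

-- '|' is a regex metacharacter, so it is escaped to split on a literal pipe.
-- FLATTEN turns the tuple returned by STRSPLIT into top-level fields,
-- and the AS clause names and types them.
B = FOREACH A GENERATE key, FLATTEN(STRSPLIT(value, '\\|')) AS
    (f1:chararray, f2:chararray, f3:int, f4:chararray, f5:chararray,
     f6:chararray, f7:chararray, f8:chararray, f9:chararray, f10:chararray);

-- Once the fields are named, they can be used in FILTER and GROUP.
C = FILTER B BY f3 > 0;
D = GROUP C BY f1;
E = FOREACH D GENERATE group AS f1, COUNT(C) AS n;
DUMP E;
```

Running DESCRIBE B before the FILTER is a quick way to check that the schema
came out as intended.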