Posted to user@pig.apache.org by Suhas Satish <su...@gmail.com> on 2014/03/13 01:36:46 UTC

rewrite equivalent pig script

The following Pig script hangs due to a bug. Is there another way to
write it that achieves the same result? Any alternative approaches
are appreciated.

tWeek = LOAD '/tmp/test_data.txt' USING PigStorage ('|') AS (WEEK:int,
DESCRIPTION:chararray, END_DATE:chararray, PERIOD:int);

gTWeek = FOREACH tWeek GENERATE WEEK AS WEEK, PERIOD AS PERIOD;

pWeek = FILTER gTWeek BY (PERIOD == 201312);

pWeekRanked = RANK pWeek BY WEEK ASC DENSE;

gpWeekRanked = FOREACH pWeekRanked GENERATE $0;

store gpWeekRanked INTO 'gpWeekRanked2';

describe gpWeekRanked2;



Thanks,
Suhas.

Re: rewrite equivalent pig script

Posted by Suhas Satish <su...@gmail.com>.
The last line should be
describe gpWeekRanked;

It was a typo. The bug is a thread hang, with the following stack trace:


stack trace:
-----------

"main" prio=10 tid=0x00007fd74800b000 nid=0x2f63 runnable [0x00007fd750d50000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:217)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:680)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:346)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1117)
    at org.apache.hadoop.mapred.Child.main(Child.java:271)


-----------------------------------------

org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapBase.java:

protected void runPipeline(PhysicalOperator leaf) throws IOException, InterruptedException {
    while(true){
        Result res = leaf.getNext(DUMMYTUPLE);
        if(res.returnStatus==POStatus.STATUS_OK){
            collect(outputCollector,(Tuple)res.result);
            continue;
        }
        // excerpt ends here; the rest of the loop is omitted


Cheers,
Suhas.


On Thu, Mar 13, 2014 at 10:57 PM, Ronald Green <gr...@gmail.com> wrote:

> What bug? Do you get an exception?
>
> It seems like you're trying to describe an alias that doesn't exist in your
> script. 'gpWeekRanked2' in store gpWeekRanked INTO 'gpWeekRanked2'; is
> actually a path (usually in HDFS) you store the data into. You can't
> describe it.

Re: rewrite equivalent pig script

Posted by Ronald Green <gr...@gmail.com>.
What bug? Do you get an exception?

It seems like you're trying to describe an alias that doesn't exist in your
script. 'gpWeekRanked2' in store gpWeekRanked INTO 'gpWeekRanked2'; is
actually a path (usually in HDFS) you store the data into. You can't
describe it.
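
A minimal sketch of that distinction, reusing the script from the original
post. The alias 'stored' and the field name 'week_rank' are made up for
illustration, and the sketch assumes the rank column produced by RANK is a
long:

```pig
-- Pipeline from the original post, unchanged.
tWeek = LOAD '/tmp/test_data.txt' USING PigStorage('|')
        AS (WEEK:int, DESCRIPTION:chararray, END_DATE:chararray, PERIOD:int);
gTWeek = FOREACH tWeek GENERATE WEEK AS WEEK, PERIOD AS PERIOD;
pWeek = FILTER gTWeek BY (PERIOD == 201312);
pWeekRanked = RANK pWeek BY WEEK ASC DENSE;
gpWeekRanked = FOREACH pWeekRanked GENERATE $0;

-- 'gpWeekRanked2' is an output path, not an alias.
STORE gpWeekRanked INTO 'gpWeekRanked2';

-- DESCRIBE takes an alias defined in the script.
DESCRIBE gpWeekRanked;

-- To inspect what was written, load the path back under a new alias.
-- 'stored' and 'week_rank' are hypothetical names; the long type is an assumption.
stored = LOAD 'gpWeekRanked2' AS (week_rank:long);
DESCRIBE stored;
```

DESCRIBE only prints the schema of an alias, so this shows where the describe
call belongs. It does not address the hang itself.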

