Posted to user@pig.apache.org by Lauren Blau <la...@digitalreasoning.com> on 2013/04/04 15:25:10 UTC

simple script generating 'too many counters' error

I'm running a simple script to add a sequence_number to a relation, sort
the result and store to a file:

a0 = load '<filename>' using PigStorage('\t','-schema');
a1 = rank a0;
a2 = foreach a1 generate col1 .. col16, rank_a0 as sequence_number;
a3 = order a2 by sequence_number;
store a3 into 'outputfile' using PigStorage('\t','-schema');

I get the following error:
org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many
counters: 241 max=240
    at
org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:61)
    at
org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:68)
    at
org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.readFields(AbstractCounterGroup.java:174)
    at org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java:278)
    at
org.apache.hadoop.mapreduce.counters.AbstractCounters.readFields(AbstractCounters.java:303)
    at
org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:280)
    at
org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:75)
    at
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:951)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:835)


We aren't able to raise our counter limit any higher (policy), and I don't
understand why such a simple script should need so many counters anyway.
running Apache Pig version 0.11.1-SNAPSHOT (r: unknown)
compiled Mar 22 2013, 10:19:19

Can someone help?

Thanks,
Lauren
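
For readers who do have cluster admin rights (the poster notes below that policy rules this out in her case), the cap named in the error is itself configurable. A sketch, assuming the Hadoop 2.x property name:

```xml
<!-- mapred-site.xml: raise the per-job counter cap (cluster-wide setting;
     requires admin access). Property name per Hadoop 2.x. -->
<property>
  <name>mapreduce.job.counters.max</name>
  <value>500</value>
</property>
```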

Re: simple script generating 'too many counters' error

Posted by Lauren Blau <la...@digitalreasoning.com>.
Now that I've turned off noSplitCombination, we have 640 mappers.
The relation being ranked likely contains billions, possibly over a trillion, records.



On Fri, Apr 5, 2013 at 10:47 AM, Bill Graham <bi...@gmail.com> wrote:

> How many mappers and reducers do you have? Skimming the Rank code it looks
> like it creates at least N counters per task which would be a scalability
> bug.
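
A possible mitigation, sketched from the thread's own diagnosis rather than a verified fix: if each map task costs a counter, re-enabling split combination (and raising the combined-split size) shrinks the mapper count and keeps the job under the 240-counter cap. `pig.splitCombination` and `pig.maxCombinedSplitSize` are the standard Pig properties; the 1 GB value is an arbitrary example:

```pig
-- Sketch, not a verified fix: combine input splits so fewer map tasks
-- run, keeping RANK's apparent per-task counters under the 240 cap.
set pig.splitCombination true;
set pig.maxCombinedSplitSize 1073741824; -- up to ~1 GB per combined split

a0 = load '<filename>' using PigStorage('\t','-schema');
a1 = rank a0;
a2 = foreach a1 generate col1 .. col16, rank_a0 as sequence_number;
a3 = order a2 by sequence_number;
store a3 into 'outputfile' using PigStorage('\t','-schema');
```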

Re: simple script generating 'too many counters' error

Posted by Bill Graham <bi...@gmail.com>.
How many mappers and reducers do you have? Skimming the Rank code, it looks
like it creates at least N counters per task, which would be a scalability
bug.

On Friday, April 5, 2013, Lauren Blau wrote:

> this is definitely caused by the RANK operator. Is there some way to reduce
> the number of counters generated by this operator when working with large
> data?
> thanks
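
Bill's point above can be sketched numerically. A toy model (assumptions: RANK registers one counter per map task, as his reading of the code suggests, and Hadoop reserves roughly twenty counters for its built-in framework groups; both figures are illustrative, not measured):

```python
# Toy model of the counter growth described above: if RANK adds one
# counter per map task, total counters track the task count and cross
# the cluster's cap once the input splits into a few hundred tasks.
HADOOP_COUNTER_CAP = 240   # the "max=240" limit from the error message
FRAMEWORK_COUNTERS = 21    # rough allowance for built-in counters (assumption)

def counters_needed(num_map_tasks: int) -> int:
    """Counters the job would need under the one-counter-per-task model."""
    return FRAMEWORK_COUNTERS + num_map_tasks

def exceeds_cap(num_map_tasks: int) -> bool:
    return counters_needed(num_map_tasks) > HADOOP_COUNTER_CAP

# Lauren reports 640 mappers with split combination off:
print(exceeds_cap(640))   # True: 661 counters needed, far over the cap
print(exceeds_cap(100))   # False: 121 counters fits under 240
```

Under this model the job stays under the cap only while the mapper count stays below roughly 220, which is why combining splits into fewer, larger map tasks is a plausible workaround.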

Re: simple script generating 'too many counters' error

Posted by Lauren Blau <la...@digitalreasoning.com>.
This is definitely caused by the RANK operator. Is there some way to reduce
the number of counters generated by this operator when working with large
data?
Thanks

On Thu, Apr 4, 2013 at 7:01 PM, Lauren Blau <
lauren.blau@digitalreasoning.com> wrote:

> I can think of only 2 things that have changed since this script last ran
> successfully. Switched to using the range specification of the schema for
> a2, and the input data has grown considerably.
>
> Lauren

Re: simple script generating 'too many counters' error

Posted by Lauren Blau <la...@digitalreasoning.com>.
I can think of only two things that have changed since this script last ran
successfully: I switched to using the range specification of the schema for
a2, and the input data has grown considerably.

Lauren

On Thu, Apr 4, 2013 at 7:00 PM, Lauren Blau <
lauren.blau@digitalreasoning.com> wrote:

> no
>

Re: simple script generating 'too many counters' error

Posted by Lauren Blau <la...@digitalreasoning.com>.
no

On Thu, Apr 4, 2013 at 4:54 PM, Dmitriy Ryaboy <dv...@gmail.com> wrote:

> Do you have any special properties set?
> Like the pig.udf.profile one maybe..
> D

Re: simple script generating 'too many counters' error

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Do you have any special properties set?
Like the pig.udf.profile one maybe..
D

