Posted to user@pig.apache.org by Haitao Yao <ya...@gmail.com> on 2012/08/20 03:49:00 UTC

Pig OutOfMemory Error on org.apache.hadoop.mapreduce.Reducer$Context

Hi all, 
	I got an OOME on org.apache.hadoop.mapreduce.Reducer$Context; here's a snapshot of the heap dump:


Well, does Pig have to report so much data through the Reducer$Context? 
Can this be closed?

thanks.



Haitao Yao
yao.erix@gmail.com
weibo: @haitao_yao
Skype:  haitao.yao.final


Re: Pig OutOfMemory Error on org.apache.hadoop.mapreduce.Reducer$Context

Posted by Jonathan Coveney <jc...@gmail.com>.
What's your script?

2012/8/19 Haitao Yao <ya...@gmail.com>

> Hi all,
> I got an OOME on org.apache.hadoop.mapreduce.Reducer$Context; here's a
> snapshot of the heap dump:
>
> Well, does Pig have to report so much data through the Reducer$Context?
> Can this be closed?
>
> thanks.
>
>
>
> Haitao Yao
> yao.erix@gmail.com
> weibo: @haitao_yao
> Skype:  haitao.yao.final
>
>

Re: Pig OutOfMemory Error on org.apache.hadoop.mapreduce.Reducer$Context

Posted by Haitao Yao <ya...@gmail.com>.
Well, I've fixed the OOM issue with the Pig bag. I've pushed it into production and there are no more bag-related OOMs. I've contributed a patch here: https://issues.apache.org/jira/browse/PIG-2812 . I hope this draws someone's attention and gets it merged into trunk.
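
The patch itself is on the JIRA above. As general background (and not this patch), bag-related OOMs in UDF code are usually avoided by building bags through Pig's BagFactory, so they register with the SpillableMemoryManager and can spill to disk instead of growing on the heap. A minimal, hypothetical UDF sketch (class name and field handling are illustrative):

import java.io.IOException;
import java.util.Iterator;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.BagFactory;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;

// Hypothetical UDF: copies the first field of every tuple in the input bag
// into a new bag. Building the output through BagFactory gives a spillable
// bag that registers with Pig's SpillableMemoryManager, instead of piling
// tuples into a plain Java collection that can only grow on the heap.
public class CollectToBag extends EvalFunc<DataBag> {
    private static final BagFactory BAGS = BagFactory.getInstance();
    private static final TupleFactory TUPLES = TupleFactory.getInstance();

    @Override
    public DataBag exec(Tuple input) throws IOException {
        DataBag out = BAGS.newDefaultBag();          // spillable bag
        if (input == null || input.get(0) == null) {
            return out;
        }
        DataBag in = (DataBag) input.get(0);
        Iterator<Tuple> it = in.iterator();
        while (it.hasNext()) {
            Tuple t = it.next();
            out.add(TUPLES.newTuple(t.get(0)));      // keep only the first field
        }
        return out;
    }
}

The same idea applies to any code that accumulates tuples: keep them in a spillable DataBag rather than a plain Java collection.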


Haitao Yao
yao.erix@gmail.com
weibo: @haitao_yao
Skype:  haitao.yao.final

On 2012-8-22, at 6:05 AM, Jonathan Coveney wrote:

> Make sure that any Pig process has at least 1 GB of heap. I've seen
> bag-related OOMs when people don't give Pig enough heap.
> 
> 2012/8/21 Haitao Yao <ya...@gmail.com>
> 
>> Thank you very much for this.
>> But I still cannot find which snippet of the script caused this OOM. The
>> heap dump was generated at midnight, and nobody was standing by.
>> 
>> I've added some scripts to capture more information about the job. If that
>> succeeds, I'll share it with you tomorrow.
>> Thanks again.
>> 
>> 
>> 
>> Haitao Yao
>> yao.erix@gmail.com
>> weibo: @haitao_yao
>> Skype:  haitao.yao.final
>> 
>> On 2012-8-21, at 5:22 PM, Subir S wrote:
>> 
>>> I think it would help to get some context if you can share a small snippet
>>> from your script that causes this issue. That may give a better
>>> understanding of what you are trying to achieve, and whether there is some
>>> workaround. However, this is up to you!
>>> 
>>> Thanks, Subir
>>> 
>>> On Tue, Aug 21, 2012 at 2:12 PM, Haitao Yao <ya...@gmail.com> wrote:
>>> I've found the reason: org.apache.hadoop.io.DataInputBuffer. There's a
>>> big byte array referenced by the DataInputBuffer.
>>> 
>>> But there's no way to close the buffer.
>>> 
>>> Is there any other solution?
>>> 
>>> BTW, my script is more than 500 lines, so I'm sorry I cannot show it to you.
>>> 
>>> Here are the screenshots:
>>> 
>>> <aa.jpg>
>>> 
>>> 
>>> 
>>> 
>>> <bb.jpg>
>>> 
>>> 
>>> 
>>> Haitao Yao
>>> yao.erix@gmail.com
>>> weibo: @haitao_yao
>>> Skype:  haitao.yao.final
>>> 
>>> On 2012-8-20, at 9:49 AM, Haitao Yao wrote:
>>> 
>>>> Hi all,
>>>>     I got an OOME on org.apache.hadoop.mapreduce.Reducer$Context;
>>>> here's a snapshot of the heap dump:
>>>> <aa.jpg>
>>>> 
>>>> Well, does Pig have to report so much data through the Reducer$Context?
>>>> Can this be closed?
>>>> 
>>>> thanks.
>>>> 
>>>> 
>>>> 
>>>> Haitao Yao
>>>> yao.erix@gmail.com
>>>> weibo: @haitao_yao
>>>> Skype:  haitao.yao.final
>>>> 
>>> 
>>> 
>> 
>> 


Re: Pig OutOfMemory Error on org.apache.hadoop.mapreduce.Reducer$Context

Posted by Jonathan Coveney <jc...@gmail.com>.
Make sure that any Pig process has at least 1 GB of heap. I've seen
bag-related OOMs when people don't give Pig enough heap.
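
For context on where that heap comes from: the client JVM's size is normally controlled through the pig launcher's environment (e.g. PIG_HEAPSIZE), while the map/reduce task JVMs, where the reducer context and bags actually live, get theirs from the job configuration. Below is a minimal sketch of passing the task heap, assuming the job is driven from Java via PigServer on a Hadoop 1.x-era cluster; the property name, paths, and query are illustrative placeholders.

import java.util.Properties;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

// Sketch: raise the heap of the map/reduce task JVMs, which is where bags
// and the reducer context actually live. Property name assumes a
// Hadoop 1.x-era cluster; paths and the query are placeholders.
public class HeapSettingsSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("mapred.child.java.opts", "-Xmx1024m");

        PigServer pig = new PigServer(ExecType.MAPREDUCE, props);
        pig.registerQuery("A = LOAD 'input' AS (k:chararray, v:long);");
        pig.registerQuery("B = FOREACH (GROUP A BY k) GENERATE group, SUM(A.v);");
        pig.store("B", "output");
    }
}

The same property can usually also be set at the top of a Pig script with: set mapred.child.java.opts '-Xmx1024m';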

2012/8/21 Haitao Yao <ya...@gmail.com>

> Thank you very much for this.
> But I still cannot find which snippet of the script caused this OOM. The
> heap dump was generated at midnight, and nobody was standing by.
>
> I've added some scripts to capture more information about the job. If that
> succeeds, I'll share it with you tomorrow.
> Thanks again.
>
>
>
> Haitao Yao
> yao.erix@gmail.com
> weibo: @haitao_yao
> Skype:  haitao.yao.final
>
> On 2012-8-21, at 5:22 PM, Subir S wrote:
>
> > I think it would help to get some context if you can share a small snippet
> > from your script that causes this issue. That may give a better
> > understanding of what you are trying to achieve, and whether there is some
> > workaround. However, this is up to you!
> >
> > Thanks, Subir
> >
> > On Tue, Aug 21, 2012 at 2:12 PM, Haitao Yao <ya...@gmail.com> wrote:
> > I've found the reason: org.apache.hadoop.io.DataInputBuffer. There's a
> > big byte array referenced by the DataInputBuffer.
> >
> > But there's no way to close the buffer.
> >
> > Is there any other solution?
> >
> > BTW, my script is more than 500 lines, so I'm sorry I cannot show it to you.
> >
> > Here are the screenshots:
> >
> > <aa.jpg>
> >
> >
> >
> >
> > <bb.jpg>
> >
> >
> >
> > Haitao Yao
> > yao.erix@gmail.com
> > weibo: @haitao_yao
> > Skype:  haitao.yao.final
> >
> > On 2012-8-20, at 9:49 AM, Haitao Yao wrote:
> >
> >> Hi all,
> >>      I got an OOME on org.apache.hadoop.mapreduce.Reducer$Context;
> >> here's a snapshot of the heap dump:
> >> <aa.jpg>
> >>
> >> Well, does Pig have to report so much data through the Reducer$Context?
> >> Can this be closed?
> >>
> >> thanks.
> >>
> >>
> >>
> >> Haitao Yao
> >> yao.erix@gmail.com
> >> weibo: @haitao_yao
> >> Skype:  haitao.yao.final
> >>
> >
> >
>
>

Re: Pig OutOfMemory Error on org.apache.hadoop.mapreduce.Reducer$Context

Posted by Haitao Yao <ya...@gmail.com>.
Thank you very much for this. 
But I still cannot find which snippet of the script caused this OOM. The heap dump was generated at midnight, and nobody was standing by. 

I've added some scripts to capture more information about the job. If that succeeds, I'll share it with you tomorrow. 
Thanks again. 



Haitao Yao
yao.erix@gmail.com
weibo: @haitao_yao
Skype:  haitao.yao.final

On 2012-8-21, at 5:22 PM, Subir S wrote:

> I think it would help to get some context if you can share a small snippet from your script that causes this issue. That may give a better understanding of what you are trying to achieve, and whether there is some workaround. However, this is up to you!
> 
> Thanks, Subir
> 
> On Tue, Aug 21, 2012 at 2:12 PM, Haitao Yao <ya...@gmail.com> wrote:
> I've found the reason: org.apache.hadoop.io.DataInputBuffer. There's a big byte array referenced by the DataInputBuffer.
> 
> But there's no way to close the buffer.
> 
> Is there any other solution?
> 
> BTW, my script is more than 500 lines, so I'm sorry I cannot show it to you.
> 
> Here are the screenshots:
> 
> <aa.jpg>
> 
> 
> 
> 
> <bb.jpg>
> 
> 
> 
> Haitao Yao
> yao.erix@gmail.com
> weibo: @haitao_yao
> Skype:  haitao.yao.final
> 
> On 2012-8-20, at 9:49 AM, Haitao Yao wrote:
> 
>> Hi all, 
>> 	I got an OOME on org.apache.hadoop.mapreduce.Reducer$Context; here's a snapshot of the heap dump:
>> <aa.jpg>
>> 
>> Well, does Pig have to report so much data through the Reducer$Context? 
>> Can this be closed?
>> 
>> thanks.
>> 
>> 
>> 
>> Haitao Yao
>> yao.erix@gmail.com
>> weibo: @haitao_yao
>> Skype:  haitao.yao.final
>> 
> 
> 


Re: Pig OutOfMemory Error on org.apache.hadoop.mapreduce.Reducer$Context

Posted by Subir S <su...@gmail.com>.
I think it would help to get some context if you can share a small snippet
from your script that causes this issue. That may give a better understanding
of what you are trying to achieve, and whether there is some workaround.
However, this is up to you!

Thanks, Subir

On Tue, Aug 21, 2012 at 2:12 PM, Haitao Yao <ya...@gmail.com> wrote:

> I've found the reason: org.apache.hadoop.io.DataInputBuffer. There's a
> big byte array referenced by the DataInputBuffer.
>
> But there's no way to close the buffer.
>
> Is there any other solution?
>
> BTW, my script is more than 500 lines, so I'm sorry I cannot show it to you.
>
> Here are the screenshots:
>
>
>
>
>
>
>
>
> Haitao Yao
> yao.erix@gmail.com
> weibo: @haitao_yao
> Skype:  haitao.yao.final
>
> On 2012-8-20, at 9:49 AM, Haitao Yao wrote:
>
> Hi all,
> I got an OOME on org.apache.hadoop.mapreduce.Reducer$Context; here's a
> snapshot of the heap dump:
> <aa.jpg>
>
> Well, does Pig have to report so much data through the Reducer$Context?
> Can this be closed?
>
> thanks.
>
>
>
> Haitao Yao
> yao.erix@gmail.com
> weibo: @haitao_yao
> Skype:  haitao.yao.final
>
>
>

Re: Pig OutOfMemory Error on org.apache.hadoop.mapreduce.Reducer$Context

Posted by Haitao Yao <ya...@gmail.com>.
I've found the reason: org.apache.hadoop.io.DataInputBuffer. There's a big byte array referenced by the DataInputBuffer.

But there's no way to close the buffer.
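
For readers who haven't looked at that class: DataInputBuffer simply holds a reference to the last byte[] passed to reset(), so a large serialized record stays reachable for as long as the buffer does, and there is no close() that lets it go. A minimal sketch (the array size is purely illustrative):

import org.apache.hadoop.io.DataInputBuffer;

// Sketch: a DataInputBuffer keeps a reference to whatever byte[] it was last
// reset() with; there is no close() that releases that array.
public class BufferRetentionSketch {
    public static void main(String[] args) throws Exception {
        byte[] bigRecord = new byte[64 * 1024 * 1024]; // one large serialized value
        DataInputBuffer in = new DataInputBuffer();
        in.reset(bigRecord, bigRecord.length);

        long header = in.readLong();                   // consume part of the record
        System.out.println("first 8 bytes as a long: " + header);

        // Even after the record has been read, 'in' still points at bigRecord,
        // so the 64 MB array stays reachable. Re-resetting with an empty array
        // is the only way to drop the reference while keeping the buffer around.
        in.reset(new byte[0], 0);
    }
}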

Is there any other solution?

BTW, my script is more than 500 lines, so I'm sorry I cannot show it to you.

Here are the screenshots:

<aa.jpg>

<bb.jpg>

Haitao Yao
yao.erix@gmail.com
weibo: @haitao_yao
Skype:  haitao.yao.final

On 2012-8-20, at 9:49 AM, Haitao Yao wrote:

> Hi all, 
> 	I got an OOME on org.apache.hadoop.mapreduce.Reducer$Context; here's a snapshot of the heap dump:
> <aa.jpg>
> 
> Well, does Pig have to report so much data through the Reducer$Context? 
> Can this be closed?
> 
> thanks.
> 
> 
> 
> Haitao Yao
> yao.erix@gmail.com
> weibo: @haitao_yao
> Skype:  haitao.yao.final
>