Posted to user@pig.apache.org by shan s <my...@gmail.com> on 2012/04/20 22:34:05 UTC

Confusing progress and error reporting

I have a job whose input file, strd.txt, is 1.5 GB with 2,458,220 records. I get
24 maps assigned for the first job.

The first 9 maps are working on the same file, with each split covering
~333,849 records. Of these, maps m_000000 through m_000006 show as complete,
each taking <= 6 seconds.
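
As a rough sanity check on the figures above (a sketch only, assuming records are spread evenly across splits and that ~333,849 is the typical split size; the variable names are illustrative), the record counts imply about 8 splits for this file, with the final split smaller than the rest:

```python
import math

# Figures quoted in the message above.
total_records = 2_458_220
records_per_split = 333_849  # approximate, as reported

# Assumption: every split except the last holds records_per_split records.
splits_needed = math.ceil(total_records / records_per_split)
records_in_last_split = total_records - (splits_needed - 1) * records_per_split

print(splits_needed)          # 8
print(records_in_last_split)  # 121277
```

Under that assumption the eighth split would hold the fewest records, which suggests that record count alone may not explain a heap error in that task.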



However, the last of these tasks, m_000007, hangs and then fails. Meanwhile,
m_000008 through m_000023 are shown as complete.

How should I interpret this?



Also, task m_000007 fails due to a Java heap space error, but that error is
eventually overwritten by a timeout error, masking the real cause.

On the pseudo-distributed cluster I never saw the Java heap space error; it
always showed me only the timeout error.

I am on CDH3u2.

Thanks, Prashant.

Re: Confusing progress and error reporting

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Please send the pig script and describe your input data characteristics.

D
