Posted to user@pig.apache.org by felix gao <gr...@gmail.com> on 2010/12/20 21:07:12 UTC

How to trace back to the bad record that caused the job to fail.

All,

Not sure if this is the right mailing list for this question. I am using Pig
to do some data analysis, and I am wondering if there is a way to tell Pig,
when it encounters a bad log file (whether due to a decompression failure or
whatever else caused the job to die), to record the offending line and, if
possible, the filename it was working on in some log so I can go back and
take a look at it later?

Thanks,

Felix

Re: How to trace back to the bad record that caused the job to fail.

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Hanging UDFs: use @MonitoredUDF, and provide a custom error handler if
desired :) (Pig 0.8 only)
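
For reference, a minimal sketch of what a monitored UDF could look like (the
class name, timeout, and default value below are made up for illustration,
following the pattern in the Pig 0.8 UDF docs):

import java.io.IOException;
import java.util.concurrent.TimeUnit;

import org.apache.pig.EvalFunc;
import org.apache.pig.builtin.MonitoredUDF;
import org.apache.pig.data.Tuple;

// Hypothetical UDF: if exec() has not returned within 10 seconds, the
// monitor gives up on that call and returns intDefault (-1 here) instead
// of letting the task hang.
@MonitoredUDF(timeUnit = TimeUnit.SECONDS, duration = 10, intDefault = -1)
public class ParseStatusCode extends EvalFunc<Integer> {
    @Override
    public Integer exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        // Potentially slow or hanging parsing work would go here.
        return Integer.parseInt(input.get(0).toString());
    }
}

Nothing changes on the Pig Latin side; you register and call it like any
other UDF. The custom error handler is plugged in through the annotation as
well (an errorCallback attribute, if memory serves).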

D

On Mon, Dec 20, 2010 at 12:20 PM, felix gao <gr...@gmail.com> wrote:

> Matt,
>
> This is not always the case. For example, we recently had a UDF hang: the
> job was killed because it took too long, and there wasn't anything in the
> web UI at all to indicate that. Another example: we had some corrupted
> logs, and when we loaded them the only thing in the UI was the
> corrupted-gzip error message. It is really hard to find which file was
> being read when the error occurred when more than 1000 files were loaded.
>
> Felix
>
> On Mon, Dec 20, 2010 at 12:14 PM, Matt Tanquary <matt.tanquary@gmail.com> wrote:
>
> > Errors are dumped to the log files that you can view using the Map/Reduce
> > Administration web interface. Just click the job ID in the web interface,
> > then select either Map or Reduce (depending on where the failure occurred)
> > and you will get access to the logs from there.
> >
> > -M@
> >
> > On Mon, Dec 20, 2010 at 1:07 PM, felix gao <gr...@gmail.com> wrote:
> >
> > > All,
> > >
> > > Not sure if this is the right mailing list for this question. I am
> > > using Pig to do some data analysis, and I am wondering if there is a
> > > way to tell Pig, when it encounters a bad log file (whether due to a
> > > decompression failure or whatever else caused the job to die), to
> > > record the offending line and, if possible, the filename it was working
> > > on in some log so I can go back and take a look at it later?
> > >
> > > Thanks,
> > >
> > > Felix
> > >
> >
> >
> >
> > --
> > Have you thanked a teacher today? ---> http://www.liftateacher.org
> >
>

Re: How to trace back to the bad record that caused the job to fail.

Posted by felix gao <gr...@gmail.com>.
Matt,

This is not always the case. For example, we recently had a UDF hang: the
job was killed because it took too long, and there wasn't anything in the web
UI at all to indicate that. Another example: we had some corrupted logs, and
when we loaded them the only thing in the UI was the corrupted-gzip error
message. It is really hard to find which file was being read when the error
occurred when more than 1000 files were loaded.

Felix

On Mon, Dec 20, 2010 at 12:14 PM, Matt Tanquary <ma...@gmail.com> wrote:

> Errors are dumped to the log files that you can view using the Map/Reduce
> Administration web interface. Just click the job ID in the web interface,
> then select either Map or Reduce (depending on where the failure occurred) and
> you will get access to the logs from there.
>
> -M@
>
> On Mon, Dec 20, 2010 at 1:07 PM, felix gao <gr...@gmail.com> wrote:
>
> > All,
> >
> > Not sure if this is the right mailing list for this question. I am using
> > Pig to do some data analysis, and I am wondering if there is a way to
> > tell Pig, when it encounters a bad log file (whether due to a
> > decompression failure or whatever else caused the job to die), to record
> > the offending line and, if possible, the filename it was working on in
> > some log so I can go back and take a look at it later?
> >
> > Thanks,
> >
> > Felix
> >
>
>
>
> --
> Have you thanked a teacher today? ---> http://www.liftateacher.org
>

Re: How to trace back to the bad record that caused the job to fail.

Posted by Matt Tanquary <ma...@gmail.com>.
Errors are dumped to the log files that you can view using the Map/Reduce
Administration web interface. Just click the job ID in the web interface,
then select either Map or Reduce (depending on where the failure occurred) and
you will get access to the logs from there.

-M@

On Mon, Dec 20, 2010 at 1:07 PM, felix gao <gr...@gmail.com> wrote:

> All,
>
> Not sure if this is the right mailing list for this question. I am using
> Pig to do some data analysis, and I am wondering if there is a way to tell
> Pig, when it encounters a bad log file (whether due to a decompression
> failure or whatever else caused the job to die), to record the offending
> line and, if possible, the filename it was working on in some log so I can
> go back and take a look at it later?
>
> Thanks,
>
> Felix
>



-- 
Have you thanked a teacher today? ---> http://www.liftateacher.org