Posted to user@pig.apache.org by Russell Jurney <ru...@gmail.com> on 2012/02/05 04:11:31 UTC

ONERROR

Did ONERROR ever get built?  I have a few bad datetimes out of many failing
to parse, and I don't want my entire pig script dying because I lost a few
rows.

http://wiki.apache.org/pig/PigErrorHandlingInScripts

-- 
Russell Jurney
twitter.com/rjurney
russell.jurney@gmail.com
datasyndrome.com

Re: ONERROR

Posted by Russell Jurney <ru...@gmail.com>.
Thanks, I'll add that to the patch
https://issues.apache.org/jira/browse/PIG-2515

On Mon, Feb 6, 2012 at 6:06 PM, Prashant Kommireddi <pr...@gmail.com> wrote:

> Russell, you could use PigWarning to report counters:
> http://pig.apache.org/docs/r0.9.1/api/org/apache/pig/PigWarning.html
> These should show up as counters on the JobTracker. Please make sure
> "aggregate.warning" is set to true (it is true by default, but just in
> case).
>
> try {
>     // foo bar (the UDF logic that may fail on a malformed record)
> } catch (IndexOutOfBoundsException ie) {
>     String msg = "Some message";
>     warn(msg + " --> " + ie.toString(), PigWarning.UDF_WARNING_2);
>     return null;
> } catch (NullPointerException npe) {
>     warn(npe.toString(), PigWarning.UDF_WARNING_3);
>     return null;
> } catch (ClassCastException cce) {
>     warn(cce.toString(), PigWarning.UDF_WARNING_4);
>     return null;
> }

Re: ONERROR

Posted by Prashant Kommireddi <pr...@gmail.com>.
Russell, you could use PigWarning to report counters:
http://pig.apache.org/docs/r0.9.1/api/org/apache/pig/PigWarning.html
These should show up as counters on the JobTracker. Please make sure
"aggregate.warning" is set to true (it is true by default, but just in
case).

try {
    // foo bar (the UDF logic that may fail on a malformed record)
} catch (IndexOutOfBoundsException ie) {
    // Use a distinct PigWarning value per failure type so each is counted separately.
    String msg = "Some message";
    warn(msg + " --> " + ie.toString(), PigWarning.UDF_WARNING_2);
    return null;
} catch (NullPointerException npe) {
    warn(npe.toString(), PigWarning.UDF_WARNING_3);
    return null;
} catch (ClassCastException cce) {
    warn(cce.toString(), PigWarning.UDF_WARNING_4);
    return null;
}
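
For readers without a Pig build at hand, the counting pattern above can be
tried in plain Java. This is only a sketch under stated assumptions: the
class NullingCounterSketch, its Warning enum, and its warn() and count()
methods are hypothetical stand-ins for the UDF, PigWarning, and the warn()
call used above. They are not Pig APIs, and nothing here reports to the
JobTracker.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for a Pig UDF; illustrates "warn, count, return null".
final class NullingCounterSketch {

    // Hypothetical categories, standing in for PigWarning.UDF_WARNING_n.
    enum Warning { BAD_INDEX, NULL_INPUT }

    private final Map<Warning, Long> counts = new EnumMap<>(Warning.class);

    // Stand-in for the warn() call in the snippet above: it only tallies per category.
    void warn(String message, Warning kind) {
        counts.merge(kind, 1L, Long::sum);
    }

    // Number of records nulled so far for the given category.
    long count(Warning kind) {
        return counts.getOrDefault(kind, 0L);
    }

    // Returns the trimmed field at the given index, or null for a bad record.
    String fieldAt(String[] fields, int index) {
        try {
            return fields[index].trim();
        } catch (IndexOutOfBoundsException e) {
            warn("index " + index + " --> " + e, Warning.BAD_INDEX);
            return null;
        } catch (NullPointerException e) {
            warn(e.toString(), Warning.NULL_INPUT);
            return null;
        }
    }

    public static void main(String[] args) {
        NullingCounterSketch udf = new NullingCounterSketch();
        udf.fieldAt(new String[] {"a", "b"}, 5); // index out of range
        udf.fieldAt(null, 0);                    // null input
        System.out.println("bad index: " + udf.count(Warning.BAD_INDEX));
        System.out.println("null input: " + udf.count(Warning.NULL_INPUT));
    }
}
```

Keeping one category per failure type means the totals show which kind of
bad record is being dropped, not just how many.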

On Mon, Feb 6, 2012 at 6:01 PM, Russell Jurney <ru...@gmail.com> wrote:

> Is there a way to report the records we null through counters or something?

Re: ONERROR

Posted by Russell Jurney <ru...@gmail.com>.
Is there a way to report the records we null through counters or something?

On Mon, Feb 6, 2012 at 5:22 PM, Dmitriy Ryaboy <dv...@gmail.com> wrote:

> Try / catch / return null seems like exactly the right thing to do.
> You will note that a lot of string parsing UDFs in piggybank work that way.

Re: ONERROR

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Try / catch / return null seems like exactly the right thing to do.
You will note that a lot of string parsing UDFs in piggybank work that way.
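
A minimal sketch of the try / catch / return null approach, written as plain
Java so it runs without Pig on the classpath. The class ForgivingDateParser
and the method rfc1123ToIso are hypothetical names for illustration. This is
not the piggybank CustomFormatToISO code, and it assumes java.time (Java 8 or
later) rather than whatever date library the real UDF uses. In an actual UDF
the same try/catch would sit inside exec().

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper; stands in for the body of a date-converting UDF.
final class ForgivingDateParser {

    private ForgivingDateParser() {}

    // Converts an RFC 1123 date string to ISO 8601.
    // Returns null instead of throwing when the input is missing or malformed,
    // so one bad record does not fail the whole job.
    static String rfc1123ToIso(String raw) {
        if (raw == null) {
            return null;
        }
        try {
            OffsetDateTime parsed =
                OffsetDateTime.parse(raw.trim(), DateTimeFormatter.RFC_1123_DATE_TIME);
            return parsed.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(rfc1123ToIso("Sun, 05 Feb 2012 04:11:31 GMT"));
        System.out.println(rfc1123ToIso("not a date"));
    }
}
```

Rows whose date comes back null can then be filtered out downstream, which
keeps the few bad records from killing the script.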

On Mon, Feb 6, 2012 at 3:27 PM, Russell Jurney <ru...@gmail.com> wrote:

> I just had to copy CustomFormatToISO and create ForgivingCustomFormatToISO
> that does a try/catch/return null, because 0.01% of my records have bad
> RFC1123 dates in them.  This seems very, very wrong.
>
> Is there a better way than this at the moment, or is this something that
> must be addressed with ONERROR?
>
> Russ

Re: ONERROR

Posted by Russell Jurney <ru...@gmail.com>.
I just had to copy CustomFormatToISO and create ForgivingCustomFormatToISO
that does a try/catch/return null, because 0.01% of my records have bad
RFC1123 dates in them.  This seems very, very wrong.

Is there a better way than this at the moment, or is this something that
must be addressed with ONERROR?

Russ

On Sun, Feb 5, 2012 at 9:04 PM, Daniel Dai <da...@hortonworks.com> wrote:

> No, there is no ONERROR handler right now.
>
> Daniel

Re: ONERROR

Posted by Daniel Dai <da...@hortonworks.com>.
No, there is no ONERROR handler right now.

Daniel

On Sat, Feb 4, 2012 at 7:11 PM, Russell Jurney <ru...@gmail.com> wrote:
> Did ONERROR ever get built?  I have a few bad datetimes out of many failing
> to parse, and I don't want my entire pig script dying because I lost a few
> rows.
>
> http://wiki.apache.org/pig/PigErrorHandlingInScripts