Posted to user@pig.apache.org by Russell Jurney <ru...@gmail.com> on 2013/12/20 21:31:32 UTC

ON ERROR

Does anyone think ON ERROR will ever get built into Pig? It would be so cool,
and would put Pig above all other data flow tools in sophistication for large ETL.

I would work on that, if someone would pay me to do it.


-- 
Russell Jurney twitter.com/rjurney russell.jurney@gmail.com datasyndrome.com

Re: ON ERROR

Posted by Russell Jurney <ru...@gmail.com>.
So, to give this a little more detail: Pig currently will fail a 1 PB
MapReduce job if one record is malformed. In most use cases, that is insane
behavior. The ON ERROR proposal lets you handle errors in a reasonable
manner: specify thresholds to fail at, and split errant records off into
another relation to study later.
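
For illustration only, here is the threshold-and-split idea sketched in plain
Python rather than Pig Latin. This is not the ON ERROR syntax from the
proposal; the names process_with_error_split, max_error_rate, and
ErrorThresholdExceeded are made up for this example, and the threshold is
checked once at the end of the input rather than while streaming.

```python
from typing import Any, Callable, Iterable, List, Tuple


class ErrorThresholdExceeded(Exception):
    """Raised when too large a fraction of records failed to parse."""


def process_with_error_split(
    records: Iterable[Any],
    parse: Callable[[Any], Any],
    max_error_rate: float = 0.01,
) -> Tuple[List[Any], List[Tuple[Any, str]]]:
    """Parse every record, routing failures into a separate 'bad' list.

    Hypothetical helper: one malformed record no longer aborts the run.
    The run only fails if the share of bad records exceeds max_error_rate.
    """
    good: List[Any] = []
    bad: List[Tuple[Any, str]] = []
    total = 0
    for record in records:
        total += 1
        try:
            good.append(parse(record))
        except Exception as exc:
            # Keep the raw record and the reason so it can be studied later.
            bad.append((record, str(exc)))
    if total and len(bad) / total > max_error_rate:
        raise ErrorThresholdExceeded(
            f"{len(bad)} of {total} records failed (limit {max_error_rate:.2%})"
        )
    return good, bad


if __name__ == "__main__":
    # One malformed record out of four: tolerated at a 50% threshold.
    good, bad = process_with_error_split(["1", "2", "x", "4"], int, 0.5)
    print(good)
    print([raw for raw, _reason in bad])
```

With a stricter limit (say max_error_rate=0.1) the same input raises
ErrorThresholdExceeded instead, which is the "thresholds to fail at" half of
the idea; the bad list is the "another relation to study later" half.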


Re: ON ERROR

Posted by Russell Jurney <ru...@gmail.com>.
http://wiki.apache.org/pig/PigErrorHandlingInScripts
https://issues.apache.org/jira/plugins/servlet/mobile#issue/PIG-2620


Re: ON ERROR

Posted by Ruslan Al-Fakikh <me...@gmail.com>.
Hi Russell,

Could you be more specific? What would this operator do?
Does it have something to do with control logic (like IF/ELSE, WHILE, etc.)?
AFAIK, those are not present in Pig because they would make Pig less clean.

Thanks

