Posted to user@pig.apache.org by Santhosh Srinivasan <sm...@yahoo-inc.com> on 2008/10/21 00:29:08 UTC

Requirements for Pig Error Handling

Dear Users,

The requirements document for error handling in Pig is now published at:
http://wiki.apache.org/pig/PigErrorHandling
Please take a look and feel free to provide feedback.

Thanks,
Santhosh

RE: Requirements for Pig Error Handling

Posted by Santhosh Srinivasan <sm...@yahoo-inc.com>.
After incorporating the feedback that I received, I have updated the
document. The final document is at:
http://wiki.apache.org/pig/PigErrorHandling

Thanks,
Santhosh 

-----Original Message-----
From: Santhosh Srinivasan 
Sent: Wednesday, October 22, 2008 6:17 PM
To: 'pig-user@incubator.apache.org'
Subject: RE: Requirements for Pig Error Handling

Hi Alan,

Thanks for the detailed comments.

1. I incorporated your comment. All error messages will have an error
code.

2. It's already mentioned in the section on Error Handling.

3. I have an example for the semantic error. Will add a runtime Hadoop
error.

4. Hadoop errors will carry different information that indicates Hadoop
as the source.

5. I have added some examples to explain this point.

6. Only aggregation will be turned off. We might want to add a switch
to turn off warnings completely.

Thanks,
Santhosh 

-----Original Message-----
From: Alan Gates [mailto:gates@yahoo-inc.com] 
Sent: Tuesday, October 21, 2008 11:48 AM
To: pig-user@incubator.apache.org
Subject: Re: Requirements for Pig Error Handling

Comments/questions:

1) "Error codes will be devised for common error messages".  All
errors should have codes.  We will probably need a catch-all category
(like "internal error" or something).  Giving all error messages
codes makes it much easier to write user manuals.

2) I think you are assuming that the stack traces etc. that are
currently output will be written to a log, but I don't see that spelled
out.  You mention that users are responsible for purging it.  You
also need to specify where the log will be located.

3) A few explicit examples of how things will look would be helpful.   
For example, if you showed a semantic error and a runtime hadoop  
error, what was printed to the screen in each case, and what was  
written to the log in each case.

4) How will errors from hadoop be shown differently than errors from  
pig?  Do you mean they'll have a different error code?  Will they  
contain different information?  Will they be written to different  
locations?

5) What does warning aggregation look like?  Will the user get  
something like:  "This query had 500 warnings, see logs for details"  
or will it be "The warning "divide by 0" was seen 498 times and the  
warning "my udf flopped" was seen 2 times" (that is summary of all  
warnings or summary by warning type)?  Will the full warning info be  
written to the logs, or only the summary?

6)  When users turn off warning aggregation, does that mean that the  
warnings are thrown away or that they are printed to the screen  
individually?  That is, does it turn off warnings or turn off  
aggregation?

Alan.

On Oct 20, 2008, at 3:29 PM, Santhosh Srinivasan wrote:

> Dear Users,
>
> The requirements document for error handling in Pig is now  
> published at:
> http://wiki.apache.org/pig/PigErrorHandling
> Please take a look and feel free to provide feedback.
>
> Thanks,
> Santhosh


Re: Requirements for Pig Error Handling

Posted by Alan Gates <ga...@yahoo-inc.com>.
Currently, in the types branch, we emit a null and continue.   In  
released code it is an error and the entire job comes to a stop.
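
The two behaviours described above can be sketched as follows. This is only an illustration in Python, not Pig code: the function names `divide_strict` and `divide_lenient` and the sample records are hypothetical, and `None` stands in for a Pig null.

```python
from typing import Optional


def divide_strict(numerator: float, denominator: float) -> float:
    """Released-code behaviour as described: the error propagates and stops the job."""
    # Raises ZeroDivisionError when denominator is 0.
    return numerator / denominator


def divide_lenient(numerator: float, denominator: float) -> Optional[float]:
    """Types-branch behaviour as described: emit a null (None here) and continue."""
    if denominator == 0:
        return None
    return numerator / denominator


# Hypothetical records: (numerator, denominator) pairs.
records = [(10, 2), (7, 0), (9, 3)]

# Lenient processing keeps going and yields a null for the bad record.
lenient_results = [divide_lenient(n, d) for n, d in records]
```

With the lenient variant the record that divides by zero produces a null and the remaining records are still processed; with the strict variant the same record would abort the whole run.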

Alan.
On Oct 23, 2008, at 3:21 PM, pi song wrote:

> How do we currently deal with runtime errors like "divide by 0"? Do we
> skip the records or just redirect them to an error output file?
> Pi


Re: Requirements for Pig Error Handling

Posted by pi song <pi...@gmail.com>.
How do we currently deal with runtime errors like "divide by 0"? Do we skip
the records or just redirect them to an error output file?
Pi
On Thu, Oct 23, 2008 at 1:16 PM, Santhosh Srinivasan <sm...@yahoo-inc.com> wrote:

> Hi Alan,
>
> Thanks for the detailed comments.
>
> 1. I incorporated your comment. All error messages will have an error
> code.
>
> 2. It's already mentioned in the section on Error Handling.
>
> 3. I have an example for the semantic error. Will add a runtime Hadoop
> error.
>
> 4. Hadoop errors will carry different information that indicates Hadoop
> as the source.
>
> 5. I have added some examples to explain this point.
>
> 6. Only aggregation will be turned off. We might want to add a switch
> to turn off warnings completely.
>
> Thanks,
> Santhosh

RE: Requirements for Pig Error Handling

Posted by Santhosh Srinivasan <sm...@yahoo-inc.com>.
Hi Alan,

Thanks for the detailed comments.

1. I incorporated your comment. All error messages will have an error
code.

2. It's already mentioned in the section on Error Handling.

3. I have an example for the semantic error. Will add a runtime Hadoop
error.

4. Hadoop errors will carry different information that indicates Hadoop
as the source.

5. I have added some examples to explain this point.

6. Only aggregation will be turned off. We might want to add a switch
to turn off warnings completely.
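
Points 1 and 4 above (every error carries a code, and Hadoop errors are marked as coming from Hadoop) could look roughly like the sketch below. It is written in Python for illustration only and is not Pig's implementation: the code numbers, the `Source` enum, the `CodedError` class, the exception-to-code mapping, and the message format are all assumptions, since the thread does not specify any of them.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Where the error originated (hypothetical labels)."""
    PIG = "Pig"
    HADOOP = "Hadoop"


# Hypothetical catch-all code for errors with no specific code assigned.
INTERNAL_ERROR_CODE = 9999

# Hypothetical mapping from exception type to error code.
ERROR_CODES = {
    ValueError: 1001,
    KeyError: 1002,
}


@dataclass(frozen=True)
class CodedError:
    code: int
    source: Source
    message: str

    def screen_line(self) -> str:
        """Short one-line form for the screen; full detail would go to a log."""
        return f"ERROR {self.code} [{self.source.value}]: {self.message}"


def to_coded_error(exc: Exception, source: Source) -> CodedError:
    """Attach a code to any exception, falling back to the catch-all code."""
    code = ERROR_CODES.get(type(exc), INTERNAL_ERROR_CODE)
    return CodedError(code=code, source=source, message=str(exc))
```

For example, `to_coded_error(RuntimeError("task failed"), Source.HADOOP).screen_line()` falls through to the catch-all code and labels the source as Hadoop, so no error is ever reported without a code.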

Thanks,
Santhosh 



Re: Requirements for Pig Error Handling

Posted by Alan Gates <ga...@yahoo-inc.com>.
Comments/questions:

1) "Error codes will be devised for common error messages".  All
errors should have codes.  We will probably need a catch-all category
(like "internal error" or something).  Giving all error messages
codes makes it much easier to write user manuals.

2) I think you are assuming that the stack traces etc. that are
currently output will be written to a log, but I don't see that spelled
out.  You mention that users are responsible for purging it.  You
also need to specify where the log will be located.

3) A few explicit examples of how things will look would be helpful.   
For example, if you showed a semantic error and a runtime hadoop  
error, what was printed to the screen in each case, and what was  
written to the log in each case.

4) How will errors from hadoop be shown differently than errors from  
pig?  Do you mean they'll have a different error code?  Will they  
contain different information?  Will they be written to different  
locations?

5) What does warning aggregation look like?  Will the user get  
something like:  "This query had 500 warnings, see logs for details"  
or will it be "The warning "divide by 0" was seen 498 times and the  
warning "my udf flopped" was seen 2 times" (that is summary of all  
warnings or summary by warning type)?  Will the full warning info be  
written to the logs, or only the summary?

6)  When users turn off warning aggregation, does that mean that the  
warnings are thrown away or that they are printed to the screen  
individually?  That is, does it turn off warnings or turn off  
aggregation?
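
The two aggregation styles contrasted in comment 5 can be sketched as below. This is a hypothetical Python illustration, not Pig code: the function names `summarize_total` and `summarize_by_type` are made up, and the sample warning stream simply reuses the numbers from the comment.

```python
from collections import Counter
from typing import Iterable, List


def summarize_total(warnings: Iterable[str]) -> str:
    """Style 1: a single summary of all warnings."""
    count = sum(1 for _ in warnings)
    return f"This query had {count} warnings, see logs for details"


def summarize_by_type(warnings: Iterable[str]) -> List[str]:
    """Style 2: one summary line per distinct warning message, most frequent first."""
    counts = Counter(warnings)
    return [
        f'The warning "{message}" was seen {seen} times'
        for message, seen in counts.most_common()
    ]


# Hypothetical warning stream matching the numbers in comment 5.
warnings = ["divide by 0"] * 498 + ["my udf flopped"] * 2
```

Either summary could be printed to the screen while the individual warnings are written to the log; which of the two the design intends is exactly what comment 5 asks.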

Alan.

On Oct 20, 2008, at 3:29 PM, Santhosh Srinivasan wrote:

> Dear Users,
>
> The requirements document for error handling in Pig is now  
> published at:
> http://wiki.apache.org/pig/PigErrorHandling
> Please take a look and feel free to provide feedback.
>
> Thanks,
> Santhosh