Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2009/05/05 18:08:30 UTC

[jira] Commented: (PIG-781) Error reporting for failed MR jobs

    [ https://issues.apache.org/jira/browse/PIG-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706083#action_12706083 ] 

Olga Natkovich commented on PIG-781:
------------------------------------

Hi Gunther,

The output looks good - this is exactly what we want.

This would solve the problem for ad hoc queries; however, we also need to make sure that users can detect partial failure programmatically. This has two parts to it.

(1) The return code they see when a program is partially successful. We need to add a new return code to http://wiki.apache.org/pig/PigErrorHandlingFunctionalSpecification for this.
(2) A per-output done file, either on DFS or on the local file system, to indicate success.

I think, for now, we should at least do (1). (2) requires more thought to make sure we don't leave done files behind forever.
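
To make (1) concrete, here is a minimal sketch in Python of how per-store outcomes might be folded into a single return code that a calling script can check. The numeric values and the function names are assumptions for illustration only; the real codes would be whatever gets added to the error handling spec.

```python
# Hypothetical return codes (assumed for this sketch; the actual values
# would be defined in the error handling spec, not here).
RC_SUCCESS = 0
RC_FAILURE = 2
RC_PARTIAL_FAILURE = 3


def overall_return_code(store_results):
    """Fold per-store outcomes (store location -> succeeded?) into one code."""
    outcomes = list(store_results.values())
    if all(outcomes):
        # Every store was written (an empty run also counts as success).
        return RC_SUCCESS
    if any(outcomes):
        # Some stores have data and some do not.
        return RC_PARTIAL_FAILURE
    return RC_FAILURE


def failed_stores(store_results):
    """Return the store locations that will not have data, sorted by name."""
    return sorted(loc for loc, ok in store_results.items() if not ok)
```

A caller would branch on the partial failure code and then consult the list of failed store locations to decide which outputs to regenerate.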

> Error reporting for failed MR jobs
> ----------------------------------
>
>                 Key: PIG-781
>                 URL: https://issues.apache.org/jira/browse/PIG-781
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Gunther Hagleitner
>         Attachments: partial_failure.patch
>
>
> If we have multiple MR jobs to run and some of them fail, the system does not stop on the first failure but keeps going. That way, jobs that do not depend on the failed job might still succeed.
> The question is how best to report this scenario to a user. How do we tell which jobs failed and which didn't?
> One way could be to tie jobs to stores and report which store locations will have data and which ones won't.
