Posted to mapreduce-issues@hadoop.apache.org by "Junping Du (JIRA)" <ji...@apache.org> on 2015/11/04 18:23:28 UTC

[jira] [Updated] (MAPREDUCE-5485) Allow repeating job commit by extending OutputCommitter API

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-5485:
----------------------------------
    Attachment: MAPREDUCE-5485-v1.patch

Attached a new patch that addresses Bikas' comments above. It includes:
1. Move the retry logic to committer.commitJob() rather than MRAppMaster.
2. Fail the AM instead of the job when an exception happens during jobCommit, if commitJob() is repeatable.
3. Add related unit tests.
Verified that this feature works on a small-scale cluster: after killing the AM during the job commit stage, the job continues and succeeds once the AM has restarted.
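
For readers following along, the behaviour in points 1 and 2 can be sketched roughly as below. This is only an illustration and is not taken from the patch: the Committer interface, the Outcome enum, and the handleCommit, runWithRestarts and failingFirst helpers are hypothetical stand-ins, not Hadoop classes. It assumes a committer can report whether its commit is repeatable, and that an AM restart is modelled as simply calling commitJob() again.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RepeatableCommitSketch {

    /** Hypothetical stand-in for an output committer; NOT the Hadoop class. */
    interface Committer {
        void commitJob() throws Exception;
        boolean isCommitRepeatable();
    }

    /** Possible results of one commit attempt by one (simulated) AM. */
    enum Outcome { JOB_SUCCEEDED, JOB_FAILED, AM_FAILED_RETRY_ON_RESTART }

    /**
     * One AM attempt: if the commit throws and the committer says the commit
     * is repeatable, fail only the AM so that a restarted AM can try again;
     * otherwise fail the whole job.
     */
    static Outcome handleCommit(Committer committer) {
        try {
            committer.commitJob();
            return Outcome.JOB_SUCCEEDED;
        } catch (Exception e) {
            return committer.isCommitRepeatable()
                    ? Outcome.AM_FAILED_RETRY_ON_RESTART
                    : Outcome.JOB_FAILED;
        }
    }

    /** Simulates up to maxAttempts AM attempts; the job fails once they run out. */
    static Outcome runWithRestarts(Committer committer, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Outcome outcome = handleCommit(committer);
            if (outcome != Outcome.AM_FAILED_RETRY_ON_RESTART) {
                return outcome;
            }
        }
        return Outcome.JOB_FAILED;
    }

    /** Test helper: a committer whose first `failures` commit calls throw. */
    static Committer failingFirst(int failures, boolean repeatable) {
        AtomicInteger calls = new AtomicInteger();
        return new Committer() {
            @Override
            public void commitJob() throws Exception {
                if (calls.incrementAndGet() <= failures) {
                    throw new Exception("simulated AM crash during commit");
                }
            }

            @Override
            public boolean isCommitRepeatable() {
                return repeatable;
            }
        };
    }

    public static void main(String[] args) {
        // Repeatable commit: the second AM attempt redoes the commit.
        System.out.println(runWithRestarts(failingFirst(1, true), 2));
        // Non-repeatable commit: the first failure fails the job.
        System.out.println(runWithRestarts(failingFirst(1, false), 2));
    }
}
```

In this sketch the repeatable case ends in JOB_SUCCEEDED on the second attempt, while the non-repeatable case ends in JOB_FAILED after the first failure, which mirrors the kill-AM-during-commit check described above.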

> Allow repeating job commit by extending OutputCommitter API
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5485
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5485
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 2.1.0-beta
>            Reporter: Nemon Lou
>            Assignee: Junping Du
>         Attachments: MAPREDUCE-5485-demo-2.patch, MAPREDUCE-5485-demo.patch, MAPREDUCE-5485-v1.patch
>
>
> There is a chance that MRAppMaster crashes during job commit, or that a NodeManager restart causes the committing AM to exit due to container expiry. In these cases, the job will fail.
> However, some jobs can redo the commit, so failing the job is unnecessary.
> Letting clients tell the AM whether or not to allow redoing the commit is a better choice.
> This idea comes from Jason Lowe's comments in MAPREDUCE-4819



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)