Posted to mapreduce-issues@hadoop.apache.org by "Craig Welch (JIRA)" <ji...@apache.org> on 2015/02/10 22:11:13 UTC

[jira] [Commented] (MAPREDUCE-5547) Job history should not be flushed to JHS until AM gets unregistered

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314937#comment-14314937 ] 

Craig Welch commented on MAPREDUCE-5547:
----------------------------------------

I think it will be very problematic to move the unregistration of the job ahead of the upload of the job history logs. As far as I know, the grace period is just the application master waiting and still accepting requests; the resource manager begins forwarding clients immediately at unregistration. That means that if we unregister and then upload the job history file, there will definitely be a window in which clients are sent to the job history server and fail.  Also, resource manager restarts are far less frequent than job completions and checks, and we don't want to cause problems with the latter to improve the situation with the former (I think [~zjshen] & [~jlowe] made this point above, and I concur).  I think the solution needs to be something like a rollback, where a job can change its state back to one that causes the client to go to the AM again. Clients looking directly at the job history server may still get different results, but clients going through the RM to get at state will again be directed to the AM, which has newly restarted.  If this is a significant concern, we could also provide a mechanism that lets the AM purge its state from the job history server, to achieve full "state correctness" for this case.
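
To make the ordering argument concrete, here is a minimal toy model of the client routing described above. It is only a sketch under stated assumptions: ToyCluster, AppState, and every method name are hypothetical and are not part of the YARN or MapReduce APIs, and the real RM, AM, and JHS interactions are considerably more involved.

```python
from enum import Enum


class AppState(Enum):
    RUNNING = "RUNNING"    # the RM sends clients to the AM
    FINISHED = "FINISHED"  # the RM sends clients to the job history server


class ToyCluster:
    """Hypothetical toy model of RM routing; not the real Hadoop API."""

    def __init__(self):
        self.state = AppState.RUNNING
        self.history_uploaded = False

    def unregister(self):
        # Assumption: the RM starts redirecting clients immediately.
        self.state = AppState.FINISHED

    def upload_history(self):
        self.history_uploaded = True

    def rollback(self):
        # The rollback idea: a restarted AM moves the job back to a
        # state that routes clients to the AM again.
        self.state = AppState.RUNNING

    def client_request(self):
        if self.state is AppState.RUNNING:
            return "served by AM"
        if self.history_uploaded:
            return "served by JHS"
        return "failed: JHS has no history yet"


def unregister_then_upload():
    """Unregister first: clients fail until the upload finishes."""
    cluster = ToyCluster()
    cluster.unregister()
    during_window = cluster.client_request()
    cluster.upload_history()
    after_upload = cluster.client_request()
    return during_window, after_upload


def upload_then_rollback():
    """Upload first, then roll back once the AM has restarted."""
    cluster = ToyCluster()
    cluster.upload_history()
    cluster.unregister()
    before_rollback = cluster.client_request()
    cluster.rollback()
    after_rollback = cluster.client_request()
    return before_rollback, after_rollback
```

In this model, unregister_then_upload() exposes the failure window, while upload_then_rollback() shows clients that go through the RM being sent back to the AM after the rollback, even though the JHS still holds the earlier history.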

> Job history should not be flushed to JHS until AM gets unregistered
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5547
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5547
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>         Attachments: MAPREDUCE-5547.1.patch
>
>



