Posted to mapreduce-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2017/10/13 18:08:00 UTC

[jira] [Commented] (MAPREDUCE-5196) CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203956#comment-16203956 ] 

Steve Loughran commented on MAPREDUCE-5196:
-------------------------------------------

Right. I'm staring at this code too, looking at the diffs between branch-2 & trunk in Task.done() and ending up with this JIRA. I'm working backwards from use of the commit operations, see. Bear in mind that my lack of understanding of MR internals means I could be utterly wrong about what I'm discussing: feel free to point this out.

This is in {{Task.done()}} in trunk:

{code}
    if (taskStatus.getRunState() == TaskStatus.State.PREEMPTED ) {
      // If we are preempted, do no output promotion; signal done and exit
      committer.commitTask(taskContext);
      umbilical.preempted(taskId, taskStatus);
      taskDone.set(true);
      reporter.stopCommunicationThread();
      return;
    }
{code}

However, the normal commit path first calls {{isCommitRequired()}} to see if a commit is needed and, most critically, handles an IOException raised in {{committer.commitTask(taskContext)}} by catching it, calling {{discardOutput()}} (which aborts the task via {{abortTask()}}) and rethrowing:

{code}
    // task can Commit now  
    try {
      LOG.info("Task " + taskId + " is allowed to commit now");
      committer.commitTask(taskContext);
      return;
    } catch (IOException iee) {
      LOG.warn("Failure committing: " + 
        StringUtils.stringifyException(iee));
      //if it couldn't commit a successfully then delete the output
      discardOutput(taskContext);
      throw iee;
    }
{code}

Shouldn't the preemption codepath be doing something similar?
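
To make the question concrete, here is a minimal, self-contained sketch of what "something similar" might look like. It is not the actual Hadoop code and not a proposed patch: {{SketchCommitter}}, {{RecordingCommitter}} and {{commitOrAbort()}} are hypothetical stand-ins that only mirror the names of the commit operations discussed above, and the umbilical/reporter calls are left out entirely.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the commit-related operations of a committer.
// It mirrors the method names discussed above but is NOT the Hadoop interface.
interface SketchCommitter {
  boolean needsTaskCommit();
  void commitTask() throws IOException;
  void abortTask() throws IOException;
}

// Test double (an assumption for this sketch) that records which calls were made.
final class RecordingCommitter implements SketchCommitter {
  final List<String> calls = new ArrayList<>();
  private final boolean needsCommit;
  private final boolean failCommit;

  RecordingCommitter(boolean needsCommit, boolean failCommit) {
    this.needsCommit = needsCommit;
    this.failCommit = failCommit;
  }

  @Override public boolean needsTaskCommit() { return needsCommit; }

  @Override public void commitTask() throws IOException {
    calls.add("commit");
    if (failCommit) {
      throw new IOException("simulated commit failure");
    }
  }

  @Override public void abortTask() {
    calls.add("abort");
  }
}

final class PreemptedCommitSketch {
  // Commit only if required; on a failed commit, try to abort, then rethrow.
  // Returns true if a commit happened, false if none was needed.
  static boolean commitOrAbort(SketchCommitter committer) throws IOException {
    if (!committer.needsTaskCommit()) {
      return false;
    }
    try {
      committer.commitTask();
      return true;
    } catch (IOException e) {
      try {
        committer.abortTask();
      } catch (IOException abortFailure) {
        // Keep the original failure as the primary exception.
        e.addSuppressed(abortFailure);
      }
      throw e;
    }
  }

  public static void main(String[] args) {
    RecordingCommitter failing = new RecordingCommitter(true, true);
    try {
      commitOrAbort(failing);
    } catch (IOException e) {
      System.out.println("commit failed, calls made: " + failing.calls);
    }
  }
}
```

In this sketch the preempted branch would call the guarded helper instead of invoking {{commitTask()}} directly, so a failed commit discards the partial output before the failure propagates. Whether the preempted path should rethrow or still signal {{preempted()}} after a failed commit is exactly the open question.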



> CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing 
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5196
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5196
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am, mrv2
>            Reporter: Carlo Curino
>            Assignee: Carlo Curino
>             Fix For: 3.0.0-alpha1
>
>         Attachments: MAPREDUCE-5196.1.patch, MAPREDUCE-5196.2.patch, MAPREDUCE-5196.3.patch, MAPREDUCE-5196.patch, MAPREDUCE-5196.patch
>
>
> This JIRA tracks a checkpoint-based AM preemption policy. The policy handles propagation of the preemption requests received from the RM to the appropriate tasks, and bookkeeping of checkpoints. Actual checkpointing of the task state is handled in upcoming JIRAs.


