Posted to mapreduce-issues@hadoop.apache.org by "Haibo Chen (JIRA)" <ji...@apache.org> on 2016/07/21 08:04:20 UTC

[jira] [Commented] (MAPREDUCE-6679) on node failure, only restart mappers whose output is not copied

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387330#comment-15387330 ] 

Haibo Chen commented on MAPREDUCE-6679:
---------------------------------------

[~alvin.chyan@gmail.com] We could avoid rescheduling a succeeded map task on a bad node only if all reducers have already copied its output.  Shuffle requests are served by the ShuffleHandler on each NM, which does not communicate with the MR AM, so I believe the MR AM has no way to tell whether all reducers have fetched from a given map task.  On the other hand, even if all reducers have copied the output of a succeeded mapper, there is still a possibility that we need to reschedule that map task: as you said, any of the reducers can fail after copying the output, and the second attempt of the failed reducer will likely fail as well, because the map output is on a bad node.
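For reference, a minimal Java sketch of the behavior under discussion. This is not the Hadoop source: the real logic lives in JobImpl.actOnUnusableNode (per the issue description below), and the plain string ids, the field name, and the isMapAttempt/reschedule helpers here are illustrative stand-ins.

{code:java}
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not the actual JobImpl code.
class NodeFailureHandlingSketch {

    // node id -> attempts that succeeded on that node
    Map<String, List<String>> nodesToSucceededTaskAttempts;

    // Current behavior: on a node-unusable event, every succeeded map
    // attempt on that node is rescheduled. The AM cannot do better on its
    // own, because shuffle fetches are served by the ShuffleHandler on the
    // NM, which never reports fetch completion back to the AM.
    void actOnUnusableNode(String nodeId) {
        List<String> succeeded = nodesToSucceededTaskAttempts.get(nodeId);
        if (succeeded == null) {
            return;
        }
        for (String attemptId : succeeded) {
            if (isMapAttempt(attemptId)) {
                reschedule(attemptId, "ran on unusable node " + nodeId);
            }
        }
    }

    boolean isMapAttempt(String attemptId) {
        return attemptId.startsWith("m_"); // stand-in for a TaskType.MAP check
    }

    void reschedule(String attemptId, String reason) {
        // stand-in for killing the attempt so the map task re-runs
    }
}
{code}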

> on node failure, only restart mappers whose output is not copied
> ----------------------------------------------------------------
>
>                 Key: MAPREDUCE-6679
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6679
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 2.7.0
>            Reporter: Alvin Chyan
>            Priority: Minor
>
> When we detect a bad node, we reschedule all succeeded map tasks on that node in JobImpl.actOnUnusableNode. Could we get away with rescheduling only the map tasks whose outputs have not already been copied to a reducer? (A sketch of this check follows the quoted description below.)
> One consideration is that a reducer that has fetched the mapper output could itself be killed afterwards. However, in testing, it seems that once a reducer has moved past the shuffle phase and is reducing, the mappers don't get rescheduled even if the mapper node fails. The same mechanism that kicks in when a reducer dies in that situation could be applied in this scenario as well.
> This is helpful in general, but it is especially beneficial in cloud environments that offer spot/preemptible instances. As long as reducers are running and continually fetching mapper outputs, the job can make progress whenever a preemptible instance stays up long enough for a map task to complete.
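To make the proposal concrete, here is a hypothetical variant of the sketch above. The fetchedByAllReducers set does not exist in the MR AM today; as Haibo's comment notes, populating it would require a new fetch-completion report from reducers to the AM, and even then a later reducer failure could force the map to be re-run anyway.

{code:java}
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the proposed optimization; none of this tracking
// exists in the MR AM today.
class ProposedNodeFailureHandlingSketch {

    // node id -> attempts that succeeded on that node
    Map<String, List<String>> nodesToSucceededTaskAttempts;

    // Hypothetical: map attempts whose output every reducer has confirmed
    // fetching. The ShuffleHandler provides no such signal, so this would
    // need a new reducer-to-AM fetch-completion report.
    Set<String> fetchedByAllReducers;

    void actOnUnusableNode(String nodeId) {
        List<String> succeeded = nodesToSucceededTaskAttempts.get(nodeId);
        if (succeeded == null) {
            return;
        }
        for (String attemptId : succeeded) {
            if (!isMapAttempt(attemptId)) {
                continue;
            }
            if (fetchedByAllReducers.contains(attemptId)) {
                // Proposed: skip the re-run, since every reducer already
                // holds a copy of this output. Caveat from the discussion:
                // if a reducer fails later, its retry would still need the
                // lost output, forcing a reschedule at that point anyway.
                continue;
            }
            reschedule(attemptId, "ran on unusable node " + nodeId);
        }
    }

    boolean isMapAttempt(String attemptId) {
        return attemptId.startsWith("m_"); // stand-in for a TaskType.MAP check
    }

    void reschedule(String attemptId, String reason) {
        // stand-in for killing the attempt so the map task re-runs
    }
}
{code}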


