Posted to mapreduce-issues@hadoop.apache.org by "Mikayla Konst (JIRA)" <ji...@apache.org> on 2018/12/01 02:23:00 UTC

[jira] [Created] (MAPREDUCE-7168) Add option to not kill already-done map tasks when node becomes unusable

Mikayla Konst created MAPREDUCE-7168:
----------------------------------------

             Summary: Add option to not kill already-done map tasks when node becomes unusable
                 Key: MAPREDUCE-7168
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7168
             Project: Hadoop Map/Reduce
          Issue Type: New Feature
          Components: mrv2
    Affects Versions: 2.9.2
         Environment: Google Compute Engine (Dataproc), Java 8
            Reporter: Mikayla Konst


When a node becomes unusable while reduce tasks are still running, every completed map task that ran on that node is killed so that it can be re-run on a different node. This is because the unusable node can no longer serve shuffle data, so the reducers cannot fetch the output of those map tasks.

If map tasks do not write their shuffle data to the local node, killing already-done map tasks makes the job lose map progress unnecessarily. This change adds a property, mapreduce.map.rerun-if-node-unusable, that can be set to false so that already-done map tasks are not killed when their node becomes unusable.
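
The decision described above can be illustrated with a minimal, self-contained sketch. It is not the actual patch and does not use any Hadoop classes: RerunPolicySketch, CompletedMap, and mapsToRerun are hypothetical names, java.util.Properties stands in for the job configuration, and the default of true is an assumption inferred from the current behavior described in this issue. Only the property name comes from the issue itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class RerunPolicySketch {
    // Property name taken from the issue description.
    static final String RERUN_IF_NODE_UNUSABLE = "mapreduce.map.rerun-if-node-unusable";

    /** Hypothetical stand-in for a completed map task; not a Hadoop class. */
    static final class CompletedMap {
        final String taskId;
        final String nodeId;

        CompletedMap(String taskId, String nodeId) {
            this.taskId = taskId;
            this.nodeId = nodeId;
        }
    }

    /**
     * Returns the IDs of completed map tasks that would be killed and re-run
     * when unusableNode is lost. The default of "true" is an assumption that
     * mirrors the existing behavior described in the issue.
     */
    static List<String> mapsToRerun(Properties conf, String unusableNode,
                                    List<CompletedMap> completedMaps,
                                    boolean reducersStillRunning) {
        boolean rerun = Boolean.parseBoolean(
                conf.getProperty(RERUN_IF_NODE_UNUSABLE, "true"));
        List<String> result = new ArrayList<>();
        // Nothing to re-run if the option is off or no reducer needs the output.
        if (!rerun || !reducersStillRunning) {
            return result;
        }
        for (CompletedMap m : completedMaps) {
            if (m.nodeId.equals(unusableNode)) {
                result.add(m.taskId);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<CompletedMap> maps = Arrays.asList(
                new CompletedMap("m_000000", "nodeA"),
                new CompletedMap("m_000001", "nodeB"),
                new CompletedMap("m_000002", "nodeA"));

        Properties conf = new Properties();
        System.out.println("default: " + mapsToRerun(conf, "nodeA", maps, true));

        // Shuffle data is assumed to live off-node, so keep finished maps.
        conf.setProperty(RERUN_IF_NODE_UNUSABLE, "false");
        System.out.println("disabled: " + mapsToRerun(conf, "nodeA", maps, true));
    }
}
```

With the property left unset, the sketch selects the two completed maps that ran on the lost node; with it set to false, no completed map is selected, which is the behavior this issue proposes for jobs whose shuffle data is not stored on the node that ran the map.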


