Posted to mapreduce-issues@hadoop.apache.org by "Wangda Tan (JIRA)" <ji...@apache.org> on 2016/05/05 00:42:12 UTC
[jira] [Commented] (MAPREDUCE-6689) MapReduce job can infinitely
increasing number of reducer resource requests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271710#comment-15271710 ]
Wangda Tan commented on MAPREDUCE-6689:
---------------------------------------
One quick solution for this issue: modify {{preemptReducesIfNeeded}} to return whether preemption happened. If it did, skip the {{scheduleReduces}} call that follows in the same heartbeat.
CC: [~kasha], [~jlowe].
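
To make the proposal concrete, here is a minimal toy model of the heartbeat loop. It is only a sketch, not the actual RMContainerAllocator code: the class ReducerRampSketch, its fields, and the simulate helper are hypothetical, and the two methods are simplified stand-ins that only borrow the names {{preemptReducesIfNeeded}} and {{scheduleReduces}}. It counts how many reducer requests the AM re-issues and does not model RM-side accounting. It also assumes that once a heartbeat preempts reducers without ramping them back up, the failed mappers get their containers before the next heartbeat.

```java
public class ReducerRampSketch {
    static final int TOTAL_REDUCERS = 1024;

    int scheduledReduces = TOTAL_REDUCERS; // outstanding reducer requests
    int pendingReduces = 0;                // reducers waiting to be ramped up
    long requestsIssued = 0;               // cumulative reducer requests re-issued
    boolean mapsStarved = true;            // failed mappers still need containers

    /** Toy stand-in: cancels all scheduled reducer requests and reports whether it did. */
    boolean preemptReducesIfNeeded() {
        if (!mapsStarved || scheduledReduces == 0) {
            return false;
        }
        pendingReduces += scheduledReduces;
        scheduledReduces = 0;
        return true;
    }

    /** Toy stand-in: ramps up every pending reducer. */
    void scheduleReduces() {
        requestsIssued += pendingReduces;
        scheduledReduces += pendingReduces;
        pendingReduces = 0;
    }

    /** One allocator heartbeat; skipAfterPreemption enables the proposed fix. */
    void heartbeat(boolean skipAfterPreemption) {
        boolean preempted = preemptReducesIfNeeded();
        if (skipAfterPreemption && preempted) {
            // Assumption of this sketch: the freed headroom lets the failed
            // mappers run before the next heartbeat.
            mapsStarved = false;
            return;
        }
        // Without the fix, the reducers cancelled above are requested again at once.
        scheduleReduces();
    }

    /** Runs the given number of heartbeats and returns the requests re-issued. */
    static long simulate(int heartbeats, boolean skipAfterPreemption) {
        ReducerRampSketch am = new ReducerRampSketch();
        for (int i = 0; i < heartbeats; i++) {
            am.heartbeat(skipAfterPreemption);
        }
        return am.requestsIssued;
    }

    public static void main(String[] args) {
        int heartbeats = 18 * 3600; // 18 hours at one heartbeat per second
        System.out.println("without fix: " + simulate(heartbeats, false));
        System.out.println("with fix:    " + simulate(heartbeats, true));
    }
}
```

In this model the unfixed loop re-issues 1024 requests on every heartbeat, which matches the 18 * 3600 * 1024 figure in the description, while the fixed loop ramps the reducers up only once after the mappers are no longer starved.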
> MapReduce job can infinitely increasing number of reducer resource requests
> ---------------------------------------------------------------------------
>
> Key: MAPREDUCE-6689
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6689
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Priority: Blocker
>
> We have seen this issue on one of our clusters: while running a terasort MapReduce job, some mappers failed after the reducers had started, and the MR AM then tried to preempt reducers to schedule the failed mappers.
> After that, the MR AM entered an infinite loop. On every RMContainerAllocator#heartbeat run:
> - In {{preemptReducesIfNeeded}}, it cancels all scheduled reducer requests. (total scheduled reducers = 1024)
> - Then, in {{scheduleReduces}}, it ramps up all reducers (total = 1024).
> As a result, the total number of requested containers increased by 1024 on every MR AM-RM heartbeat (one heartbeat per second). The AM hung for 18+ hours, so the RM ended up with roughly 18 * 3600 * 1024 ~ 66M+ requested containers.
> This bug also triggered YARN-4844, which made the RM stop scheduling anything.
> Thanks to [~sidharta-s] for helping with analysis.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)