Posted to mapreduce-issues@hadoop.apache.org by "Neil Jonkers (JIRA)" <ji...@apache.org> on 2015/09/26 23:56:04 UTC

[jira] [Commented] (MAPREDUCE-6066) Speculative attempts should not run on the same node as their original attempt

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14909499#comment-14909499 ] 

Neil Jonkers commented on MAPREDUCE-6066:
-----------------------------------------

We also see this with Hadoop 2.6.0 using the Capacity Scheduler.
From job conf:
yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler

Relevant section from the AM logs:

2015-09-10 05:02:23,102 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1441860050089_0008_m_000450_0] using containerId: [container_1441860050089_0008_01_000452 on NM: [ip-172-31-8-34.ec2.internal:8041] 
2015-09-10 05:02:23,103 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1441860050089_0008_m_000450_0 TaskAttempt Transitioned from ASSIGNED to RUNNING 

2015-09-10 05:09:10,080 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: TaskAttempt: [attempt_1441860050089_0008_m_000450_1] using containerId: [container_1441860050089_0008_01_000558 on NM: [ip-172-31-8-34.ec2.internal:8041] 
2015-09-10 05:09:10,080 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1441860050089_0008_m_000450_1 TaskAttempt Transitioned from ASSIGNED to RUNNING 
2015-09-10 05:09:10,080 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator: ATTEMPT_START task_1441860050089_0008_m_000450 
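
To check a whole AM log for this pattern instead of reading it by eye, something like the following sketch could be used. It assumes the "using containerId ... on NM" lines always have the shape shown in the excerpt above; the function name colocated_attempts and the regular expression are illustrative only and are not part of Hadoop.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the TaskAttemptImpl lines quoted above:
#   TaskAttempt: [attempt_<app>_<job>_<m|r>_<task>_<n>] using containerId:
#   [container_... on NM: [<host>:<port>]
ASSIGN_RE = re.compile(
    r"TaskAttempt: \[attempt_(\d+_\d+_[mr]_\d+)_(\d+)\] "
    r"using containerId: \[\S+ on NM: \[([^:\]]+):\d+\]"
)


def colocated_attempts(lines):
    """Map each task to the hosts that ran two or more of its attempts.

    Returns {task_id: {host: [attempt numbers]}}; tasks whose attempts
    all ran on distinct hosts are omitted.
    """
    placements = defaultdict(lambda: defaultdict(list))
    for line in lines:
        match = ASSIGN_RE.search(line)
        if match is None:
            continue  # not a container-assignment line
        task_id = "task_" + match.group(1)
        attempt_no = int(match.group(2))
        host = match.group(3)
        placements[task_id][host].append(attempt_no)

    result = {}
    for task_id, hosts in placements.items():
        shared = {h: nums for h, nums in hosts.items() if len(nums) > 1}
        if shared:
            result[task_id] = shared
    return result


if __name__ == "__main__":
    import sys

    # Usage: python colocated.py < am_syslog
    for task, hosts in colocated_attempts(sys.stdin).items():
        for host, attempts in hosts.items():
            print(f"{task}: attempts {attempts} all ran on {host}")
```

Run against the two assignment lines above, this reports attempts 0 and 1 of task_1441860050089_0008_m_000450 on ip-172-31-8-34.ec2.internal. It only tells you that two attempts shared a node, not that the second one was speculative; the DefaultSpeculator line in the log still has to be checked for that.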


> Speculative attempts should not run on the same node as their original attempt
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6066
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6066
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster, scheduler
>    Affects Versions: 2.5.0
>            Reporter: Todd Lipcon
>         Attachments: conf.xml
>
>
> I'm seeing a behavior on trunk with fair scheduler enabled where a speculative reduce attempt is getting run on the same node as its original attempt. This doesn't make sense -- the main reason for speculative execution is to deal with a slow node, so scheduling a second attempt on the same node would just make the problem worse if anything.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)