Posted to mapreduce-issues@hadoop.apache.org by "Zhaohui Xin (JIRA)" <ji...@apache.org> on 2019/02/12 04:08:00 UTC

[jira] [Comment Edited] (MAPREDUCE-7169) Speculative attempts should not run on the same node

    [ https://issues.apache.org/jira/browse/MAPREDUCE-7169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765655#comment-16765655 ] 

Zhaohui Xin edited comment on MAPREDUCE-7169 at 2/12/19 4:07 AM:
-----------------------------------------------------------------

[~bibinchundatt], [~lichen1109]. There are two ways to solve this problem.
 * *Before Container Request:* Ignore the data-locality container request for the speculative task attempt.
 * *After Container Allocated:* When the allocated container is on the same node as the last attempt, release it and request another. We should also cap the number of retries.

I think way 2 is better, because we should still honor data locality even for speculative task attempts.
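A minimal sketch of way 2 might look like the following. Note this is illustrative only: the class, method, and the retry limit `MAX_RELOCATION_RETRIES` are hypothetical names, not part of the actual MapReduce AM code.

```java
// Hypothetical sketch of "way 2": after a container is allocated for a
// speculative attempt, reject it if it lands on the same node as the
// original attempt, up to a bounded number of retries.
public class SpeculativePlacement {
    // Assumed cap on how many same-node containers we release before
    // giving up and accepting the allocation anyway.
    static final int MAX_RELOCATION_RETRIES = 3;

    /**
     * Decide whether to accept an allocated container for a speculative attempt.
     *
     * @param allocatedNode node the scheduler assigned the container to
     * @param originalNode  node running the original (slow) attempt
     * @param retriesSoFar  containers already released for this attempt
     * @return true to accept the container, false to release and re-request
     */
    static boolean shouldAccept(String allocatedNode, String originalNode, int retriesSoFar) {
        if (!allocatedNode.equals(originalNode)) {
            return true; // different node: safe to run the speculative attempt here
        }
        // Same node as the struggling attempt: release and re-request,
        // unless we have already hit the retry limit.
        return retriesSoFar >= MAX_RELOCATION_RETRIES;
    }

    public static void main(String[] args) {
        System.out.println(shouldAccept("nodeB", "nodeA", 0)); // true
        System.out.println(shouldAccept("nodeA", "nodeA", 0)); // false
        System.out.println(shouldAccept("nodeA", "nodeA", 3)); // true
    }
}
```

The retry cap is what keeps this from starving: if the scheduler keeps handing back the same node, we eventually accept rather than loop forever.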



> Speculative attempts should not run on the same node
> ----------------------------------------------------
>
>                 Key: MAPREDUCE-7169
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7169
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: yarn
>    Affects Versions: 2.7.2
>            Reporter: Lee chen
>            Assignee: Zhaohui Xin
>            Priority: Major
>         Attachments: image-2018-12-03-09-54-07-859.png
>
>
>           I found that in all versions of YARN, Speculative Execution may place the speculative task attempt on the same node as the original task. What I have read only says that it will try one more task attempt; I haven't seen any place mentioning that it should not be on the same node. This is unreasonable: if the node has problems that make task execution very slow, then placing the speculative task on the same node cannot help the problematic task.
>          In our cluster (version 2.7.2, 2700 nodes), this phenomenon appears almost every day.
>  !image-2018-12-03-09-54-07-859.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org