Posted to mapreduce-issues@hadoop.apache.org by "Allen Wittenauer (JIRA)" <ji...@apache.org> on 2014/07/29 22:54:44 UTC

[jira] [Resolved] (MAPREDUCE-1247) Send out-of-band heartbeat to avoid fake lost tasktracker

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved MAPREDUCE-1247.
-----------------------------------------

    Resolution: Fixed

> Send out-of-band heartbeat to avoid fake lost tasktracker
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-1247
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1247
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: ZhuGuanyin
>            Assignee: ZhuGuanyin
>
> Currently the TaskTracker reports task status to the JobTracker through heartbeats. Sometimes a thread locks the TaskTracker to do cleanup work, such as removing a task's temporary data from disk, and the heartbeat thread hangs for a long time waiting for that lock. The JobTracker then assumes the TaskTracker is lost and reschedules all of its finished maps or unfinished reduces on other TaskTrackers. We call this a "fake lost tasktracker". It is sometimes unacceptable, especially when running large jobs. So we introduce an out-of-band heartbeat mechanism that sends an out-of-band heartbeat in that case.
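
The idea in the description can be illustrated with a small, self-contained Java sketch. This is not the patch attached to the issue and does not use any Hadoop class; the class name OutOfBandHeartbeatSketch, the method heartbeatOnce, and the "FULL" / "OUT_OF_BAND" labels are all hypothetical. It assumes one possible design: the heartbeat thread waits a bounded time for the tracker-wide lock, and if cleanup work still holds it, reports liveness only instead of blocking until the lock is released.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not Hadoop code: models a heartbeat sender that
// falls back to a liveness-only message when the tracker lock is busy.
final class OutOfBandHeartbeatSketch {
    // Stand-in for the TaskTracker-wide lock that cleanup work may hold.
    private final ReentrantLock trackerLock = new ReentrantLock();
    // Records what an imaginary JobTracker would have received.
    private final List<String> sent = Collections.synchronizedList(new ArrayList<>());

    // One heartbeat attempt. Waits at most waitMillis for the lock.
    String heartbeatOnce(long waitMillis) throws InterruptedException {
        String kind;
        if (trackerLock.tryLock(waitMillis, TimeUnit.MILLISECONDS)) {
            try {
                kind = "FULL"; // task statuses could be read safely here
            } finally {
                trackerLock.unlock();
            }
        } else {
            kind = "OUT_OF_BAND"; // liveness only, no task status attached
        }
        sent.add(kind);
        return kind;
    }

    // Runs one heartbeat with the lock free, then one while a simulated
    // cleanup thread holds the lock, and returns what was sent.
    static List<String> demo() {
        OutOfBandHeartbeatSketch tt = new OutOfBandHeartbeatSketch();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread cleanup = new Thread(() -> {
            tt.trackerLock.lock();
            try {
                held.countDown();
                release.await(); // simulate slow disk cleanup
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                tt.trackerLock.unlock();
            }
        });
        try {
            tt.heartbeatOnce(50);  // lock is free
            cleanup.start();
            held.await();          // cleanup now owns the lock
            tt.heartbeatOnce(50);  // lock is busy, so fall back
            release.countDown();
            cleanup.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(tt.sent);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [FULL, OUT_OF_BAND]
    }
}
```

In this sketch the second heartbeat still reaches the receiver while cleanup holds the lock, which is the property the description asks for. How the real TaskTracker detects the blocked state and what the out-of-band message carries are not stated in the issue text above.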



--
This message was sent by Atlassian JIRA
(v6.2#6252)