Posted to common-dev@hadoop.apache.org by "Devaraj Das (JIRA)" <ji...@apache.org> on 2007/05/25 13:04:16 UTC

[jira] Commented: (HADOOP-1431) Map tasks can't timeout for failing to call progress

    [ https://issues.apache.org/jira/browse/HADOOP-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499038 ] 

Devaraj Das commented on HADOOP-1431:
-------------------------------------

By the way, this issue exists for ReduceTasks as well. There, too, we have threads that report progress, so one candidate is the merge, which might get stuck because of faulty user code. Owen, since it has been shown that this issue is not a cause of HADOOP-1374, could we postpone these fixes to 0.14, mostly because we don't have real/reported instances of these problems yet? I am fine either way, though.

> Map tasks can't timeout for failing to call progress
> ----------------------------------------------------
>
>                 Key: HADOOP-1431
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1431
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.13.0
>            Reporter: Owen O'Malley
>         Assigned To: Arun C Murthy
>             Fix For: 0.13.0
>
>
> Currently the map task runner creates a thread that calls progress every second to keep the system from killing the map if the sort takes too long. This is the wrong approach, because it prevents stuck tasks from being killed. The right solution is to have the sort call progress as it actually makes progress. This is part of what is going on in HADOOP-1374, where a map gets stuck at 100% progress but is not done.
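
To make the proposal in the description concrete, here is a minimal sketch of a sort that reports progress only when it has actually completed a unit of work, so that a supervisor can still time out a task that hangs. This is not the Hadoop code: ProgressSketch, ProgressTracker, sortWithProgress, the local Progressable interface, the injected clock, and the timeout value are all assumptions made for illustration.

```java
import java.util.function.LongSupplier;

public class ProgressSketch {
    /** Hypothetical stand-in for the framework's progress callback. */
    interface Progressable {
        void progress();
    }

    /** Remembers when progress was last reported; a supervisor would poll timedOut(). */
    static final class ProgressTracker implements Progressable {
        private final LongSupplier clock;   // injected so the sketch is testable
        private long lastProgressMillis;
        private long calls;

        ProgressTracker(LongSupplier clock) {
            this.clock = clock;
            this.lastProgressMillis = clock.getAsLong();
        }

        @Override
        public synchronized void progress() {
            lastProgressMillis = clock.getAsLong();
            calls++;
        }

        synchronized long calls() {
            return calls;
        }

        /** True if no progress has been reported for longer than timeoutMillis. */
        synchronized boolean timedOut(long timeoutMillis) {
            return clock.getAsLong() - lastProgressMillis > timeoutMillis;
        }
    }

    /** Bottom-up merge sort that reports progress once per completed pass. */
    static void sortWithProgress(int[] data, Progressable reporter) {
        int n = data.length;
        int[] buf = new int[n];
        for (int width = 1; width < n; width *= 2) {
            for (int lo = 0; lo < n; lo += 2 * width) {
                int mid = Math.min(lo + width, n);
                int hi = Math.min(lo + 2 * width, n);
                merge(data, buf, lo, mid, hi);
            }
            System.arraycopy(buf, 0, data, 0, n);
            // Reported by the work itself, not by a timer thread: if a pass
            // hangs, no further calls arrive and the task can be timed out.
            reporter.progress();
        }
    }

    /** Merges the sorted runs src[lo, mid) and src[mid, hi) into dst[lo, hi). */
    private static void merge(int[] src, int[] dst, int lo, int mid, int hi) {
        int i = lo;
        int j = mid;
        for (int k = lo; k < hi; k++) {
            if (i < mid && (j >= hi || src[i] <= src[j])) {
                dst[k] = src[i++];
            } else {
                dst[k] = src[j++];
            }
        }
    }

    public static void main(String[] args) {
        ProgressTracker tracker = new ProgressTracker(System::currentTimeMillis);
        int[] data = {5, 3, 4, 1, 2};
        sortWithProgress(data, tracker);
        // The timeout is an illustrative value, not a Hadoop default.
        System.out.println("progress calls: " + tracker.calls()
                + ", timed out: " + tracker.timedOut(60_000L));
    }
}
```

With a timer thread that calls progress() unconditionally, timedOut() could never become true while the task process is alive, which is the problem described above. Reporting once per pass is only one possible granularity; a real sort would need to report often enough that a long but healthy pass is not mistaken for a hang.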

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.