Posted to common-dev@hadoop.apache.org by "ZhuGuanyin (JIRA)" <ji...@apache.org> on 2009/03/05 07:37:56 UTC

[jira] Commented: (HADOOP-5407) Sometimes, Reduce tasks hang, State is unassigned

    [ https://issues.apache.org/jira/browse/HADOOP-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679091#action_12679091 ] 

ZhuGuanyin commented on HADOOP-5407:
------------------------------------

Today one map task attempt hung with its state UNASSIGNED. The following excerpts show the TaskTracker log in the normal case and in the exception case.

Normal Log:
-------------------
2009-03-05 11:48:40,527 INFO  mapred.TaskTracker (TaskTracker.java:run(314)) - Received KillTaskAction for task: attempt_200903032231_0534_m_000785_0
2009-03-05 11:48:40,527 INFO  mapred.TaskTracker (TaskTracker.java:purgeTask(1392)) - About to purge task: attempt_200903032231_0534_m_000785_0
2009-03-05 11:48:40,528 INFO  mapred.TaskTracker (TaskTracker.java:addFreeSlot(1625)) - addFreeSlot : current free slots : 1
2009-03-05 11:48:40,537 WARN  mapred.TaskTracker (TaskTracker.java:reportTaskFinished(2583)) - Unknown child task finshed: attempt_200903032231_0534_m_000785_0. Ignored.


Exception Log
--------------------------
2009-03-05 11:55:51,600 INFO  mapred.TaskTracker (TaskTracker.java:run(314)) - Received KillTaskAction for task: attempt_200903032231_0541_m_000046_1
2009-03-05 11:55:51,603 INFO  mapred.TaskTracker (TaskTracker.java:purgeTask(1392)) - About to purge task: attempt_200903032231_0541_m_000046_1
2009-03-05 11:55:51,603 INFO  mapred.TaskTracker (TaskTracker.java:reportDone(2022)) - Task attempt_200903032231_0541_m_000046_1 is done.
2009-03-05 11:55:51,604 INFO  mapred.TaskTracker (TaskTracker.java:reportDone(2023)) - reported output size for attempt_200903032231_0541_m_000046_1  was 0
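
The two excerpts can be compared mechanically by pulling out the method name from the "(TaskTracker.java:method(line))" field of each line. The following is a minimal sketch, not part of Hadoop: the names LOCATION, method_sequence, NORMAL_LOG and EXCEPTION_LOG are made up for illustration, and the log text is copied from the excerpts above. It only reports which methods logged in each case. It does not establish a root cause, and the excerpts may not contain every line the TaskTracker wrote.

```python
import re

# Matches the "(TaskTracker.java:method(line))" source-location field of a log line
# and captures the method name. The pattern is an assumption based on the excerpts.
LOCATION = re.compile(r"\(TaskTracker\.java:(\w+)\(\d+\)\)")


def method_sequence(log_text):
    """Return the TaskTracker method names in the order they logged."""
    return LOCATION.findall(log_text)


# Log text copied from the two excerpts in this comment (wrapped lines joined).
NORMAL_LOG = """\
2009-03-05 11:48:40,527 INFO  mapred.TaskTracker (TaskTracker.java:run(314)) - Received KillTaskAction for task: attempt_200903032231_0534_m_000785_0
2009-03-05 11:48:40,527 INFO  mapred.TaskTracker (TaskTracker.java:purgeTask(1392)) - About to purge task: attempt_200903032231_0534_m_000785_0
2009-03-05 11:48:40,528 INFO  mapred.TaskTracker (TaskTracker.java:addFreeSlot(1625)) - addFreeSlot : current free slots : 1
2009-03-05 11:48:40,537 WARN  mapred.TaskTracker (TaskTracker.java:reportTaskFinished(2583)) - Unknown child task finshed: attempt_200903032231_0534_m_000785_0. Ignored.
"""

EXCEPTION_LOG = """\
2009-03-05 11:55:51,600 INFO  mapred.TaskTracker (TaskTracker.java:run(314)) - Received KillTaskAction for task: attempt_200903032231_0541_m_000046_1
2009-03-05 11:55:51,603 INFO  mapred.TaskTracker (TaskTracker.java:purgeTask(1392)) - About to purge task: attempt_200903032231_0541_m_000046_1
2009-03-05 11:55:51,603 INFO  mapred.TaskTracker (TaskTracker.java:reportDone(2022)) - Task attempt_200903032231_0541_m_000046_1 is done.
2009-03-05 11:55:51,604 INFO  mapred.TaskTracker (TaskTracker.java:reportDone(2023)) - reported output size for attempt_200903032231_0541_m_000046_1  was 0
"""

normal = method_sequence(NORMAL_LOG)
exception = method_sequence(EXCEPTION_LOG)

# Methods that logged in one excerpt but not in the other, in first-seen order.
only_in_normal = [m for m in normal if m not in exception]
only_in_exception = list(dict.fromkeys(m for m in exception if m not in normal))

print(only_in_normal)     # ['addFreeSlot', 'reportTaskFinished']
print(only_in_exception)  # ['reportDone']
```

In these excerpts both cases log run and purgeTask, but only the normal case logs addFreeSlot, while the exception case logs reportDone instead.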

> Sometimes, Reduce tasks hang, State is unassigned
> -------------------------------------------------
>
>                 Key: HADOOP-5407
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5407
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: ZhuGuanyin
>
> Hi all,
> After our cluster has been running for a long time, some reduce tasks on some tasktrackers hang with state UNASSIGNED. After that, every reduce task on those tasktrackers hangs.
> If we kill a hung reduce task, the attempt is rescheduled to the same tasktracker and continues to hang. If we fail it instead, it goes to another tasktracker and runs successfully.
> A tasktracker that has a hung reduce task still receives new reduce tasks, but they hang forever.
> After we reboot the tasktracker machine, reduce tasks no longer hang.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.