Posted to common-dev@hadoop.apache.org by "Koji Noguchi (JIRA)" <ji...@apache.org> on 2007/09/29 09:09:50 UTC

[jira] Updated: (HADOOP-1970) tasktracker hang in reduce. Deadlock between main and comm thread

     [ https://issues.apache.org/jira/browse/HADOOP-1970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated HADOOP-1970:
---------------------------------

    Description: 
Saw one reduce task stuck in the copy phase.
Running jstack on the reduce task (task_200709272248_0001_r_000150_0) process showed:

{noformat} 
Found one Java-level deadlock:
=============================
"Comm thread for task_200709272248_0001_r_000150_0":
  waiting to lock monitor 0x08144020 (object 0xd4e30aa8, a org.apache.hadoop.util.Progress),
  which is held by "main"
"main":
  waiting to lock monitor 0x08144084 (object 0xd4e30958, a org.apache.hadoop.util.Progress),
  which is held by "Comm thread for task_200709272248_0001_r_000150_0"

Java stack information for the threads listed above:
===================================================
"Comm thread for task_200709272248_0001_r_000150_0":
        at org.apache.hadoop.util.Progress.toString(Progress.java:113)
        - waiting to lock <0xd4e30aa8> (a org.apache.hadoop.util.Progress)
        at org.apache.hadoop.util.Progress.toString(Progress.java:116)
        - locked <0xd4e30958> (a org.apache.hadoop.util.Progress)
        at org.apache.hadoop.util.Progress.toString(Progress.java:108)
        at org.apache.hadoop.mapred.Task$1.run(Task.java:268)
        at java.lang.Thread.run(Thread.java:619)
"main":
        at org.apache.hadoop.util.Progress.startNextPhase(Progress.java:58)
        - waiting to lock <0xd4e30958> (a org.apache.hadoop.util.Progress)
        at org.apache.hadoop.util.Progress.complete(Progress.java:70)
        - locked <0xd4e30aa8> (a org.apache.hadoop.util.Progress)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:253)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1777)
{noformat} 

  was:
Saw one reduce task stuck in the copy phase.
Running jstack on the reduce task (task_200709272248_0001_r_000150_0) process showed:

Found one Java-level deadlock:
=============================
"Comm thread for task_200709272248_0001_r_000150_0":
  waiting to lock monitor 0x08144020 (object 0xd4e30aa8, a org.apache.hadoop.util.Progress),
  which is held by "main"
"main":
  waiting to lock monitor 0x08144084 (object 0xd4e30958, a org.apache.hadoop.util.Progress),
  which is held by "Comm thread for task_200709272248_0001_r_000150_0"

Java stack information for the threads listed above:
===================================================
"Comm thread for task_200709272248_0001_r_000150_0":
        at org.apache.hadoop.util.Progress.toString(Progress.java:113)
        - waiting to lock <0xd4e30aa8> (a org.apache.hadoop.util.Progress)
        at org.apache.hadoop.util.Progress.toString(Progress.java:116)
        - locked <0xd4e30958> (a org.apache.hadoop.util.Progress)
        at org.apache.hadoop.util.Progress.toString(Progress.java:108)
        at org.apache.hadoop.mapred.Task$1.run(Task.java:268)
        at java.lang.Thread.run(Thread.java:619)
"main":
        at org.apache.hadoop.util.Progress.startNextPhase(Progress.java:58)
        - waiting to lock <0xd4e30958> (a org.apache.hadoop.util.Progress)
        at org.apache.hadoop.util.Progress.complete(Progress.java:70)
        - locked <0xd4e30aa8> (a org.apache.hadoop.util.Progress)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:253)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1777)



> tasktracker hang in reduce. Deadlock between main and comm thread
> -----------------------------------------------------------------
>
>                 Key: HADOOP-1970
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1970
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.14.1
>            Reporter: Koji Noguchi
>            Priority: Blocker
>             Fix For: 0.14.2
>
>
> Saw one reduce task stuck in the copy phase.
> Running jstack on the reduce task (task_200709272248_0001_r_000150_0) process showed:
> {noformat} 
> Found one Java-level deadlock:
> =============================
> "Comm thread for task_200709272248_0001_r_000150_0":
>   waiting to lock monitor 0x08144020 (object 0xd4e30aa8, a org.apache.hadoop.util.Progress),
>   which is held by "main"
> "main":
>   waiting to lock monitor 0x08144084 (object 0xd4e30958, a org.apache.hadoop.util.Progress),
>   which is held by "Comm thread for task_200709272248_0001_r_000150_0"
> Java stack information for the threads listed above:
> ===================================================
> "Comm thread for task_200709272248_0001_r_000150_0":
>         at org.apache.hadoop.util.Progress.toString(Progress.java:113)
>         - waiting to lock <0xd4e30aa8> (a org.apache.hadoop.util.Progress)
>         at org.apache.hadoop.util.Progress.toString(Progress.java:116)
>         - locked <0xd4e30958> (a org.apache.hadoop.util.Progress)
>         at org.apache.hadoop.util.Progress.toString(Progress.java:108)
>         at org.apache.hadoop.mapred.Task$1.run(Task.java:268)
>         at java.lang.Thread.run(Thread.java:619)
> "main":
>         at org.apache.hadoop.util.Progress.startNextPhase(Progress.java:58)
>         - waiting to lock <0xd4e30958> (a org.apache.hadoop.util.Progress)
>         at org.apache.hadoop.util.Progress.complete(Progress.java:70)
>         - locked <0xd4e30aa8> (a org.apache.hadoop.util.Progress)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:253)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1777)
> {noformat} 
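
Reading the trace above: the comm thread holds the monitor on 0xd4e30958 (taken in Progress.toString) and wants 0xd4e30aa8, while main holds 0xd4e30aa8 (taken in Progress.complete) and wants 0xd4e30958 (in Progress.startNextPhase). That is a classic lock-order inversion between two objects. The sketch below is not the Hadoop Progress code and is not necessarily how HADOOP-1970 was fixed. It is a minimal, self-contained illustration of the same pattern, assuming two plain objects named parent and child (hypothetical stand-ins for the two Progress instances) and a hypothetical class LockOrderSketch. It forces the inversion with a latch, confirms the deadlock through ThreadMXBean, and then shows that taking the two locks in one consistent order completes.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderSketch {

    // Takes 'first', waits until the peer thread also holds its own first lock,
    // then tries to take 'second'.
    static Thread locker(String name, Object first, Object second, CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Not reached when the two threads use opposite orders.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit even though the threads stay blocked
        return t;
    }

    static boolean isIn(long[] ids, long id) {
        if (ids == null) return false;
        for (long x : ids) if (x == id) return true;
        return false;
    }

    // Comm-like thread locks parent then child (as toString appears to);
    // main-like thread locks child then parent (as complete -> startNextPhase appears to).
    static boolean opposingOrderDeadlocks() {
        Object parent = new Object();
        Object child = new Object();
        CountDownLatch latch = new CountDownLatch(2);
        Thread comm = locker("comm-like", parent, child, latch);
        Thread main = locker("main-like", child, parent, latch);
        comm.start();
        main.start();
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        try {
            for (int i = 0; i < 100; i++) { // poll for up to about 5 seconds
                long[] ids = mx.findMonitorDeadlockedThreads();
                if (isIn(ids, comm.getId()) && isIn(ids, main.getId())) return true;
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    // Both threads take parent before child, so no cycle can form.
    static boolean sameOrderCompletes() {
        final Object parent = new Object();
        final Object child = new Object();
        final int rounds = 10_000;
        final int[] counter = {0};
        Runnable work = () -> {
            for (int i = 0; i < rounds; i++) {
                synchronized (parent) {
                    synchronized (child) {
                        counter[0]++; // guarded by both monitors
                    }
                }
            }
        };
        Thread a = new Thread(work, "a");
        Thread b = new Thread(work, "b");
        a.setDaemon(true);
        b.setDaemon(true);
        a.start();
        b.start();
        try {
            a.join(10_000);
            b.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !a.isAlive() && !b.isAlive() && counter[0] == 2 * rounds;
    }

    public static void main(String[] args) {
        System.out.println("opposing order deadlocks: " + opposingOrderDeadlocks());
        System.out.println("same order completes:     " + sameOrderCompletes());
    }
}
```

A consistent lock order is only one way out of this kind of cycle. Another common option is to avoid calling into a second synchronized object while still holding the first monitor, for example by copying the needed state under the lock and making the outward call after releasing it.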
