Posted to common-dev@hadoop.apache.org by "Hua xu (JIRA)" <ji...@apache.org> on 2012/04/21 09:30:33 UTC

[jira] [Created] (HADOOP-8300) The TaskMemoryManager thread is not interrupted when the TaskTracker is ordered to reinit by the JobTracker

Hua xu created HADOOP-8300:
------------------------------

             Summary: The TaskMemoryManager thread is not interrupted when the TaskTracker is ordered to reinit by the JobTracker
                 Key: HADOOP-8300
                 URL: https://issues.apache.org/jira/browse/HADOOP-8300
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 0.20.2
            Reporter: Hua xu


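When the JobTracker orders the TaskTracker to reinit, the TaskTracker interrupts some of its threads and then starts them again. It does not interrupt the TaskMemoryManager thread, however, yet it still creates a new TaskMemoryManager thread, so the old thread keeps running alongside the new one.
Running jstack against the TaskTracker shows four TaskMemoryManagerThread instances alive at the same time: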

Full thread dump Java HotSpot(TM) Server VM (1.6.0-b105 mixed mode):

"IPC Client (47) connection to /*:8012 from xhh" daemon prio=10 tid=0x083d2c00 nid=0x25b4 in Object.wait() [0x70852000..0x70852ea0]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xadbdb7c0> (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:414)
        - locked <0xadbdb7c0> (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:456)

"TaskLauncher for task" daemon prio=10 tid=0x082f2c00 nid=0x25b3 in Object.wait() [0x6fc63000..0x6fc63f20]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xadbcff48> (a java.util.LinkedList)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1608)
        - locked <0xadbcff48> (a java.util.LinkedList)

"TaskLauncher for task" daemon prio=10 tid=0x082f2000 nid=0x25b2 in Object.wait() [0x708f4000..0x708f4fa0]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xadbcfd00> (a java.util.LinkedList)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1608)
        - locked <0xadbcfd00> (a java.util.LinkedList)

"org.apache.hadoop.mapred.TaskMemoryManagerThread" daemon prio=10 tid=0x082f1400 nid=0x25b1 waiting on condition [0x6fb47000..0x6fb48020]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.mapred.TaskMemoryManagerThread.run(TaskMemoryManagerThread.java:241)

"Map-events fetcher on tracker_*:*/*:50070" daemon prio=10 tid=0x082f1000 nid=0x25b0 in Object.wait() [0x708a3000..0x708a40a0]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xaf1227b0> (a java.util.TreeMap)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.run(TaskTracker.java:592)
        - locked <0xaf1227b0> (a java.util.TreeMap)

"IPC Client (47) connection to /* from xhh" daemon prio=10 tid=0x082f9000 nid=0x25af in Object.wait() [0x6fc12000..0x6fc13120]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xadb36b80> (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:414)
        - locked <0xadb36b80> (a org.apache.hadoop.ipc.Client$Connection)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:456)

"IPC Server handler[3] on 50070" daemon prio=10 tid=0x083a3400 nid=0x25ae waiting on condition [0x6fd0b000..0x6fd0bda0]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xaf121468> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1889)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:963)

"IPC Server handler[2] on 50070" daemon prio=10 tid=0x0847b400 nid=0x25ad waiting on condition [0x6fcba000..0x6fcbae20]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xaf121468> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1889)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:963)

"IPC Server handler[1] on 50070" daemon prio=10 tid=0x08392c00 nid=0x25ac waiting on condition [0x6fdfe000..0x6fdfeea0]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xaf121468> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1889)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:963)

"IPC Server handler[0] on 50070" daemon prio=10 tid=0x0846f400 nid=0x25ab waiting on condition [0x6fdad000..0x6fdadf20]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xaf121468> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1889)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:963)

"IPC Server listener on 50070" daemon prio=10 tid=0x08345400 nid=0x25aa runnable [0x7017f000..0x7017ffa0]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:184)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
        - locked <0xaf121950> (a sun.nio.ch.Util$1)
        - locked <0xaf121940> (a java.util.Collections$UnmodifiableSet)
        - locked <0xaf121740> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
        at org.apache.hadoop.ipc.Server$Listener.run(Server.java:314)

"IPC Server Responder" daemon prio=10 tid=0x0836b000 nid=0x25a9 runnable [0x6fd5c000..0x6fd5d020]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:184)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
        - locked <0xaf121f80> (a sun.nio.ch.Util$1)
        - locked <0xaf121f70> (a java.util.Collections$UnmodifiableSet)
        - locked <0xaf121d88> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
        at org.apache.hadoop.ipc.Server$Responder.run(Server.java:483)

"org.apache.hadoop.mapred.TaskMemoryManagerThread" daemon prio=10 tid=0x08458000 nid=0x259d waiting on condition [0x6fa03000..0x6fa03f20]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.mapred.TaskMemoryManagerThread.run(TaskMemoryManagerThread.java:241)

"Attach Listener" daemon prio=10 tid=0x0845a800 nid=0x2593 runnable [0x00000000..0x00000000]
   java.lang.Thread.State: RUNNABLE

"org.apache.hadoop.mapred.TaskMemoryManagerThread" daemon prio=10 tid=0x0836f400 nid=0x257c waiting on condition [0x6faf6000..0x6faf6ea0]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.mapred.TaskMemoryManagerThread.run(TaskMemoryManagerThread.java:241)

"Directory/File cleanup thread" daemon prio=10 tid=0x0845d000 nid=0x256b waiting on condition [0x6fa54000..0x6fa550a0]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xaf10a1d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1889)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
        at org.apache.hadoop.mapred.CleanupQueue$PathCleanupThread.run(CleanupQueue.java:88)

"taskCleanup" daemon prio=10 tid=0x0845bc00 nid=0x256a waiting on condition [0x6faa5000..0x6faa6120]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0xaf10a088> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1889)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
        at org.apache.hadoop.mapred.TaskTracker$1.run(TaskTracker.java:310)
        at java.lang.Thread.run(Thread.java:619)

"org.apache.hadoop.mapred.TaskMemoryManagerThread" daemon prio=10 tid=0x083c5000 nid=0x2567 waiting on condition [0x6fb98000..0x6fb98ea0]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.mapred.TaskMemoryManagerThread.run(TaskMemoryManagerThread.java:241)

"Timer-0" daemon prio=10 tid=0x702c7800 nid=0x255a in Object.wait() [0x70390000..0x70390f20]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xaf10a3a8> (a java.util.TaskQueue)
        at java.util.TimerThread.mainLoop(Timer.java:509)
        - locked <0xaf10a3a8> (a java.util.TaskQueue)
        at java.util.TimerThread.run(Timer.java:462)

"14871751@qtp0-0 - Acceptor0 SelectChannelConnector@*3:50060" prio=10 tid=0x082be400 nid=0x2559 runnable [0x701fe000..0x701fefa0]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:184)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
        - locked <0xaf119da0> (a sun.nio.ch.Util$1)
        - locked <0xaf119db0> (a java.util.Collections$UnmodifiableSet)
        - locked <0xaf119d60> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
        at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:429)
        at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:185)
        at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
        at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:707)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)

"Low Memory Detector" daemon prio=10 tid=0x08187400 nid=0x254a runnable [0x00000000..0x00000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x08185c00 nid=0x2549 waiting on condition [0x00000000..0x70a474b8]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x08184800 nid=0x2548 waiting on condition [0x00000000..0x70ac8538]
   java.lang.Thread.State: RUNNABLE

"JDWP Command Reader" daemon prio=10 tid=0x08177400 nid=0x2545 runnable [0x00000000..0x00000000]
   java.lang.Thread.State: RUNNABLE

"JDWP Event Helper Thread" daemon prio=10 tid=0x08175c00 nid=0x2542 runnable [0x00000000..0x00000000]
   java.lang.Thread.State: RUNNABLE

"JDWP Transport Listener: dt_socket" daemon prio=10 tid=0x08173800 nid=0x2541 runnable [0x00000000..0x70bbbe70]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x0816bc00 nid=0x253f runnable [0x00000000..0x70c10a80]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x08152800 nid=0x253e in Object.wait() [0x70c6f000..0x70c6fe20]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xaf11a950> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
        - locked <0xaf11a950> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x08151c00 nid=0x253d in Object.wait() [0x70cc0000..0x70cc0ea0]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xaf11a970> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:485)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
        - locked <0xaf11a970> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x0806b400 nid=0x2536 at breakpoint[0xb7fdf000..0xb7fe01f8]
   java.lang.Thread.State: RUNNABLE
        at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1096)
        at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1727)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3093)

"VM Thread" prio=10 tid=0x0814f400 nid=0x253c runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x08072000 nid=0x2538 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x08073000 nid=0x2539 runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x08074000 nid=0x253a runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x08075400 nid=0x253b runnable

"VM Periodic Task Thread" prio=10 tid=0x08188c00 nid=0x254b waiting on condition

JNI global references: 3317
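
The four TaskMemoryManagerThread entries in the dump above are consistent with a monitor thread that is started on every reinit but never interrupted. The following is a minimal sketch of that pattern and of one possible fix, not the actual TaskTracker code: the names ReinitSketch, MonitorThread, reinitLeaky and reinitWithInterrupt are hypothetical, and the 50 ms sleep is an arbitrary stand-in for the real monitoring interval.

```java
import java.util.ArrayList;
import java.util.List;

public class ReinitSketch {

    /** Hypothetical stand-in for a periodic monitor such as TaskMemoryManagerThread. */
    static class MonitorThread extends Thread {
        MonitorThread() {
            setName("monitor-sketch");
            setDaemon(true);
        }

        @Override
        public void run() {
            while (!isInterrupted()) {
                try {
                    Thread.sleep(50L); // assumed interval, for illustration only
                } catch (InterruptedException e) {
                    return; // interrupted: exit instead of sleeping again
                }
            }
        }
    }

    private MonitorThread monitor;
    private final List<MonitorThread> started = new ArrayList<>();

    private void startMonitor() {
        monitor = new MonitorThread();
        started.add(monitor);
        monitor.start();
    }

    /** Pattern described in this report: a new thread is started, the old one is left running. */
    void reinitLeaky() {
        startMonitor();
    }

    /** One possible fix: interrupt the old thread and wait for it before starting a new one. */
    void reinitWithInterrupt() {
        if (monitor != null) {
            monitor.interrupt();
            try {
                monitor.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the caller's interrupt status
                return;
            }
        }
        startMonitor();
    }

    /** Number of monitor threads created by this instance that are still alive. */
    int liveMonitors() {
        int live = 0;
        for (MonitorThread t : started) {
            if (t.isAlive()) {
                live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        ReinitSketch leaky = new ReinitSketch();
        ReinitSketch fixed = new ReinitSketch();
        for (int i = 0; i < 4; i++) {
            leaky.reinitLeaky();
            fixed.reinitWithInterrupt();
        }
        System.out.println("leaky live monitors: " + leaky.liveMonitors());
        System.out.println("fixed live monitors: " + fixed.liveMonitors());
    }
}
```

After four reinit calls, the leaky variant still has every monitor thread alive, while the variant that interrupts and joins the old thread has only the most recent one. Whether the real fix should also join the old thread, or only interrupt it, depends on how the TaskTracker shuts down its other threads.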

