Posted to yarn-issues@hadoop.apache.org by "Rohith (JIRA)" <ji...@apache.org> on 2014/12/03 15:18:12 UTC

[jira] [Commented] (YARN-2917) RM get hanged if fail to store NodeLabels into store.

    [ https://issues.apache.org/jira/browse/YARN-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233023#comment-14233023 ] 

Rohith commented on YARN-2917:
------------------------------

Attaching the thread dump taken when the RM hung:
{code}
"Thread-1" prio=10 tid=0x00000000006e1000 nid=0x55a4 in Object.wait() [0x00007f2ce9493000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000000f26b0d48> (a java.lang.Object)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:141)
	- locked <0x00000000f26b0d48> (a java.lang.Object)
	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
	- locked <0x00000000f26b0aa8> (a java.lang.Object)
	at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.stopDispatcher(CommonNodeLabelsManager.java:232)
	at org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.serviceStop(CommonNodeLabelsManager.java:238)
	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
	- locked <0x00000000f26b0968> (a java.lang.Object)
	at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
	at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
	at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157)
	at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStop(ResourceManager.java:599)
	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
	- locked <0x00000000f2842458> (a java.lang.Object)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.stopActiveServices(ResourceManager.java:1002)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToStandby(ResourceManager.java:1057)
	- locked <0x00000000c0c96c98> (a org.apache.hadoop.yarn.server.resourcemanager.ResourceManager)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1104)
	at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
	- locked <0x00000000c0cab280> (a java.lang.Object)
	at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
	at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
	at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:65)
	at org.apache.hadoop.service.CompositeService$CompositeServiceShutdownHook.run(CompositeService.java:183)
	at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

"AsyncDispatcher event handler" daemon prio=10 tid=0x00007f2cf0b81000 nid=0x54a1 in Object.wait() [0x00007f2cf7bfa000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000000c01b83e8> (a org.apache.hadoop.util.ShutdownHookManager$1)
	at java.lang.Thread.join(Thread.java:1281)
	- locked <0x00000000c01b83e8> (a org.apache.hadoop.util.ShutdownHookManager$1)
	at java.lang.Thread.join(Thread.java:1355)
	at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
	at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
	at java.lang.Shutdown.runHooks(Shutdown.java:123)
	at java.lang.Shutdown.sequence(Shutdown.java:167)
	at java.lang.Shutdown.exit(Shutdown.java:212)
	- locked <0x00000000c04ae9c0> (a java.lang.Class for java.lang.Shutdown)
	at java.lang.Runtime.exit(Runtime.java:109)
	at java.lang.System.exit(System.java:962)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:185)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
	at java.lang.Thread.run(Thread.java:745)
{code}
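
Reading the dump, the two threads appear to be waiting on each other: the "AsyncDispatcher event handler" thread called System.exit() from dispatch() and is joining the shutdown hook, while the shutdown hook ("Thread-1") is inside AsyncDispatcher.serviceStop() waiting for the dispatcher to drain. Below is a minimal sketch of that pattern, not the actual Hadoop code. The class and method names are hypothetical, the real System.exit() is replaced by a plain Thread.join(), and the hook's wait is bounded by a timeout so that the demo terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-alone model of the two threads in the dump above.
public class DispatcherExitDeadlockSketch {

    /**
     * Returns true if the simulated shutdown hook gave up waiting for the
     * drain, i.e. the mutual wait seen in the dump was reproduced.
     */
    static boolean reproducesMutualWait(long hookWaitMillis) {
        // Counted down only once the handler thread has finished its event.
        CountDownLatch drained = new CountDownLatch(1);
        AtomicBoolean hookTimedOut = new AtomicBoolean(false);

        // Stand-in for the shutdown hook: stopping the dispatcher means
        // waiting for the drain (bounded here; unbounded in the dump).
        Thread hook = new Thread(() -> {
            try {
                boolean ok = drained.await(hookWaitMillis, TimeUnit.MILLISECONDS);
                hookTimedOut.set(!ok);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "shutdown-hook");

        // Stand-in for the event handler thread: the event fails, the thread
        // "exits", and exiting blocks until the hook has finished.
        Thread handler = new Thread(() -> {
            hook.start();
            try {
                hook.join(); // models System.exit() joining the shutdown hooks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            drained.countDown(); // reached only after the hook is done
        }, "event-handler");

        handler.start();
        try {
            handler.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return hookTimedOut.get();
    }

    public static void main(String[] args) {
        System.out.println("mutual wait reproduced: " + reproducesMutualWait(200));
    }
}
```

With the timeout removed from the hook's wait, neither thread in this sketch can make progress, which matches the repeated "Waiting for AsyncDispatcher to drain." message.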

> RM get hanged if fail to store NodeLabels into store.
> -----------------------------------------------------
>
>                 Key: YARN-2917
>                 URL: https://issues.apache.org/jira/browse/YARN-2917
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: Rohith
>            Assignee: Rohith
>            Priority: Critical
>
> I encountered a scenario where the RM hung while shutting down and kept logging {{2014-12-03 19:32:44,283 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Waiting for AsyncDispatcher to drain.}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)