Posted to yarn-dev@hadoop.apache.org by "Rohith Sharma K S (JIRA)" <ji...@apache.org> on 2017/06/07 06:22:18 UTC

[jira] [Created] (YARN-6695) Race condition in RM for publishing container events vs appFinished events causes NPE

Rohith Sharma K S created YARN-6695:
---------------------------------------

             Summary: Race condition in RM for publishing container events vs appFinished events causes NPE 
                 Key: YARN-6695
                 URL: https://issues.apache.org/jira/browse/YARN-6695
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Rohith Sharma K S


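When the RM publishes container events, i.e. when *yarn.rm.system-metrics-publisher.emit-container-events* is enabled, there is a race condition between processing the container events and the appFinished event, which removes the appId from the collector list. If the appId is removed before the pending container events are processed, publishing them causes an NPE.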

In the trace below, the appId is removed from the collectors first, and the corresponding container events are processed only afterwards.
{noformat}
2017-06-06 19:28:48,896 INFO  capacity.ParentQueue (ParentQueue.java:removeApplication(472)) - Application removed - appId: application_1496758895643_0005 user: root leaf-queue of parent: root #applications: 0
2017-06-06 19:28:48,921 INFO  collector.TimelineCollectorManager (TimelineCollectorManager.java:remove(190)) - The collector service for application_1496758895643_0005 was removed
2017-06-06 19:28:48,922 ERROR metrics.TimelineServiceV2Publisher (TimelineServiceV2Publisher.java:putEntity(451)) - Error when publishing entity TimelineEntity[type='YARN_CONTAINER', id='container_e01_1496758895643_0005_01_000002']
java.lang.NullPointerException
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher.putEntity(TimelineServiceV2Publisher.java:448)
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher.access$100(TimelineServiceV2Publisher.java:72)
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher$TimelineV2EventHandler.handle(TimelineServiceV2Publisher.java:480)
	at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher$TimelineV2EventHandler.handle(TimelineServiceV2Publisher.java:469)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:201)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:127)
	at java.lang.Thread.run(Thread.java:745)
{noformat}
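
The ordering in the trace can be replayed with a minimal, self-contained sketch. Everything in it is hypothetical: Collector, CollectorManager, publishUnguarded and publishGuarded are stand-ins written for illustration and are not the real YARN classes or methods. The sketch assumes the NPE comes from dereferencing the result of a collector lookup that returns null once the appId has been removed. The guarded variant is only one possible mitigation, not necessarily the fix adopted for this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration only; NOT the real YARN classes.
class CollectorRaceSketch {

    /** Stand-in for a per-application collector. */
    static class Collector {
        final List<String> entities = new ArrayList<>();

        void putEntity(String entityId) {
            entities.add(entityId);
        }
    }

    /** Stand-in for a collector manager keyed by appId. */
    static class CollectorManager {
        private final Map<String, Collector> collectors = new ConcurrentHashMap<>();

        void putIfAbsent(String appId) {
            collectors.putIfAbsent(appId, new Collector());
        }

        void remove(String appId) {
            collectors.remove(appId);
        }

        /** Returns null once the appId has been removed. */
        Collector get(String appId) {
            return collectors.get(appId);
        }
    }

    /** Unguarded publish: throws NullPointerException if the collector is gone. */
    static void publishUnguarded(CollectorManager mgr, String appId, String entityId) {
        mgr.get(appId).putEntity(entityId);
    }

    /** Guarded publish: returns false and drops the entity instead of throwing. */
    static boolean publishGuarded(CollectorManager mgr, String appId, String entityId) {
        Collector collector = mgr.get(appId);
        if (collector == null) {
            return false; // collector already removed by the appFinished path
        }
        collector.putEntity(entityId);
        return true;
    }

    public static void main(String[] args) {
        CollectorManager mgr = new CollectorManager();
        String appId = "application_1496758895643_0005";
        String containerId = "container_e01_1496758895643_0005_01_000002";

        mgr.putIfAbsent(appId);
        // Same order as in the trace: the collector is removed first ...
        mgr.remove(appId);

        // ... and the container event is processed afterwards.
        try {
            publishUnguarded(mgr, appId, containerId);
        } catch (NullPointerException e) {
            System.out.println("unguarded publish failed with NPE");
        }
        System.out.println("guarded publish succeeded: "
                + publishGuarded(mgr, appId, containerId));
    }
}
```

Note that the guard only avoids the exception. The late container event is still dropped, so keeping the collector alive until its pending events are drained would be a different design choice.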



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
