Posted to yarn-issues@hadoop.apache.org by "Akhil PB (JIRA)" <ji...@apache.org> on 2018/08/07 06:37:00 UTC
[jira] [Commented] (YARN-6695) Race condition in RM for publishing
container events vs appFinished events causes NPE
[ https://issues.apache.org/jira/browse/YARN-6695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16571174#comment-16571174 ]
Akhil PB commented on YARN-6695:
--------------------------------
An NPE was thrown while stopping a service, which caused the RM to shut down.
[~rohithsharma] [~sunilg] [~vrushalic]
{code:java}
2018-08-07 11:36:02,774 ERROR org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher: Error when publishing entity TimelineEntity[type='YARN_APPLICATION', id='application_1533536393859_0003']
2018-08-07 11:36:02,833 INFO org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager: The collector service for application_1533536393859_0003 was removed
2018-08-07 11:36:02,858 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher.putEntity(TimelineServiceV2Publisher.java:459)
at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher.access$100(TimelineServiceV2Publisher.java:73)
at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher$TimelineV2EventHandler.handle(TimelineServiceV2Publisher.java:494)
at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher$TimelineV2EventHandler.handle(TimelineServiceV2Publisher.java:483)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:748)
{code}
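The ordering in the log above (collector removed, then a late event dispatched) can be reproduced in isolation. The following is only a minimal sketch of that failure mode and of one possible null guard. It is not the actual TimelineServiceV2Publisher code or the patch for this issue, and every name in it (CollectorRaceSketch, Collector, putEntityUnguarded, putEntityGuarded) is a hypothetical stand-in.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the publisher/collector interaction.
// None of these names are real YARN classes or methods.
public class CollectorRaceSketch {

    /** Hypothetical per-application collector that just counts entities. */
    static final class Collector {
        private int published;

        void putEntity(String entityId) {
            published++;
        }

        int publishedCount() {
            return published;
        }
    }

    // appId -> collector; the entry is removed when the application finishes.
    static final Map<String, Collector> collectors = new ConcurrentHashMap<>();

    /** Unguarded lookup: throws NullPointerException if the collector is gone. */
    static void putEntityUnguarded(String appId, String entityId) {
        collectors.get(appId).putEntity(entityId);
    }

    /** Guarded lookup: drops the event and returns false if the collector is gone. */
    static boolean putEntityGuarded(String appId, String entityId) {
        Collector collector = collectors.get(appId);
        if (collector == null) {
            return false; // late event for an app whose collector was already removed
        }
        collector.putEntity(entityId);
        return true;
    }

    public static void main(String[] args) {
        String appId = "application_1533536393859_0003";
        collectors.put(appId, new Collector());

        // The appFinished path wins the race and removes the collector first.
        collectors.remove(appId);

        // A queued event for the same app is then dispatched.
        try {
            putEntityUnguarded(appId, "late-event");
        } catch (NullPointerException e) {
            System.out.println("unguarded: NullPointerException");
        }
        System.out.println("guarded published: " + putEntityGuarded(appId, "late-event"));
    }
}
```

In the sketch the guarded variant silently drops the late event. Whether dropping, logging, or reordering the events is the right behaviour for the RM is a separate design question that this example does not try to answer.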
> Race condition in RM for publishing container events vs appFinished events causes NPE
> --------------------------------------------------------------------------------------
>
> Key: YARN-6695
> URL: https://issues.apache.org/jira/browse/YARN-6695
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Rohith Sharma K S
> Priority: Major
>
> When the RM publishes container events, i.e. when *yarn.rm.system-metrics-publisher.emit-container-events* is enabled, there is a race between processing those events
> and the appFinished event, which removes the appId from the collector list. Events processed after the removal cause an NPE.
> See the trace below, where the appId is removed from the collectors first and the corresponding events are processed afterwards.
> {noformat}
> 2017-06-06 19:28:48,896 INFO capacity.ParentQueue (ParentQueue.java:removeApplication(472)) - Application removed - appId: application_1496758895643_0005 user: root leaf-queue of parent: root #applications: 0
> 2017-06-06 19:28:48,921 INFO collector.TimelineCollectorManager (TimelineCollectorManager.java:remove(190)) - The collector service for application_1496758895643_0005 was removed
> 2017-06-06 19:28:48,922 ERROR metrics.TimelineServiceV2Publisher (TimelineServiceV2Publisher.java:putEntity(451)) - Error when publishing entity TimelineEntity[type='YARN_CONTAINER', id='container_e01_1496758895643_0005_01_000002']
> java.lang.NullPointerException
> at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher.putEntity(TimelineServiceV2Publisher.java:448)
> at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher.access$100(TimelineServiceV2Publisher.java:72)
> at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher$TimelineV2EventHandler.handle(TimelineServiceV2Publisher.java:480)
> at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher$TimelineV2EventHandler.handle(TimelineServiceV2Publisher.java:469)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:201)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:127)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)