Posted to yarn-dev@hadoop.apache.org by "Tarun Parimi (Jira)" <ji...@apache.org> on 2020/04/20 10:48:00 UTC
[jira] [Created] (YARN-10240) Prevent Fatal CancelledException in
TimelineV2Client when stopping
Tarun Parimi created YARN-10240:
-----------------------------------
Summary: Prevent Fatal CancelledException in TimelineV2Client when stopping
Key: YARN-10240
URL: https://issues.apache.org/jira/browse/YARN-10240
Project: Hadoop YARN
Issue Type: Bug
Components: ATSv2
Reporter: Tarun Parimi
When the timeline client is stopped, it waits for a drain timeout and then cancels all sync EntitiesHolders still left in the queue.
{code:java}
// if some entities were not drained then we need interrupt
// the threads which had put sync EntityHolders to the queue.
EntitiesHolder nextEntityInTheQueue = null;
while ((nextEntityInTheQueue =
    timelineEntityQueue.poll()) != null) {
  nextEntityInTheQueue.cancel(true);
}
{code}
The sync path, however, handles only ExecutionException and InterruptedException when it waits on the holder.
{code:java}
if (sync) {
  // In sync call we need to wait till its published and if any error then
  // throw it back
  try {
    entitiesHolder.get();
  } catch (ExecutionException e) {
    throw new YarnException("Failed while publishing entity",
        e.getCause());
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    throw new YarnException("Interrupted while publishing entity", e);
  }
}
{code}
But calling nextEntityInTheQueue.cancel(true) causes entitiesHolder.get() to throw a CancellationException, which is unchecked and is not handled here. It reaches the dispatcher thread and can result in a FATAL error in the NM. We need to prevent this.
{code:java}
FATAL event.AsyncDispatcher (AsyncDispatcher.java:dispatch(203)) - Error in dispatcher thread
java.util.concurrent.CancellationException
    at java.util.concurrent.FutureTask.report(FutureTask.java:121)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$TimelineEntityDispatcher.dispatchEntities(TimelineV2ClientImpl.java:545)
    at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.putEntities(TimelineV2ClientImpl.java:149)
    at org.apache.hadoop.yarn.server.nodemanager.timelineservice.NMTimelinePublisher.putEntity(NMTimelinePublisher.java:348)
{code}
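For illustration only, here is a minimal JDK-only sketch of one possible way to handle this. It is not the actual patch. The stack trace shows that get() goes through FutureTask, so a plain FutureTask stands in for the queued holder. The names CancelledHolderSketch, PublishException and waitForPublish are hypothetical; PublishException is only a stand-in for YarnException so that the example compiles without Hadoop on the classpath.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

class CancelledHolderSketch {

  /** Hypothetical stand-in for YarnException (assumption, JDK-only). */
  static class PublishException extends Exception {
    PublishException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Same shape as the sync wait quoted above, plus one extra catch. */
  static void waitForPublish(Future<Void> holder) throws PublishException {
    try {
      holder.get();
    } catch (ExecutionException e) {
      throw new PublishException("Failed while publishing entity",
          e.getCause());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new PublishException("Interrupted while publishing entity", e);
    } catch (CancellationException e) {
      // Unchecked; get() throws it once cancel(...) has been called.
      throw new PublishException("Publishing entity was cancelled", e);
    }
  }

  public static void main(String[] args) {
    // A task that is never run, standing in for a holder left in the queue.
    FutureTask<Void> holder = new FutureTask<>(() -> null);
    holder.cancel(true); // what the stop path does to undrained holders

    try {
      waitForPublish(holder);
      System.out.println("unexpected: no exception");
    } catch (PublishException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With the extra catch, the cancellation surfaces as a checked exception that the caller already has to deal with, instead of an unchecked one escaping into the dispatcher thread. Whether to wrap it, log it, or ignore it during shutdown is a design choice for the real fix.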