You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-dev@hadoop.apache.org by "Eric Badger (JIRA)" <ji...@apache.org> on 2017/08/28 18:28:00 UTC
[jira] [Created] (YARN-7114) NM can fail during shutdown with log
aggregation
Eric Badger created YARN-7114:
---------------------------------
Summary: NM can fail during shutdown with log aggregation
Key: YARN-7114
URL: https://issues.apache.org/jira/browse/YARN-7114
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.8.1
Reporter: Eric Badger
{noformat}
2017-08-24 16:36:35,961 [AsyncDispatcher event handler] WARN application.ApplicationImpl: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at FINISHING_CONTAINERS_WAIT
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:458)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:63)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1314)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1306)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
at java.lang.Thread.run(Thread.java:745)
2017-08-24 16:36:35,962 [AsyncDispatcher event handler] INFO application.ApplicationImpl: Application application_1502220952225_46598 transitioned from FINISHING_CONTAINERS_WAIT to null
2017-08-24 16:36:36,056 [AsyncDispatcher event handler] WARN application.ApplicationImpl: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at FINISHING_CONTAINERS_WAIT
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:458)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:63)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1314)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1306)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
at java.lang.Thread.run(Thread.java:745)
{noformat}
This was caused by doing an RM restart that increased its version. The NM's version was unchanged and so it was kicked out of the cluster during registration. The NM then did log aggregation and failed when it finished, since log aggregation was never called for (it was forced by the shutdown). The failure was seen in 2.8, but I believe that this problem also exists in 2.9 and trunk
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-dev-help@hadoop.apache.org