Posted to mapreduce-issues@hadoop.apache.org by "Jerry Chen (JIRA)" <ji...@apache.org> on 2012/12/11 08:53:23 UTC
[jira] [Commented] (MAPREDUCE-4816) JobImpl Invalid event: JOB_TASK_ATTEMPT_COMPLETED at FAILED
[ https://issues.apache.org/jira/browse/MAPREDUCE-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528775#comment-13528775 ]
Jerry Chen commented on MAPREDUCE-4816:
---------------------------------------
Hi Jason, I looked into this issue. The exception occurs because a job in the FAILED state should ignore further JOB_TASK_ATTEMPT_COMPLETED events, but in version 0.23.5 the transition from the FAILED state on a JOB_TASK_ATTEMPT_COMPLETED event is not declared in the state machine, so the event causes the exception to be thrown.
Besides JOB_TASK_ATTEMPT_COMPLETED, other events such as JOB_TASK_COMPLETED and JOB_MAP_TASK_RESCHEDULED can also arrive in the FAILED state and should likewise be declared in the state machine.
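To illustrate the idea, here is a minimal, self-contained sketch of a table-driven state machine. It is not the Hadoop StateMachineFactory or JobImpl code; the class JobStateMachineSketch, its State and EventType enums, and the addTransition/handle methods are hypothetical names made up for this example. It only shows that an undeclared (state, event) pair raises an error, while declaring a FAILED -> FAILED self-transition for the late events makes them harmless.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Hypothetical toy state machine, NOT the Hadoop implementation.
public class JobStateMachineSketch {
    public enum State { RUNNING, FAILED }

    // Event names mirror those mentioned in this comment; the enum itself is illustrative.
    public enum EventType {
        JOB_TASK_ATTEMPT_COMPLETED, JOB_TASK_COMPLETED, JOB_MAP_TASK_RESCHEDULED
    }

    // Transition table: current state -> (event -> next state).
    private final Map<State, Map<EventType, State>> transitions =
            new EnumMap<State, Map<EventType, State>>(State.class);
    private State current;

    public JobStateMachineSketch(State initial) {
        this.current = initial;
    }

    // Declares that every event in 'events' moves the machine from 'from' to 'to'.
    public JobStateMachineSketch addTransition(State from, State to, EnumSet<EventType> events) {
        Map<EventType, State> row = transitions.get(from);
        if (row == null) {
            row = new EnumMap<EventType, State>(EventType.class);
            transitions.put(from, row);
        }
        for (EventType e : events) {
            row.put(e, to);
        }
        return this;
    }

    // Applies an event; an undeclared (state, event) pair is treated as an error.
    public State handle(EventType event) {
        Map<EventType, State> row = transitions.get(current);
        if (row == null || !row.containsKey(event)) {
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
        current = row.get(event);
        return current;
    }

    public static void main(String[] args) {
        // Without a declared transition, a late event at FAILED is rejected.
        JobStateMachineSketch broken = new JobStateMachineSketch(State.FAILED);
        try {
            broken.handle(EventType.JOB_TASK_ATTEMPT_COMPLETED);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // With a FAILED -> FAILED self-transition, the same events are ignored.
        JobStateMachineSketch fixed = new JobStateMachineSketch(State.FAILED)
                .addTransition(State.FAILED, State.FAILED, EnumSet.of(
                        EventType.JOB_TASK_ATTEMPT_COMPLETED,
                        EventType.JOB_TASK_COMPLETED,
                        EventType.JOB_MAP_TASK_RESCHEDULED));
        System.out.println(fixed.handle(EventType.JOB_TASK_ATTEMPT_COMPLETED));
    }
}
```

The self-transition keeps the job in FAILED and performs no action, which matches the "ignore further events" behavior described above.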
I checked trunk. It appears that a refactoring of JobState into JobStateInternal has been done there, and it already fixes the transition problem described above.
If necessary, I can backport the transition fixes to versions such as 2.0.2-alpha.
> JobImpl Invalid event: JOB_TASK_ATTEMPT_COMPLETED at FAILED
> -----------------------------------------------------------
>
> Key: MAPREDUCE-4816
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4816
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: applicationmaster
> Affects Versions: 0.23.5
> Reporter: Jason Lowe
>
> Saw this in an AM log of a task that had failed:
> {noformat}
> 2012-11-21 23:26:44,533 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: JOB_TASK_ATTEMPT_COMPLETED at FAILED
> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
> at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:690)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:113)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:904)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:900)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
> at java.lang.Thread.run(Thread.java:619)
> {noformat}