Posted to mapreduce-issues@hadoop.apache.org by "Miklos Szegedi (JIRA)" <ji...@apache.org> on 2018/01/02 18:30:07 UTC
[jira] [Commented] (MAPREDUCE-7028) Concurrent task progress updates causing NPE in Application Master
[ https://issues.apache.org/jira/browse/MAPREDUCE-7028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16308466#comment-16308466 ]
Miklos Szegedi commented on MAPREDUCE-7028:
-------------------------------------------
Thank you for the comment, [~jlowe]. Indeed the lock does not help in all cases.
Thank you for the patch [~grepas].
{code}
taskAttemptStatus.fetchFailedMaps.addAll(
    taskAttemptStatus.fetchFailedMaps);
taskAttemptStatus.fetchFailedMaps.addAll(
    lastStatus.fetchFailedMaps);
{code}
The two lists are added in the opposite of their time order: lastStatus holds the earlier update, so I would add its fetchFailedMaps first and the new ones second. Also, the asyncUpdateNeeded update can move outside the loop.
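To illustrate both suggestions, here is a minimal sketch rather than the code in the patch. It assumes the retry loop is a compareAndSet loop on an AtomicReference and that fetchFailedMaps is never null. The names StatusCoalescer, Status and coalesce are hypothetical stand-ins, and Status models only the one field discussed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class StatusCoalescer {
  /** Hypothetical stand-in for TaskAttemptStatus; only the field discussed here. */
  static final class Status {
    final List<String> fetchFailedMaps;

    Status(List<String> fetchFailedMaps) {
      this.fetchFailedMaps = fetchFailedMaps;
    }
  }

  /**
   * Merges an incoming update into the pending status.
   * Returns true if nothing was pending, i.e. the caller would need to
   * schedule an asynchronous update (assumed meaning of asyncUpdateNeeded).
   */
  static boolean coalesce(AtomicReference<Status> pending, Status update) {
    Status last;
    Status merged;
    do {
      last = pending.get();
      if (last == null) {
        merged = update;
      } else {
        // Keep time order: entries from the earlier status first,
        // entries from the new update second.
        List<String> all = new ArrayList<>(
            last.fetchFailedMaps.size() + update.fetchFailedMaps.size());
        all.addAll(last.fetchFailedMaps);
        all.addAll(update.fetchFailedMaps);
        merged = new Status(all);
      }
    } while (!pending.compareAndSet(last, merged));

    // Decided once, after the loop has succeeded, not on every retry.
    boolean asyncUpdateNeeded = (last == null);
    return asyncUpdateNeeded;
  }
}
```

In this sketch each retry builds a fresh merged object instead of mutating a shared list, so a failed compareAndSet leaves no partial state behind.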
> Concurrent task progress updates causing NPE in Application Master
> ------------------------------------------------------------------
>
> Key: MAPREDUCE-7028
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7028
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am
> Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4, 2.7.6
> Reporter: Gergo Repas
> Assignee: Gergo Repas
> Attachments: MAPREDUCE-7028.000.patch, MAPREDUCE-7028.001.patch, MAPREDUCE-7028.002.patch
>
>
> Concurrent task progress updates can cause a NullPointerException in the Application Master (stack trace is with code at current trunk):
> {quote}
> 2017-12-20 06:49:42,369 INFO [IPC Server handler 9 on 39501] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1513780867907_0001_m_000002_0 is : 0.02677883
> 2017-12-20 06:49:42,369 INFO [IPC Server handler 13 on 39501] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1513780867907_0001_m_000002_0 is : 0.02677883
> 2017-12-20 06:49:42,383 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.lang.NullPointerException
> at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$StatusUpdater.transition(TaskAttemptImpl.java:2450)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$StatusUpdater.transition(TaskAttemptImpl.java:2433)
> at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1362)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:154)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1543)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1535)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
> at java.lang.Thread.run(Thread.java:748)
> 2017-12-20 06:49:42,385 INFO [IPC Server handler 13 on 39501] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1513780867907_0001_m_000002_0 is : 0.02677883
> 2017-12-20 06:49:42,386 INFO [AsyncDispatcher ShutDown handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
> {quote}
> This happened naturally in several big wordcount runs, and I could reproduce this reliably by artificially making task updates more frequent.