Posted to mapreduce-issues@hadoop.apache.org by "Jian Fang (JIRA)" <ji...@apache.org> on 2015/02/06 23:21:36 UTC
[jira] [Commented] (MAPREDUCE-5703) Job client gets failure though RM side job execution result is FINISHED and SUCCEEDED
[ https://issues.apache.org/jira/browse/MAPREDUCE-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310047#comment-14310047 ]
Jian Fang commented on MAPREDUCE-5703:
--------------------------------------
Hi Jian He, could you please respond to this JIRA? We have seen this issue a couple of times in our production clusters. If you could give us a hint on the best way to resolve it, we can work on a fix. Thanks in advance.
> Job client gets failure though RM side job execution result is FINISHED and SUCCEEDED
> -------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5703
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5703
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client
> Reporter: Ashutosh Jindal
> Priority: Critical
>
> 1) Run an MR job.
> 2) After the reduce phase completes and while the JHS file is being written, restart the DN.
> The RM side shows the job as successful.
> The JHS doesn't have any info about the job.
> The job client gets an NPE and exits with code 255.
> java.io.IOException: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
> at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getTaskAttemptCompletionEvents(HistoryClientService.java:269)
> at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBServiceImpl.java:173)
> at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$2.callBlockingMethod(MRClientProtocol.java:283)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:929)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2080)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2076)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2074)
> at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:330)
> at org.apache.hadoop.mapred.ClientServiceDelegate.getTaskCompletionEvents(ClientServiceDelegate.java:382)
> at org.apache.hadoop.mapred.YARNRunner.getTaskCompletionEvents(YARNRunner.java:529)
> at org.apache.hadoop.mapreduce.Job$5.run(Job.java:668)
> at org.apache.hadoop.mapreduce.Job$5.run(Job.java:665)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:665)
> at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1349)
> at org.apache.hadoop.mapred.JobClient$NetworkedJob.monitorAndPrintJob(JobClient.java:407)
> at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:855)
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:835)
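For discussion, here is a minimal sketch of one possible client-side direction: treat the task completion events as best-effort output and let the final status reported by the RM decide the exit code. It is not the fix adopted for this issue, and none of the types are Hadoop APIs. `FinalStatus`, `EventSource`, and `exitCodeFor` are hypothetical names introduced only for illustration.

```java
import java.io.IOException;

/** Hypothetical sketch; none of these types are Hadoop APIs. */
public class CompletionFallbackSketch {

    /** Hypothetical stand-in for the final state reported by the RM. */
    enum FinalStatus { SUCCEEDED, FAILED, KILLED, UNDEFINED }

    /** Hypothetical stand-in for the history-server call that fails in the trace. */
    interface EventSource {
        String[] fetchCompletionEvents() throws IOException;
    }

    /**
     * Returns a process exit code. Completion events are printed when they
     * are available; if fetching them fails, the RM's final status alone
     * decides the result.
     */
    static int exitCodeFor(FinalStatus rmStatus, EventSource history) {
        try {
            String[] events = history.fetchCompletionEvents();
            if (events != null) {
                for (String event : events) {
                    System.out.println(event);
                }
            }
        } catch (IOException e) {
            // Assumption: missing history is not, by itself, a job failure.
            System.err.println("Could not fetch completion events: " + e.getMessage());
        }
        return rmStatus == FinalStatus.SUCCEEDED ? 0 : 1;
    }

    public static void main(String[] args) {
        // Simulates the history server answering with a remote NPE.
        EventSource broken = () -> {
            throw new IOException("java.lang.NullPointerException");
        };
        System.out.println(exitCodeFor(FinalStatus.SUCCEEDED, broken));
        System.out.println(exitCodeFor(FinalStatus.FAILED, broken));
    }
}
```

Whether the real client should swallow the error, retry against the JHS, or fail in some other way is exactly the open question in this issue; the sketch only illustrates the first option.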
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)