Posted to mapreduce-issues@hadoop.apache.org by "Jian Fang (JIRA)" <ji...@apache.org> on 2015/02/06 23:21:36 UTC

[jira] [Commented] (MAPREDUCE-5703) Job client gets failure though RM side job execution result is FINISHED and SUCCEEDED

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310047#comment-14310047 ] 

Jian Fang commented on MAPREDUCE-5703:
--------------------------------------

Hi Jian He, could you please respond to this JIRA? We have seen this issue a couple of times in our production clusters. If you could give us a hint on the best way to resolve it, we can work on a fix. Thanks in advance.


> Job client gets failure though RM side job execution result is FINISHED and SUCCEEDED
> -------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5703
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5703
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client
>            Reporter: Ashutosh Jindal
>            Priority: Critical
>
> 1) Run an MR job.
> 2) After the reduce phase has completed and while the JHS file is being written, restart the DN.
> The RM side shows the job as successful.
> The JHS doesn't have any info about the job.
> The job client gets an NPE and exits with code 255.
> java.io.IOException: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
> 	at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getTaskAttemptCompletionEvents(HistoryClientService.java:269)
> 	at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBServiceImpl.java:173)
> 	at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$2.callBlockingMethod(MRClientProtocol.java:283)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:929)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2080)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2076)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2074)
> 	at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:330)
> 	at org.apache.hadoop.mapred.ClientServiceDelegate.getTaskCompletionEvents(ClientServiceDelegate.java:382)
> 	at org.apache.hadoop.mapred.YARNRunner.getTaskCompletionEvents(YARNRunner.java:529)
> 	at org.apache.hadoop.mapreduce.Job$5.run(Job.java:668)
> 	at org.apache.hadoop.mapreduce.Job$5.run(Job.java:665)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> 	at org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:665)
> 	at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1349)
> 	at org.apache.hadoop.mapred.JobClient$NetworkedJob.monitorAndPrintJob(JobClient.java:407)
> 	at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:855)
> 	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:835)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)