Posted to mapreduce-issues@hadoop.apache.org by "Sangjin Lee (JIRA)" <ji...@apache.org> on 2014/09/16 06:02:34 UTC

[jira] [Assigned] (MAPREDUCE-6091) YARNRunner.getJobStatus() fails with ApplicationNotFoundException if the job rolled off the RM view

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sangjin Lee reassigned MAPREDUCE-6091:
--------------------------------------

    Assignee: Sangjin Lee

> YARNRunner.getJobStatus() fails with ApplicationNotFoundException if the job rolled off the RM view
> ---------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6091
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6091
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.1.0-beta
>            Reporter: Sangjin Lee
>            Assignee: Sangjin Lee
>
> If you call YARNRunner.getJobStatus() for a job that has rolled off the RM view, the call fails with an ApplicationNotFoundException. For example:
> {noformat}
> 2014-09-15 07:09:51,084 ERROR org.apache.pig.tools.grunt.Grunt: ERROR 6017: JobID: job_1410289045532_90542 Reason: java.io.IOException: org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1410289045532_90542' doesn't exist in RM.
> 	at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:288)
> 	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:150)
> 	at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:337)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2058)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2054)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1547)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2052)
> 	at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:348)
> 	at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:419)
> 	at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:559)
> 	at org.apache.hadoop.mapreduce.Job$1.run(Job.java:314)
> 	at org.apache.hadoop.mapreduce.Job$1.run(Job.java:311)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1547)
> 	at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311)
> 	at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:599)
> 	at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.checkRunningState(ControlledJob.java:257)
> 	at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.checkState(ControlledJob.java:282)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.pig.backend.hadoop23.PigJobControl.checkState(PigJobControl.java:120)
> 	at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:180)
> 	at java.lang.Thread.run(Thread.java:662)
> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:279)
> {noformat}
> Prior to 2.1.0, the call would fall back to the job history server and get the status from there.
> This regression appears to have been introduced by YARN-873. YARN-873 changed ClientRMService to throw an ApplicationNotFoundException on an unknown app id (previously it returned null), but MR's ClientServiceDelegate was never updated to handle the exception.
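
The fallback described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual patch and not Hadoop code: StatusFallbackSketch, AppNotFoundException, StatusSource and getJobStatus are hypothetical stand-ins, and the stand-in exception is unchecked only to keep the example short. The idea is to treat "application not found" from the RM the same way the old null return was treated, and ask the job history server instead.

```java
public class StatusFallbackSketch {

    // Hypothetical stand-in for YARN's ApplicationNotFoundException.
    // Unchecked here purely to keep the sketch short.
    static class AppNotFoundException extends RuntimeException {
        AppNotFoundException(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for anything that can report a job status
    // (the RM or the job history server).
    interface StatusSource {
        String getStatus(String jobId);
    }

    // Ask the RM first; if the RM no longer knows the application,
    // fall back to the history server instead of failing.
    static String getJobStatus(String jobId, StatusSource rm, StatusSource historyServer) {
        String status;
        try {
            status = rm.getStatus(jobId);
        } catch (AppNotFoundException e) {
            // Treat "not found" like the old null return value.
            status = null;
        }
        if (status != null) {
            return status;
        }
        return historyServer.getStatus(jobId);
    }

    public static void main(String[] args) {
        // Simulated RM that has already dropped the application.
        StatusSource rm = jobId -> {
            throw new AppNotFoundException("Application for " + jobId + " doesn't exist in RM");
        };
        // Simulated history server that still has the finished job.
        StatusSource historyServer = jobId -> "SUCCEEDED";

        System.out.println(getJobStatus("job_1410289045532_90542", rm, historyServer));
    }
}
```

With the simulated sources in main, the RM lookup throws and the status comes from the history server, so the program prints SUCCEEDED.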


