You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-issues@hadoop.apache.org by "Zian Chen (JIRA)" <ji...@apache.org> on 2018/07/17 04:50:00 UTC

[jira] [Commented] (YARN-8522) Application fails with InvalidResourceRequestException

    [ https://issues.apache.org/jira/browse/YARN-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16546015#comment-16546015 ] 

Zian Chen commented on YARN-8522:
---------------------------------

Root cause for this issue is with proxy retry, this piece of code inside ApplicationSubmissionContextPBImpl#mergeLocalToBuilder will be called twice which will make duplicate resource request. Need to make ApplicationSubmissionContextPBImpl#getProto synchronized to avoid this race condition.  
{code:java}
if (this.amResourceRequests != null) {
  builder.clearAmContainerResourceRequest();
  builder.addAllAmContainerResourceRequest(
      convertToProtoFormat(this.amResourceRequests));
}
{code}

> Application fails with InvalidResourceRequestException
> ------------------------------------------------------
>
>                 Key: YARN-8522
>                 URL: https://issues.apache.org/jira/browse/YARN-8522
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Yesha Vora
>            Assignee: Zian Chen
>            Priority: Major
>
> Launch multiple streaming app simultaneously. Here, sometimes one of the application fails with below stack trace.
> {code}
> 18/07/02 07:14:32 INFO retry.RetryInvocationHandler: java.net.ConnectException: Call From xx.xx.xx.xx/xx.xx.xx.xx to xx.xx.xx.xx:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused, while invoking ApplicationClientProtocolPBClientImpl.submitApplication over null. Retrying after sleeping for 30000ms.
> 18/07/02 07:14:32 WARN client.RequestHedgingRMFailoverProxyProvider: Invocation returned exception: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, only one resource request with * is allowed
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:502)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:320)
>         at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:645)
>         at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:277)
>         at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:563)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
>  on [rm2], so propagating back to caller.
> 18/07/02 07:14:32 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/hrt_qa/.staging/job_1530515284077_0007
> 18/07/02 07:14:32 ERROR streaming.StreamJob: Error Launching job : org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, only one resource request with * is allowed
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:502)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:389)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:320)
>         at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:645)
>         at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:277)
>         at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:563)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> Streaming Command Failed!{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org