Posted to yarn-issues@hadoop.apache.org by "Tao Yang (JIRA)" <ji...@apache.org> on 2018/03/29 04:50:00 UTC

[jira] [Created] (YARN-8085) RMContext#resourceProfilesManager is lost after RM went standby then back to active

Tao Yang created YARN-8085:
------------------------------

             Summary: RMContext#resourceProfilesManager is lost after RM went standby then back to active
                 Key: YARN-8085
                 URL: https://issues.apache.org/jira/browse/YARN-8085
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacityscheduler
    Affects Versions: 3.2.0
            Reporter: Tao Yang
            Assignee: Tao Yang


We submitted a distributed shell application after the RM had failed over to standby and then become active again, and got the following NullPointerException in the RM log:
{noformat}
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getResourceProfiles(ClientRMService.java:1814)
        at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getResourceProfiles(ApplicationClientProtocolPBServiceImpl.java:657)
        at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:617)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
{noformat}

The cause is that resourceProfilesManager is currently not transferred to the new RMContext instance in ResourceManager#resetRMContext, so the new context is left with a null reference. We should transfer it to fix this error.
{code:java}
@@ -1488,6 +1488,10 @@ private void resetRMContext() {
     // transfer service context to new RM service Context
     rmContextImpl.setServiceContext(rmContext.getServiceContext());

+    // transfer resource profiles manager
+    rmContextImpl
+        .setResourceProfilesManager(rmContext.getResourceProfilesManager());
+
     // reset dispatcher
     Dispatcher dispatcher = setupDispatcher();
     ((Service) dispatcher).init(this.conf);
{code}
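
To illustrate the failure mode outside of YARN, here is a minimal, self-contained sketch. All class and method names in it (ContextResetSketch, Context, ProfilesManager, reset) are hypothetical stand-ins and not the real RMContextImpl or ResourceManager code; it only assumes that the reset builds a fresh context object and copies selected fields from the old one.

```java
public class ContextResetSketch {

    /** Hypothetical stand-in for a manager that is created once at service init. */
    static final class ProfilesManager {
    }

    /** Hypothetical stand-in for the RM context; not the real RMContextImpl. */
    static final class Context {
        private ProfilesManager profilesManager;

        ProfilesManager getProfilesManager() {
            return profilesManager;
        }

        void setProfilesManager(ProfilesManager manager) {
            this.profilesManager = manager;
        }
    }

    /**
     * Builds a fresh context, as a reset on a standby transition might.
     * The manager is copied over only when transfer is true.
     */
    static Context reset(Context old, boolean transfer) {
        Context fresh = new Context();
        if (transfer) {
            // The step the patch adds: carry the manager over to the new context.
            fresh.setProfilesManager(old.getProfilesManager());
        }
        return fresh;
    }

    public static void main(String[] args) {
        Context active = new Context();
        active.setProfilesManager(new ProfilesManager());

        // Without the transfer, the field is null and any later
        // dereference of it would throw a NullPointerException.
        Context broken = reset(active, false);
        System.out.println("manager lost without transfer: "
            + (broken.getProfilesManager() == null));

        // With the transfer, the new context shares the original manager.
        Context fixed = reset(active, true);
        System.out.println("manager kept with transfer: "
            + (fixed.getProfilesManager() == active.getProfilesManager()));
    }
}
```

In this sketch the manager instance is shared rather than recreated, which mirrors how the patch passes the existing reference to the new context.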


