Posted to yarn-dev@hadoop.apache.org by "Bibin A Chundatt (JIRA)" <ji...@apache.org> on 2015/06/15 10:55:00 UTC

[jira] [Created] (YARN-3804) Both RM are on standBy state when kerberos user not in yarn.admin.acl

Bibin A Chundatt created YARN-3804:
--------------------------------------

             Summary: Both RM are on standBy state when kerberos user not in yarn.admin.acl
                 Key: YARN-3804
                 URL: https://issues.apache.org/jira/browse/YARN-3804
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
         Environment: Suse 11 Sp3, 2 RM, Secure
            Reporter: Bibin A Chundatt


Steps to reproduce
================
1. Configure the cluster in secure mode
2. On both RMs, configure yarn.admin.acl=dsperf
3. Configure yarn.resourcemanager.principal=yarn
4. Start both RMs

Both RMs remain in Standby indefinitely

{code}

2015-06-15 12:20:21,556 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn     OPERATION=refreshAdminAcls      TARGET=AdminService     RESULT=FAILURE  DESCRIPTION=Unauthorized userPERMISSIONS=
2015-06-15 12:20:21,556 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
        at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:128)
        at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:824)
        at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:420)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:645)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:518)
Caused by: org.apache.hadoop.ha.ServiceFailedException: Can not execute refreshAdminAcls
        at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:297)
        at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:126)
        ... 4 more
Caused by: org.apache.hadoop.yarn.exceptions.YarnException: org.apache.hadoop.security.AccessControlException: User yarn doesn't have permission to call 'refreshAdminAcls'
        at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
        at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkAcls(AdminService.java:230)
        at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAdminAcls(AdminService.java:465)
        at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:295)
        ... 5 more
Caused by: org.apache.hadoop.security.AccessControlException: User yarn doesn't have permission to call 'refreshAdminAcls'
        at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.verifyAdminAccess(RMServerUtils.java:182)
        at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.verifyAdminAccess(RMServerUtils.java:148)
        at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkAccess(AdminService.java:223)
        at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkAcls(AdminService.java:228)
        ... 7 more
{code}



*Analysis*

On each attempt to transition to Active, the RM calls refreshAdminAcls, but the RM user (yarn) is not in yarn.admin.acl, so the call fails with an AccessControlException.
The elector then retries the same transition indefinitely, and {{ActiveStandbyElector#becomeActive()}} returns false every time.
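
To illustrate why retrying cannot help, here is a minimal, self-contained sketch of the admin ACL check. It is not the Hadoop implementation: the class AdminAclSketch and its methods are hypothetical, and it assumes only the documented ACL string format (comma-separated users, a space, then comma-separated groups, with "*" meaning everyone). Group membership is ignored for brevity.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the admin ACL check; NOT the real Hadoop classes.
class AdminAclSketch {

    // Extract the user part of an ACL string such as "user1,user2 group1,group2".
    // A leading space means "no users, groups only", so the string is not trimmed.
    static Set<String> parseUsers(String acl) {
        String users = acl.split(" ", 2)[0];
        return new HashSet<>(Arrays.asList(users.split(",")));
    }

    // True if the user is listed explicitly or the ACL is the wildcard "*".
    // Groups are deliberately ignored in this sketch.
    static boolean isAdmin(String acl, String user) {
        Set<String> users = parseUsers(acl);
        return users.contains("*") || users.contains(user);
    }

    // Simulate repeated transition attempts by the same user against the same ACL
    // and count how many of them are rejected.
    static int failedAttempts(String acl, String user, int attempts) {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            if (!isAdmin(acl, user)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Values taken from the steps to reproduce in this report.
        String acl = "dsperf";
        String rmUser = "yarn";
        System.out.println("isAdmin: " + isAdmin(acl, rmUser));
        System.out.println("failed attempts out of 5: " + failedAttempts(acl, rmUser, 5));
    }
}
```

Because neither the ACL nor the RM user changes between attempts, every attempt in this model is rejected, which matches the repeated failures in the log above.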
 

*Expected*

The RM should receive a shutdown event after a few retries, or even on the first failed attempt,
since the user under which refreshAdminAcls is retried cannot change at runtime, so no retry can ever succeed.
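
The expected behaviour could be sketched as follows. This is only an illustration under stated assumptions, not a proposed patch: BoundedTransitionSketch, Transition, NotAuthorizedException, Outcome and tryBecomeActive are all hypothetical names, and the sketch simply treats an authorization failure as permanent while bounding retries for any other failure.

```java
// Hypothetical sketch of "fail fast or retry a bounded number of times";
// none of these types exist in Hadoop.
class BoundedTransitionSketch {

    // One attempt to transition to Active.
    interface Transition {
        void run() throws Exception;
    }

    // Stand-in for a permanent authorization failure.
    static class NotAuthorizedException extends Exception {
        NotAuthorizedException(String user) {
            super("User " + user + " is not authorized");
        }
    }

    enum Outcome { ACTIVE, SHUTDOWN }

    static Outcome tryBecomeActive(Transition transition, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                transition.run();
                return Outcome.ACTIVE;
            } catch (NotAuthorizedException e) {
                // The user cannot change at runtime, so retrying is pointless.
                return Outcome.SHUTDOWN;
            } catch (Exception e) {
                // Possibly transient failure: fall through and retry.
            }
        }
        // Retry budget exhausted.
        return Outcome.SHUTDOWN;
    }

    public static void main(String[] args) {
        Outcome outcome = tryBecomeActive(() -> {
            throw new NotAuthorizedException("yarn");
        }, 3);
        System.out.println(outcome);
    }
}
```

With this shape, an RM whose user is missing from the admin ACL gives up on the first attempt instead of staying in Standby forever, and other failures stop after maxAttempts.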

*States from commands*

 ./yarn rmadmin -getServiceState rm2
*standby*
 ./yarn rmadmin -getServiceState rm1
*standby*

 ./yarn rmadmin -checkHealth rm1
*echo $? = 0*
 ./yarn rmadmin -checkHealth rm2
*echo $? = 0*



