Posted to yarn-issues@hadoop.apache.org by "Bibin A Chundatt (JIRA)" <ji...@apache.org> on 2015/06/15 10:56:01 UTC
[jira] [Commented] (YARN-3804) Both RM are on standBy state when
kerberos user not in yarn.admin.acl
[ https://issues.apache.org/jira/browse/YARN-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585660#comment-14585660 ]
Bibin A Chundatt commented on YARN-3804:
----------------------------------------
Can we check for AccessControlException in {{ActiveStandbyElector#becomeActive()}} and send an event to shut down the RM?
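A minimal sketch of such a check, using stand-in types rather than the real Hadoop classes: walk the cause chain of the failure and treat an access-control error as fatal instead of retryable. The class name {{FatalCauseCheck}}, the helper {{hasCause}}, and the {{AccessDeniedException}} stand-in are hypothetical and only for illustration; in the RM the type to look for would be org.apache.hadoop.security.AccessControlException, which the stack trace below shows nested under YarnException and ServiceFailedException.

```java
public class FatalCauseCheck {

    /** Hypothetical stand-in for org.apache.hadoop.security.AccessControlException. */
    static class AccessDeniedException extends Exception {
        AccessDeniedException(String msg) {
            super(msg);
        }
    }

    /** Returns true if the throwable or any of its causes is an instance of the given type. */
    static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++) {
            if (type.isInstance(t)) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }

    public static void main(String[] args) {
        // Mimic the nesting seen in the log: two wrappers around the ACL failure.
        Exception root = new AccessDeniedException(
                "User yarn doesn't have permission to call 'refreshAdminAcls'");
        Exception wrapped = new RuntimeException("RM could not transition to Active",
                new RuntimeException("Can not execute refreshAdminAcls", root));

        if (hasCause(wrapped, AccessDeniedException.class)) {
            // Retrying cannot help here, so this is where a shutdown event would be raised.
            System.out.println("fatal: request shutdown");
        } else {
            System.out.println("transient: retry the election");
        }
    }
}
```

Whether the check belongs in the elector itself or in the RM callback that it invokes is left open here; the sketch only shows how the nested exception could be detected.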
> Both RM are on standBy state when kerberos user not in yarn.admin.acl
> ---------------------------------------------------------------------
>
> Key: YARN-3804
> URL: https://issues.apache.org/jira/browse/YARN-3804
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Environment: Suse 11 Sp3, 2 RM, Secure
> Reporter: Bibin A Chundatt
>
> Steps to reproduce
> ================
> 1. Configure the cluster in secure mode
> 2. On the RM, configure yarn.admin.acl=dsperf
> 3. Configure yarn.resourcemanager.principal=yarn
> 4. Start both RMs
> Both RMs will stay in Standby forever
> {code}
> 2015-06-15 12:20:21,556 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn OPERATION=refreshAdminAcls TARGET=AdminService RESULT=FAILURE DESCRIPTION=Unauthorized userPERMISSIONS=
> 2015-06-15 12:20:21,556 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:128)
> at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:824)
> at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:420)
> at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:645)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:518)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Can not execute refreshAdminAcls
> at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:297)
> at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:126)
> ... 4 more
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: org.apache.hadoop.security.AccessControlException: User yarn doesn't have permission to call 'refreshAdminAcls'
> at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
> at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkAcls(AdminService.java:230)
> at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAdminAcls(AdminService.java:465)
> at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:295)
> ... 5 more
> Caused by: org.apache.hadoop.security.AccessControlException: User yarn doesn't have permission to call 'refreshAdminAcls'
> at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.verifyAdminAccess(RMServerUtils.java:182)
> at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.verifyAdminAccess(RMServerUtils.java:148)
> at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkAccess(AdminService.java:223)
> at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkAcls(AdminService.java:228)
> ... 7 more
> {code}
> *Analysis*
> On each attempt by an RM to switch to Active, refreshAdminAcls is called, but the user does not have the ACL permission for it.
> The same switch to Active is retried indefinitely, and {{ActiveStandbyElector#becomeActive()}} always returns false.
>
> *Expected*
> The RM should get a shutdown event after a few retries, or even at the first attempt,
> since the user under which it retries refreshAdminAcls can never be updated at runtime.
> *States from commands*
> ./yarn rmadmin -getServiceState rm2
> *standby*
> ./yarn rmadmin -getServiceState rm1
> *standby*
> ./yarn rmadmin -checkHealth rm1
> *echo $? = 0*
> ./yarn rmadmin -checkHealth rm2
> *echo $? = 0*
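The expected behaviour described in the issue (give up after a few retries instead of retrying forever) could look roughly like the sketch below. It is not the elector's actual code: {{BoundedActivation}}, {{tryBecomeActive}}, and the attempt limit are hypothetical names and values chosen for illustration, and the transition is modelled as a plain boolean callback.

```java
import java.util.function.BooleanSupplier;

public class BoundedActivation {

    /**
     * Runs the transition up to maxAttempts times.
     * Returns true as soon as one attempt succeeds; false means every
     * attempt failed and the caller should shut down rather than retry.
     */
    static boolean tryBecomeActive(BooleanSupplier transition, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (transition.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A transition that always fails, like the ACL failure in the issue.
        boolean active = tryBecomeActive(() -> false, 3);
        if (!active) {
            // Hypothetical reaction: raise a shutdown event instead of rejoining the election.
            System.out.println("giving up after 3 attempts: request shutdown");
        }
    }
}
```

With a limit of 1 this reduces to shutting down on the first failure, which is the other option the issue mentions.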