Posted to yarn-issues@hadoop.apache.org by "Wangda Tan (JIRA)" <ji...@apache.org> on 2014/11/05 21:04:34 UTC

[jira] [Commented] (YARN-2805) RM2 in HA setup tries to login using the RM1's kerberos principal

    [ https://issues.apache.org/jira/browse/YARN-2805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14198973#comment-14198973 ] 

Wangda Tan commented on YARN-2805:
----------------------------------

I investigated this issue; it is caused by YARN-2795. As [~xgong] pointed out, the correct behavior is to set up the HA configuration before logging in.
I have uploaded a fix and tested it on an HA + security cluster.
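
To illustrate why the ordering matters, here is a minimal, self-contained sketch. It does not use the real ResourceManager or Hadoop classes. The key names (rm.ha.id, rm.principal), the host names, and the helper methods are all hypothetical, and it assumes that the HA setup step copies a per-RM value onto the plain key that login reads. The exact settings involved in this issue may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "set up HA configuration before login".
// None of these names come from Hadoop; they are for illustration only.
class HaLoginOrderSketch {
    static final String RM_ID_KEY = "rm.ha.id";          // hypothetical key
    static final String PRINCIPAL_KEY = "rm.principal";  // hypothetical key

    // Assumed HA setup step: copy the value stored under "<key>.<rm-id>"
    // onto the plain key, so later readers see this RM's own value.
    public static void applyHaOverrides(Map<String, String> conf) {
        String rmId = conf.get(RM_ID_KEY);
        if (rmId == null) {
            return; // not an HA setup, nothing to resolve
        }
        String perRmValue = conf.get(PRINCIPAL_KEY + "." + rmId);
        if (perRmValue != null) {
            conf.put(PRINCIPAL_KEY, perRmValue);
        }
    }

    // Stand-in for the login step: it only reads the plain key.
    public static String principalForLogin(Map<String, String> conf) {
        return conf.get(PRINCIPAL_KEY);
    }

    // Configuration as seen by the second RM; the plain key still holds
    // the first RM's value until the HA overrides are applied.
    public static Map<String, String> sampleConf() {
        Map<String, String> conf = new HashMap<>();
        conf.put(RM_ID_KEY, "rm2");
        conf.put(PRINCIPAL_KEY, "rm/host1@EXAMPLE.COM");
        conf.put(PRINCIPAL_KEY + ".rm1", "rm/host1@EXAMPLE.COM");
        conf.put(PRINCIPAL_KEY + ".rm2", "rm/host2@EXAMPLE.COM");
        return conf;
    }

    public static void main(String[] args) {
        // Wrong order: login reads the config before HA setup has run.
        Map<String, String> loginFirst = sampleConf();
        System.out.println("login before HA setup: " + principalForLogin(loginFirst));

        // Intended order: HA setup first, then login.
        Map<String, String> haFirst = sampleConf();
        applyHaOverrides(haFirst);
        System.out.println("login after HA setup:  " + principalForLogin(haFirst));
    }
}
```

In this model, logging in before the HA step makes the second RM pick up the first RM's principal, which matches the symptom in the issue title; running the HA step first gives each RM its own value.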

Without the patch, one of the RMs always fails to start.
With the patch, both RMs start and one of them goes to the standby state.
I stopped and started the RMs; the active/standby transitions were as expected.
I submitted an MR job to the cluster, and it completed successfully.

Please kindly review.

Thanks,
Wangda

> RM2 in HA setup tries to login using the RM1's kerberos principal
> -----------------------------------------------------------------
>
>                 Key: YARN-2805
>                 URL: https://issues.apache.org/jira/browse/YARN-2805
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Arpit Gupta
>            Assignee: Wangda Tan
>            Priority: Blocker
>         Attachments: YARN-2805.1.patch
>
>
> {code}
> 2014-11-04 08:41:08,705 INFO  resourcemanager.ResourceManager (SignalLogger.java:register(91)) - registered UNIX signal handlers for [TERM, HUP, INT]
> 2014-11-04 08:41:10,636 INFO  service.AbstractService (AbstractService.java:noteFailure(272)) - Service ResourceManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to login
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to login
> 	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:211)
> 	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> 	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1229)
> Caused by: java.io.IOException: Login failure for rm/IP@EXAMPLE.COM from keytab /etc/security/keytabs/rm.service.keytab: javax.security.auth.login.LoginException: Unable to obtain password from user
> 	at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:935)
> {code}


