Posted to yarn-issues@hadoop.apache.org by "Omkar Vinit Joshi (JIRA)" <ji...@apache.org> on 2013/08/29 02:51:52 UTC

[jira] [Commented] (YARN-1107) Restart secure RM with recovery enabled while oozie jobs are running causes the RM to fail during startup

    [ https://issues.apache.org/jira/browse/YARN-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753129#comment-13753129 ] 

Omkar Vinit Joshi commented on YARN-1107:
-----------------------------------------

The underlying problem is in the code below, where we bypass the RPC call if it is a local call. However, we were only updating localServiceAddress and localSecretManager in the ClientRMService.startService call, so they were presumably not yet set when the check ran during recovery. To fix this, we now do that update inside serviceInit. This makes the reasonable assumption that the RM address is static, i.e. that the port in particular is specified in the config.

{code}
    private static ApplicationClientProtocol getRmClient(Token<?> token,
        Configuration conf) {
      InetSocketAddress addr = SecurityUtil.getTokenServiceAddr(token);
      // localSecretManager and localServiceAddress are only set once
      // ClientRMService has registered them; until then every call goes via RPC.
      if (localSecretManager != null) {
        // return null if it's our token
        if (localServiceAddress.getAddress().isAnyLocalAddress()) {
          // RM is bound to the wildcard address: match any local address
          // on the same port.
          if (NetUtils.isLocalAddress(addr.getAddress()) &&
              addr.getPort() == localServiceAddress.getPort()) {
            return null;
          }
        } else if (addr.equals(localServiceAddress)) {
          return null;
        }
      }
      // Not a local token (or local state not set yet): go through RPC.
      final YarnRPC rpc = YarnRPC.create(conf);
      return (ApplicationClientProtocol) rpc.getProxy(
          ApplicationClientProtocol.class, addr, conf);
    }
{code}
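
To make the ordering issue concrete, here is a minimal standalone sketch of the same local-call check using only java.net. It is not the actual patch: the class LocalCallCheck and its method names are hypothetical, isLocalAddress is a simplified stand-in for NetUtils.isLocalAddress, and a null localServiceAddress is used to model the state before ClientRMService has registered its address.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NetworkInterface;
import java.net.SocketException;

// Hypothetical helper, for illustration only; not part of Hadoop.
final class LocalCallCheck {

  // Simplified stand-in for NetUtils.isLocalAddress (an assumption):
  // wildcard, loopback, or an address bound to a local interface.
  static boolean isLocalAddress(InetAddress addr) {
    if (addr == null) {
      return false;
    }
    if (addr.isAnyLocalAddress() || addr.isLoopbackAddress()) {
      return true;
    }
    try {
      return NetworkInterface.getByInetAddress(addr) != null;
    } catch (SocketException e) {
      return false;
    }
  }

  // Mirrors the branch structure of getRmClient: true means "our own token,
  // skip RPC". A null localServiceAddress models the not-yet-registered state.
  static boolean isLocalCall(InetSocketAddress tokenAddr,
      InetSocketAddress localServiceAddress) {
    if (localServiceAddress == null) {
      return false; // nothing registered yet, so the caller would use RPC
    }
    InetAddress local = localServiceAddress.getAddress();
    if (local != null && local.isAnyLocalAddress()) {
      // Bound to the wildcard address: any local address on the same port.
      return isLocalAddress(tokenAddr.getAddress())
          && tokenAddr.getPort() == localServiceAddress.getPort();
    }
    return tokenAddr.equals(localServiceAddress);
  }

  public static void main(String[] args) {
    InetSocketAddress tokenAddr = new InetSocketAddress("127.0.0.1", 8032);
    InetSocketAddress wildcard = new InetSocketAddress("0.0.0.0", 8032);
    // Before registration the check cannot recognise the token as local.
    System.out.println(isLocalCall(tokenAddr, null));
    // After registration the same token is treated as local.
    System.out.println(isLocalCall(tokenAddr, wildcard));
  }
}
```

With the address registered in serviceInit, the second case applies by the time recovery needs the check, which is why a statically configured port matters.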
                
> Restart secure RM with recovery enabled while oozie jobs are running causes the RM to fail during startup
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-1107
>                 URL: https://issues.apache.org/jira/browse/YARN-1107
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.1.0-beta
>            Reporter: Arpit Gupta
>            Assignee: Vinod Kumar Vavilapalli
>            Priority: Blocker
>         Attachments: rm.log
>
>
> If a secure RM with recovery enabled is restarted while Oozie jobs are running, the RM fails to come up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira