Posted to common-issues@hadoop.apache.org by "Sanjay Radia (JIRA)" <ji...@apache.org> on 2013/08/16 03:06:48 UTC

[jira] [Commented] (HADOOP-9880) RPC Server should not unconditionally create SaslServer with Token auth.

    [ https://issues.apache.org/jira/browse/HADOOP-9880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741771#comment-13741771 ] 

Sanjay Radia commented on HADOOP-9880:
--------------------------------------

We saw exactly the same error during a test this morning.
The two Jiras that caused this problem are the recent HADOOP-9421 and the earlier HDFS-3083.

HADOOP-9421 improved the SASL protocol.
ZKFC uses Kerberos, but the server side initiates the token-based challenge just in case the client wants to use a token. As part of doing that, the server calls secretManager.checkAvailableForRead(), which fails because the NN is in standby.
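
To make the failure mode concrete, here is a toy Java model. It is not the actual Hadoop code: ToySecretManager, negotiateEagerly and negotiateLazily are hypothetical names, and IllegalStateException stands in for whatever the real check throws. The lazy variant only illustrates the direction the issue title suggests; it is not the committed fix.

```java
public class SaslNegotiationSketch {
    enum AuthMethod { KERBEROS, TOKEN }

    /** Hypothetical stand-in for a secret manager whose availability depends on HA state. */
    static class ToySecretManager {
        private final boolean standby;

        ToySecretManager(boolean standby) {
            this.standby = standby;
        }

        void checkAvailableForRead() {
            if (standby) {
                throw new IllegalStateException("secret manager unavailable: node is in standby");
            }
        }
    }

    /** Eager: prepares the token challenge for every connection, whatever the client uses. */
    static String negotiateEagerly(ToySecretManager sm, AuthMethod clientChoice) {
        sm.checkAvailableForRead(); // runs even for a Kerberos-only client
        return "negotiated " + clientChoice;
    }

    /** Lazy: touches the secret manager only if the client actually picks TOKEN. */
    static String negotiateLazily(ToySecretManager sm, AuthMethod clientChoice) {
        if (clientChoice == AuthMethod.TOKEN) {
            sm.checkAvailableForRead();
        }
        return "negotiated " + clientChoice;
    }

    public static void main(String[] args) {
        ToySecretManager standby = new ToySecretManager(true);
        try {
            negotiateEagerly(standby, AuthMethod.KERBEROS);
        } catch (IllegalStateException e) {
            System.out.println("eager: " + e.getMessage());
        }
        System.out.println("lazy: " + negotiateLazily(standby, AuthMethod.KERBEROS));
    }
}
```

In this model a Kerberos client talking to a standby node fails under the eager path even though it never asks for token auth, which mirrors the ZKFC symptom described above.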

It is really bizarre that there is a check for the server's state (active or standby) as part of SASL. This was introduced in HDFS-3083 to deal with a failover bug. In HDFS-3083, Aaron noted that he did not like the solution: "I'm not in love with this solution, as it leaks abstractions all over the place,". The abstraction layer violation has finally caught up with us.

It turns out that even before Daryn's HADOOP-9421, a similar problem could have occurred if the ZKFC had used Kerberos for the first connection and tokens for any subsequent connections.

An immediate fix is required for what HADOOP-9421 broke, but I believe we also need to revisit the fix that HDFS-3083 introduced: the abstraction layer violations need to be cleaned up.
                
> RPC Server should not unconditionally create SaslServer with Token auth.
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-9880
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9880
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.1.0-beta
>            Reporter: Kihwal Lee
>            Priority: Blocker
>
> buildSaslNegotiateResponse() will create a SaslRpcServer with TOKEN auth. When create() is called against it, secretManager.checkAvailableForRead() is called, which fails in HA standby. Thus HA standby nodes cannot be transitioned to active.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira