Posted to dev@zookeeper.apache.org by "Hari Krishna Dara (JIRA)" <ji...@apache.org> on 2017/01/17 11:35:26 UTC

[jira] [Created] (ZOOKEEPER-2667) NPE in the patch for ZOOKEEPER-2139 when multiple connections are made

Hari Krishna Dara created ZOOKEEPER-2667:
--------------------------------------------

             Summary: NPE in the patch for ZOOKEEPER-2139 when multiple connections are made
                 Key: ZOOKEEPER-2667
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2667
             Project: ZooKeeper
          Issue Type: Bug
          Components: java client
    Affects Versions: 3.5.2, 3.6.0
            Reporter: Hari Krishna Dara


ZOOKEEPER-2139 added support for connecting to multiple ZK services, but it also introduced a bug that causes a cryptic NPE. The client sees error messages like the following:

{noformat}
Exception while trying to create SASL client: java.lang.NullPointerException
SASL authentication with Zookeeper Quorum member failed: javax.security.sasl.SaslException: saslClient failed to initialize properly: it's null.
Error while calling watcher
java.lang.NullPointerException
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:581)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:532)
        at org.apache.hadoop.hbase.zookeeper.PendingWatcher.process(PendingWatcher.java:40)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:579)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:554)
{noformat}

The frame at {{ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:581)}} points to the {{switch}} line below, where {{event.getState()}} returns {{null}}:

{noformat}
private void connectionEvent(WatchedEvent event) {
    switch(event.getState()) {
       case SyncConnected:
{noformat}

However, the event's state is {{null}} only because of a couple of other bugs, in particular an earlier NPE that is mentioned in the log without a stack trace (the "Exception while trying to create SASL client" line). This first NPE leaves the event incorrectly initialized, which in turn causes the second NPE, the one with the stack trace.
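
The second NPE can be reproduced in isolation, since a Java {{switch}} on a {{null}} enum reference throws {{NullPointerException}} before any {{case}} is evaluated. The following is a minimal sketch of that behavior only; {{DemoState}} and {{NullSwitchDemo}} are hypothetical stand-ins, not the real ZooKeeper or HBase types:

```java
// Hypothetical stand-in for the event state enum; not the real ZooKeeper type.
enum DemoState { SyncConnected, Disconnected }

class NullSwitchDemo {
    // Mirrors the shape of connectionEvent(): switch directly on the state.
    static String describe(DemoState state) {
        // If state is null, the switch itself throws NullPointerException.
        switch (state) {
            case SyncConnected:
                return "connected";
            default:
                return "other";
        }
    }

    // Returns true if switching on a null state raises an NPE.
    static boolean throwsOnNull() {
        try {
            describe(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```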

The first NPE comes from this code in {{ZookeeperSaslClient}}:

{noformat}
            if (!initializedLogin) {
                ...
            }
            Subject subject = login.getSubject();
{noformat}

Before the patch for ZOOKEEPER-2139, both {{login}} and {{initializedLogin}} were {{static}} fields of {{ZookeeperSaslClient}}. To support multiple ZK clients, {{login}} was changed from a {{static}} field to an instance field, but {{initializedLogin}} was left {{static}}. As a result, once the first client has logged in, every subsequent client sees {{initializedLogin}} already set, skips the login step, and dereferences its own still-{{null}} {{login}} field, which causes the NPE.
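
The interaction between the shared flag and the per-instance field can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual {{ZookeeperSaslClient}} code: {{SketchLogin}}, {{BuggySaslClientSketch}} and {{subject()}} are hypothetical names, and the real login logic is reduced to constructing a stub object.

```java
// Hypothetical stub standing in for the real Login class.
class SketchLogin {
    String getSubject() {
        return "subject";
    }
}

// Simplified model of the bug: the flag is static, the login object is not.
class BuggySaslClientSketch {
    // Shared by every instance (this is the field that was left static).
    private static boolean initializedLogin = false;
    // Per instance (this is the field that was made non-static).
    private SketchLogin login = null;

    String subject() {
        if (!initializedLogin) {
            login = new SketchLogin();
            initializedLogin = true;
        }
        // On any instance after the first, login is still null here,
        // so this call throws NullPointerException.
        return login.getSubject();
    }
}
```

In this sketch the first instance initializes its own {{login}} and flips the shared flag; a second instance then skips initialization and fails on the {{login.getSubject()}} call.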

At its core, the fix is simply to change {{initializedLogin}} to an instance variable, but we have also made a few additional changes to improve the logging and the state handling. I will attach a patch soon.
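
As a rough sketch of the core change only (it is not the patch itself and omits the logging and state-handling changes), making the flag an instance field lets each client initialize its own login. {{FixedSketchLogin}} and {{FixedSaslClientSketch}} are hypothetical names used for illustration:

```java
// Hypothetical stub standing in for the real Login class.
class FixedSketchLogin {
    String getSubject() {
        return "subject";
    }
}

// Simplified model of the fix: both fields are now per instance.
class FixedSaslClientSketch {
    private boolean initializedLogin = false; // no longer static
    private FixedSketchLogin login = null;

    String subject() {
        if (!initializedLogin) {
            login = new FixedSketchLogin();
            initializedLogin = true;
        }
        // Each instance has run its own initialization, so login is non-null.
        return login.getSubject();
    }
}
```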
