Posted to dev@zookeeper.apache.org by "Daniel Wong (Jira)" <ji...@apache.org> on 2021/03/08 21:15:00 UTC

[jira] [Created] (ZOOKEEPER-4235) Java Client SendThread does not clean up created objects during constructor of SaslClient and Login.

Daniel Wong created ZOOKEEPER-4235:
--------------------------------------

             Summary: Java Client SendThread does not clean up created objects during constructor of SaslClient and Login. 
                 Key: ZOOKEEPER-4235
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4235
             Project: ZooKeeper
          Issue Type: Bug
          Components: java client
            Reporter: Daniel Wong


Hi, I am an Apache Phoenix committer, and at my employer I help manage many ZooKeeper clusters, primarily for HBase use cases.  We recently had a production incident in which some of our ACLs were not set up, preventing clients from connecting to the ZK nodes, and the failure path exposed 2 issues to fix: this Jira and ZooKeeper-XXXX (TBD).  This Jira is the more important of the 2. It covers the failure we observed: an FD/thread leak from the ZK Java client send thread.  We had hundreds of threads per JVM with the following stack trace.


{code:java}
java.lang.Thread.State: RUNNABLE
    at java.net.PlainSocketImpl.socketConnect(java.base@11.0.4.0.101/Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(java.base@11.0.4.0.101/AbstractPlainSocketImpl.java:399)
    - locked <0x00000015004fde20> (a java.net.SocksSocketImpl)
    at java.net.AbstractPlainSocketImpl.connectToAddress(java.base@11.0.4.0.101/AbstractPlainSocketImpl.java:242)
    at java.net.AbstractPlainSocketImpl.connect(java.base@11.0.4.0.101/AbstractPlainSocketImpl.java:224)
    at java.net.SocksSocketImpl.connect(java.base@11.0.4.0.101/SocksSocketImpl.java:403)
    at java.net.Socket.connect(java.base@11.0.4.0.101/Socket.java:609)
    at sun.security.krb5.internal.TCPClient.<init>(java.security.jgss@11.0.4.0.101/NetClient.java:62)
    at sun.security.krb5.internal.NetClient.getInstance(java.security.jgss@11.0.4.0.101/NetClient.java:42)
    at sun.security.krb5.KdcComm$KdcCommunication.run(java.security.jgss@11.0.4.0.101/KdcComm.java:401)
    at sun.security.krb5.KdcComm$KdcCommunication.run(java.security.jgss@11.0.4.0.101/KdcComm.java:364)
    at java.security.AccessController.doPrivileged(java.base@11.0.4.0.101/Native Method)
    at sun.security.krb5.KdcComm.send(java.security.jgss@11.0.4.0.101/KdcComm.java:348)
    at sun.security.krb5.KdcComm.sendIfPossible(java.security.jgss@11.0.4.0.101/KdcComm.java:253)
    at sun.security.krb5.KdcComm.send(java.security.jgss@11.0.4.0.101/KdcComm.java:234)
    at sun.security.krb5.KdcComm.send(java.security.jgss@11.0.4.0.101/KdcComm.java:200)
    at sun.security.krb5.KrbAsReqBuilder.send(java.security.jgss@11.0.4.0.101/KrbAsReqBuilder.java:326)
    at sun.security.krb5.KrbAsReqBuilder.action(java.security.jgss@11.0.4.0.101/KrbAsReqBuilder.java:371)
    at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(jdk.security.auth@11.0.4.0.101/Krb5LoginModule.java:754)
    at com.sun.security.auth.module.Krb5LoginModule.login(jdk.security.auth@11.0.4.0.101/Krb5LoginModule.java:592)
    at javax.security.auth.login.LoginContext.invoke(java.base@11.0.4.0.101/LoginContext.java:726)
    at javax.security.auth.login.LoginContext$4.run(java.base@11.0.4.0.101/LoginContext.java:665)
    at javax.security.auth.login.LoginContext$4.run(java.base@11.0.4.0.101/LoginContext.java:663)
    at java.security.AccessController.doPrivileged(java.base@11.0.4.0.101/Native Method)
    at javax.security.auth.login.LoginContext.invokePriv(java.base@11.0.4.0.101/LoginContext.java:663)
    at javax.security.auth.login.LoginContext.login(java.base@11.0.4.0.101/LoginContext.java:574)
    at org.apache.zookeeper.Login.login(Login.java:304)
    - locked <0x000000151c477148> (a org.apache.zookeeper.Login)
    at org.apache.zookeeper.Login.<init>(Login.java:106)
    at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslClient(ZooKeeperSaslClient.java:249)
    - locked <0x000000151c476f68> (a org.apache.zookeeper.client.ZooKeeperSaslClient)
    at org.apache.zookeeper.client.ZooKeeperSaslClient.<init>(ZooKeeperSaslClient.java:141)
    at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:972)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1031)
{code}
Note that today both ZooKeeperSaslClient and Login allocate resources in their constructors.  While a constructor is still running, the parent's reference to the object is still null, so close/shutdown/disconnect on the parent cannot clean up or interrupt the work in progress.  This leaves the threads/sockets at the mercy of the configured KDC retry/timeout settings.

This Jira is intended to split construction and the initialization path into separate methods and to properly clean up the resulting objects.
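
To illustrate the idea, here is a minimal sketch of separating construction from initialization.  It is not the actual patch and not ZooKeeper code: TwoPhaseLogin, initialize() and the demo class are hypothetical names, and Thread.sleep merely stands in for the blocking login call.  Whether a real blocking call responds to Thread.interrupt() depends on that call, so a real fix may instead need to close the underlying resource.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a Login/SaslClient-style helper; not a ZooKeeper class.
final class TwoPhaseLogin implements Closeable {
    private volatile Thread initThread; // thread currently inside initialize(), if any
    private volatile boolean closed;

    // Phase 1: the constructor allocates nothing and cannot block,
    // so the owner holds a usable reference before any slow work starts.
    TwoPhaseLogin() {
    }

    // Phase 2: the slow step. Thread.sleep stands in for the blocking login call.
    // Returns true if initialization finished, false if it was cancelled.
    boolean initialize(long simulatedWorkMillis) {
        initThread = Thread.currentThread();
        try {
            if (closed) {
                return false; // close() ran before the slow step started
            }
            Thread.sleep(simulatedWorkMillis);
            return !closed;
        } catch (InterruptedException e) {
            return false; // interrupted during the slow step; treated as cancelled
        } finally {
            initThread = null;
        }
    }

    // Reachable by the owner at any time, including while initialize() is running.
    @Override
    public void close() {
        closed = true;
        Thread t = initThread;
        if (t != null) {
            t.interrupt();
        }
    }
}

final class TwoPhaseLoginDemo {
    public static void main(String[] args) throws InterruptedException {
        TwoPhaseLogin login = new TwoPhaseLogin(); // reference exists up front
        AtomicBoolean initialized = new AtomicBoolean(true);
        Thread sendThread = new Thread(() -> initialized.set(login.initialize(60_000)));
        sendThread.start();

        Thread.sleep(200);      // give the slow step time to begin
        login.close();          // the owner can now cancel it
        sendThread.join(5_000);
        System.out.println("initialized=" + initialized.get()
                + ", threadAlive=" + sendThread.isAlive());
    }
}
```

Because the owner holds the reference before initialize() starts, close() can reach the object whether the slow step has not started, is in progress, or has finished.  With the work done inside the constructor there would be no reference to call close() on until the constructor returned.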
