You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zookeeper.apache.org by "Daniel Wong (Jira)" <ji...@apache.org> on 2021/03/08 21:31:00 UTC

[jira] [Created] (ZOOKEEPER-4236) Java Client SendThread create many unnecessary Login objects

Daniel Wong created ZOOKEEPER-4236:
--------------------------------------

             Summary: Java Client SendThread create many unnecessary Login objects
                 Key: ZOOKEEPER-4236
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4236
             Project: ZooKeeper
          Issue Type: Bug
            Reporter: Daniel Wong


Hi I am an Apache Phoenix committer and I help manage many many zookeeper clusters at my employment primarily using ZK for HBase use cases.  We recently had a production incident where some of our ACLs were not setup preventing connectivity from the client to the ZK nodes and the failure path exposed 2 issues to fix. This Jira and ZooKeeper-4235.  This Jira is the less important of the 2 and handles numerous objects.  We had hundreds of threads per JVM with the following stack trace.  
{code:java}
java.lang.Thread.State: RUNNABLE at java.net.PlainSocketImpl.socketConnect(java.base@11.0.4.0.101/Native Method) at java.net.AbstractPlainSocketImpl.doConnect(java.base@11.0.4.0.101/AbstractPlainSocketImpl.java:399) - locked <0x00000015004fde20> (a java.net.SocksSocketImpl) at java.net.AbstractPlainSocketImpl.connectToAddress(java.base@11.0.4.0.101/AbstractPlainSocketImpl.java:242) at java.net.AbstractPlainSocketImpl.connect(java.base@11.0.4.0.101/AbstractPlainSocketImpl.java:224) at java.net.SocksSocketImpl.connect(java.base@11.0.4.0.101/SocksSocketImpl.java:403) at java.net.Socket.connect(java.base@11.0.4.0.101/Socket.java:609) at sun.security.krb5.internal.TCPClient.<init>(java.security.jgss@11.0.4.0.101/NetClient.java:62) at sun.security.krb5.internal.NetClient.getInstance(java.security.jgss@11.0.4.0.101/NetClient.java:42) at sun.security.krb5.KdcComm$KdcCommunication.run(java.security.jgss@11.0.4.0.101/KdcComm.java:401) at sun.security.krb5.KdcComm$KdcCommunication.run(java.security.jgss@11.0.4.0.101/KdcComm.java:364) at java.security.AccessController.doPrivileged(java.base@11.0.4.0.101/Native Method) at sun.security.krb5.KdcComm.send(java.security.jgss@11.0.4.0.101/KdcComm.java:348) at sun.security.krb5.KdcComm.sendIfPossible(java.security.jgss@11.0.4.0.101/KdcComm.java:253) at sun.security.krb5.KdcComm.send(java.security.jgss@11.0.4.0.101/KdcComm.java:234) at sun.security.krb5.KdcComm.send(java.security.jgss@11.0.4.0.101/KdcComm.java:200) at sun.security.krb5.KrbAsReqBuilder.send(java.security.jgss@11.0.4.0.101/KrbAsReqBuilder.java:326) at sun.security.krb5.KrbAsReqBuilder.action(java.security.jgss@11.0.4.0.101/KrbAsReqBuilder.java:371) at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(jdk.security.auth@11.0.4.0.101/Krb5LoginModule.java:754) at com.sun.security.auth.module.Krb5LoginModule.login(jdk.security.auth@11.0.4.0.101/Krb5LoginModule.java:592) at javax.security.auth.login.LoginContext.invoke(java.base@11.0.4.0.101/LoginContext.java:726) at javax.security.auth.login.LoginContext$4.run(java.base@11.0.4.0.101/LoginContext.java:665) at javax.security.auth.login.LoginContext$4.run(java.base@11.0.4.0.101/LoginContext.java:663) at java.security.AccessController.doPrivileged(java.base@11.0.4.0.101/Native Method) at javax.security.auth.login.LoginContext.invokePriv(java.base@11.0.4.0.101/LoginContext.java:663) at javax.security.auth.login.LoginContext.login(java.base@11.0.4.0.101/LoginContext.java:574) at org.apache.zookeeper.Login.login(Login.java:304) - locked <0x000000151c477148> (a org.apache.zookeeper.Login) at org.apache.zookeeper.Login.<init>(Login.java:106) at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslClient(ZooKeeperSaslClient.java:249) - locked <0x000000151c476f68> (a org.apache.zookeeper.client.ZooKeeperSaslClient) at org.apache.zookeeper.client.ZooKeeperSaslClient.<init>(ZooKeeperSaslClient.java:141) at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:972) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1031)
{code}
Note that these were logging in to our 10 ZK nodes but we had 100s of Logins.  In theory we  should only need at most 10 Logins.  

This Jira is intended to improve the behavior in limiting the number of Login objects/clients to the needed number.  Note that a combination of JIRAs https://issues.apache.org/jira/browse/ZOOKEEPER-2375 and https://issues.apache.org/jira/browse/ZOOKEEPER-2139  removed the singleton at the Login level but left in unnecessary synchronization code.  This could be again improved via either a singleton perhaps at the SaslClient layer or some sort of connection -> login cache so that new connections would reuse/wait for the same objects in failure paths.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)