Posted to dev@curator.apache.org by "Liran Mendelovich (Jira)" <ji...@apache.org> on 2021/06/29 12:57:00 UTC

[jira] [Created] (CURATOR-599) Hanging indefinitely on some scenarios since zookeeper.request.timeout cannot be configured

Liran Mendelovich created CURATOR-599:
-----------------------------------------

             Summary: Hanging indefinitely on some scenarios since zookeeper.request.timeout cannot be configured
                 Key: CURATOR-599
                 URL: https://issues.apache.org/jira/browse/CURATOR-599
             Project: Apache Curator
          Issue Type: Improvement
          Components: Client
    Affects Versions: 5.1.0
            Reporter: Liran Mendelovich


In some runs where the ZooKeeper server is unavailable, the Curator client blocks and hangs indefinitely. The thread dump stack trace is shown below.
Since this does not reproduce consistently, it appears to be a race condition in the Curator/ZooKeeper client, and the wait never ends because zookeeper.request.timeout cannot be configured in the Curator client.
As a workaround, initialization is executed in a separate thread so that it can be interrupted if it hangs. This has been identified and handled here:

[join_while_zookeeper_down_issue|https://github.com/CiscoSE/commons-cluster/blob/main/docs/join_while_zookeeper_down_issue.txt]
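
The workaround described above can be sketched roughly as follows. This is a minimal illustration using only JDK classes, not the code from the linked repository; the class name BoundedStart, the method runWithTimeout, and the timeout values are hypothetical. It relies on the fact that a thread blocked in Object.wait(), as in the stack trace, is woken by an interrupt.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedStart {

    /**
     * Runs a possibly blocking start-up action on a separate thread and
     * interrupts it if it has not finished within timeoutMillis.
     * Returns true if the action completed, false if it timed out.
     */
    public static boolean runWithTimeout(Callable<?> startAction, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-start");
            t.setDaemon(true); // a stuck start-up must not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(startAction);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // cancel(true) interrupts the worker, which wakes a thread
            // blocked in Object.wait() with an InterruptedException.
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The action itself failed; surface the original cause.
            throw new IllegalStateException("start-up action failed", e.getCause());
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Object lock = new Object();
        // Stand-in for a start-up call that never returns on its own.
        boolean finished = runWithTimeout(() -> {
            synchronized (lock) {
                lock.wait();
            }
            return null;
        }, 200);
        System.out.println("finished=" + finished);
    }
}
```

With Curator, the action passed in would presumably wrap the blocking call from the stack trace, for example `() -> { serviceDiscovery.start(); return null; }`, where serviceDiscovery is an assumed ServiceDiscovery instance. The caller still has to decide what to do after a timeout (retry, fail fast, or report the node as not joined).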

The desired solution is to expose a configuration option for zookeeper.request.timeout. The client would then wait only until the request timeout expires, which is handled in org.apache.zookeeper.ClientCnxn.submitRequest().

 

stacktrace size: 31
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:502)
org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1561)
org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1533)
org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:1834)
org.apache.curator.framework.imps.CreateBuilderImpl$16.call(CreateBuilderImpl.java:1131)
org.apache.curator.framework.imps.CreateBuilderImpl$16.call(CreateBuilderImpl.java:1113)
org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:1110)
org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:593)
org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:583)
org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:48)
org.apache.curator.x.discovery.details.ServiceDiscoveryImpl.internalRegisterService(ServiceDiscoveryImpl.java:237)
org.apache.curator.x.discovery.details.ServiceDiscoveryImpl.reRegisterServices(ServiceDiscoveryImpl.java:456)
org.apache.curator.x.discovery.details.ServiceDiscoveryImpl.start(ServiceDiscoveryImpl.java:135)
...

--
This message was sent by Atlassian Jira
(v8.3.4#803005)