Posted to dev@curator.apache.org by "Enrico Olivelli (Jira)" <ji...@apache.org> on 2021/07/15 10:30:00 UTC

[jira] [Assigned] (CURATOR-599) Hanging indefinitely on some scenarios since zookeeper.request.timeout cannot be configured (add support for ZKClientConfig)

     [ https://issues.apache.org/jira/browse/CURATOR-599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enrico Olivelli reassigned CURATOR-599:
---------------------------------------

    Assignee: Enrico Olivelli

> Hanging indefinitely on some scenarios since zookeeper.request.timeout cannot be configured (add support for ZKClientConfig)
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CURATOR-599
>                 URL: https://issues.apache.org/jira/browse/CURATOR-599
>             Project: Apache Curator
>          Issue Type: Improvement
>          Components: Client
>    Affects Versions: 5.1.0
>            Reporter: Liran Mendelovich
>            Assignee: Enrico Olivelli
>            Priority: Major
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> On some executions where the ZooKeeper server is not available, the Curator client waits and hangs indefinitely, with the thread dump stack trace shown below.
> As this does not reproduce consistently, it appears to be a race condition in the Curator/ZooKeeper client; the wait is unbounded because zookeeper.request.timeout cannot be configured in the Curator client.
> As a workaround, initialization is executed in a separate thread so that it can be interrupted if it hangs. This has been identified and handled here:
> [join_while_zookeeper_down_issue|https://github.com/CiscoSE/commons-cluster/blob/main/docs/join_while_zookeeper_down_issue.txt]
> The desired solution is to expose configuration for zookeeper.request.timeout, so that the client waits only until the request timeout, which is handled in org.apache.zookeeper.ClientCnxn.submitRequest().
>  
> stacktrace size: 31
> java.lang.Object.wait(Native Method)
> java.lang.Object.wait(Object.java:502)
> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1561)
> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1533)
> org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:1834)
> org.apache.curator.framework.imps.CreateBuilderImpl$16.call(CreateBuilderImpl.java:1131)
> org.apache.curator.framework.imps.CreateBuilderImpl$16.call(CreateBuilderImpl.java:1113)
> org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:1110)
> org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:593)
> org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:583)
> org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:48)
> org.apache.curator.x.discovery.details.ServiceDiscoveryImpl.internalRegisterService(ServiceDiscoveryImpl.java:237)
> org.apache.curator.x.discovery.details.ServiceDiscoveryImpl.reRegisterServices(ServiceDiscoveryImpl.java:456)
> org.apache.curator.x.discovery.details.ServiceDiscoveryImpl.start(ServiceDiscoveryImpl.java:135)
> ...
>  
>  
>  
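
The workaround mentioned in the description (running initialization in a separate thread so it can be interrupted if it hangs) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the linked commons-cluster repository: the class name BoundedInit and the method runWithTimeout are hypothetical, and the blocking task in main merely simulates a thread stuck in Object.wait(), as in the stack trace above. No Curator or ZooKeeper API is used.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs a possibly hanging task on its own thread
// and interrupts it if it does not finish within the given time.
public class BoundedInit {

    public static <T> T runWithTimeout(Callable<T> task, long timeoutMillis) throws Exception {
        // Daemon thread, so a task that ignores interruption cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-init");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // cancel(true) interrupts the worker; a thread blocked in
            // Object.wait() wakes up with InterruptedException.
            future.cancel(true);
            throw e;
        } catch (ExecutionException e) {
            // Surface the task's own failure instead of the wrapper.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        Object lock = new Object();
        try {
            // Stand-in for an initialization call that never returns.
            runWithTimeout(() -> {
                synchronized (lock) {
                    lock.wait();
                }
                return null;
            }, 200);
        } catch (TimeoutException e) {
            System.out.println("initialization timed out and was interrupted");
        }
    }
}
```

In real use the Callable would wrap the initialization call that hangs (in the trace above, the service discovery start), and the caller would decide whether to retry or fail once the TimeoutException is thrown. Interruption only helps when the blocked call responds to it, which a wait in Object.wait() does.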



--
This message was sent by Atlassian Jira
(v8.3.4#803005)