Posted to dev@zookeeper.apache.org by "Dan Benediktson (JIRA)" <ji...@apache.org> on 2017/08/07 21:55:00 UTC

[jira] [Commented] (ZOOKEEPER-2869) Allow for exponential backoff in ClientCnxn.SendThread on connection re-establishment

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117378#comment-16117378 ] 

Dan Benediktson commented on ZOOKEEPER-2869:
--------------------------------------------

Our fork has actually solved this problem with a standard jittered exponential backoff algorithm. The problem being addressed there was partly about thundering herds, so jitter was deemed necessary.

I wouldn't mind porting our code and offering a patch for it; anything that gets us closer to upstream is a good thing. However, we really need to take the fix I provided a year ago for ZOOKEEPER-2471 before doing this. Otherwise, allowing a backoff higher than 1 second will dramatically increase the likelihood of clients getting completely wedged in a sleep/retry loop.
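
For readers unfamiliar with the technique, here is a minimal sketch of one common form of jittered exponential backoff ("full jitter"). It is not the code from the fork and not part of ClientCnxn; the class name JitteredBackoff, its methods, and the base/cap parameters are all hypothetical names chosen for illustration. With a base of 1000 ms the first delay falls in [0, 1000), which matches the current client behavior described in the issue.

```java
import java.util.Random;

/**
 * Hypothetical helper, for illustration only. Each delay is drawn uniformly
 * from [0, ceiling), and the ceiling doubles after every attempt until it
 * reaches a cap. The randomization spreads reconnects out so that many
 * clients do not retry at the same instant.
 */
final class JitteredBackoff {
    private final long baseMs;   // ceiling for the first attempt (assumed parameter)
    private final long capMs;    // largest ceiling allowed (assumed parameter)
    private final Random random;
    private long ceilingMs;      // exclusive upper bound for the next delay

    JitteredBackoff(long baseMs, long capMs, Random random) {
        if (baseMs <= 0 || capMs < baseMs) {
            throw new IllegalArgumentException("need 0 < baseMs <= capMs");
        }
        this.baseMs = baseMs;
        this.capMs = capMs;
        this.random = random;
        this.ceilingMs = baseMs;
    }

    /** Returns the next delay in milliseconds and doubles the ceiling, up to the cap. */
    long nextDelayMs() {
        long delay = (long) (random.nextDouble() * ceilingMs);
        // Compare against capMs / 2 so the doubling cannot overflow.
        ceilingMs = (ceilingMs > capMs / 2) ? capMs : ceilingMs * 2;
        return delay;
    }

    /** Exclusive upper bound that the next call to nextDelayMs() will use. */
    long currentCeilingMs() {
        return ceilingMs;
    }

    /** Call after a successful connection so the next failure starts from the base again. */
    void reset() {
        ceilingMs = baseMs;
    }
}
```

A caller would sleep for nextDelayMs() before each reconnect attempt and call reset() once a connection succeeds. Whether the cap and base should be configurable, and how this interacts with ZOOKEEPER-2471, is left open here.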

> Allow for exponential backoff in ClientCnxn.SendThread on connection re-establishment
> -------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2869
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2869
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: java client
>    Affects Versions: 3.4.10, 3.5.3
>            Reporter: Nick Travers
>            Priority: Minor
>
> As part of ZOOKEEPER-961, when the client re-establishes a connection to the server, it will sleep for a random number of milliseconds in the range [0, 1000). This was introduced [here|https://github.com/apache/zookeeper/commit/d84dc077d576b7cdfbfd003e3425fab85ca29a44].
> These reconnects can cause excessive logging in clients if the server is unavailable for an extended period of time, with reconnects every 500ms on average.
> One solution could be to allow for exponential backoff in the client. The backoff params could be made configurable.
> [3.5.x code|https://github.com/apache/zookeeper/blob/release-3.5.3/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1059].
> [3.4.x code|https://github.com/apache/zookeeper/blob/release-3.4.9/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1051].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)