Posted to issues@zookeeper.apache.org by "Michael Han (Jira)" <ji...@apache.org> on 2020/10/03 02:14:00 UTC

[jira] [Resolved] (ZOOKEEPER-3774) Close quorum socket asynchronously on the leader to avoid ping being blocked by long socket closing time

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Han resolved ZOOKEEPER-3774.
------------------------------------
    Fix Version/s:     (was: 3.7.0)
                   3.6.3
       Resolution: Fixed

Issue resolved by pull request 1301
[https://github.com/apache/zookeeper/pull/1301]

> Close quorum socket asynchronously on the leader to avoid ping being blocked by long socket closing time
> --------------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3774
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3774
>             Project: ZooKeeper
>          Issue Type: Sub-task
>          Components: server
>            Reporter: Jie Huang
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.6.3
>
>          Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> In ZOOKEEPER-3574 we close the quorum sockets on followers asynchronously when a leader is partitioned away so the shutdown process will not be stalled by long socket closing time and the followers can quickly establish a new quorum to serve client requests.
> We've found that a long socket closing time can cause trouble on the leader too when a follower is partitioned away, if the partition is detected by PingLaggingDetector. When the ping thread detects the partition, it tries to disconnect the follower. If the socket closing time is long, the ping thread is blocked and no ping is sent to any follower, even the ones still connected to the leader, since the ping thread is responsible for sending pings to all followers. When followers don't receive pings, they don't send ping responses. When the leader doesn't receive ping responses, the sessions expire.
> To prevent good sessions from expiring, we need to close the socket asynchronously on the leader too.
>  
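
The general idea described above can be illustrated with a minimal sketch. This is not the code from pull request 1301; the class name AsyncCloseSketch, the method closeAsync, and the simulated slow socket are all hypothetical, and the only assumption is that a blocking close() can be handed to a separate thread so that the caller (here standing in for the ping thread) returns immediately.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; names do not come from the ZooKeeper code base.
public class AsyncCloseSketch {

    /**
     * Starts a daemon thread that calls close() on the given resource,
     * so the calling thread is not blocked by a slow close.
     */
    public static Thread closeAsync(final Closeable sock, final String name) {
        Thread t = new Thread(() -> {
            try {
                sock.close();
            } catch (IOException e) {
                System.err.println("Error closing " + name + ": " + e);
            }
        }, "AsyncClose-" + name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        CountDownLatch closed = new CountDownLatch(1);

        // Stand-in for a socket whose close() takes a long time.
        Closeable slowSocket = () -> {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            closed.countDown();
        };

        long start = System.nanoTime();
        closeAsync(slowSocket, "follower-1");
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // The caller is free to keep sending pings while the close finishes.
        System.out.println("caller blocked for about " + elapsedMs + " ms");
        closed.await(5, TimeUnit.SECONDS);
    }
}
```

With a synchronous close, the caller in this sketch would be held up for the full duration of the slow close; handing the close to its own thread keeps that delay off the thread that has to keep pinging the remaining followers.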



--
This message was sent by Atlassian Jira
(v8.3.4#803005)