Posted to commits@zookeeper.apache.org by sy...@apache.org on 2021/03/09 17:06:22 UTC

[zookeeper] branch branch-3.6 updated: ZOOKEEPER-4220: Potential redundant connection attempts during leader election

This is an automated email from the ASF dual-hosted git repository.

symat pushed a commit to branch branch-3.6
in repository https://gitbox.apache.org/repos/asf/zookeeper.git


The following commit(s) were added to refs/heads/branch-3.6 by this push:
     new 9f39c33  ZOOKEEPER-4220: Potential redundant connection attempts during leader election
9f39c33 is described below

commit 9f39c33cc352be24ab8402d510aadc72849d650e
Author: Mate Szalay-Beko <sy...@apache.org>
AuthorDate: Tue Mar 9 17:05:54 2021 +0000

    ZOOKEEPER-4220: Potential redundant connection attempts during leader election
    
    The server code contains logic that tries to connect to another quorum member based on its
    server ID. We first look up the address assigned to this ID in the last committed quorum
    configuration. If the connection attempt fails (or the server is not known in the committed
    configuration), we then look up the address in the last proposed quorum configuration. But
    we should make the second connection attempt only if the address in the last proposed
    configuration differs from the address in the last committed configuration. Otherwise we
    would just retry the same address that failed a moment before.
    
    The current code has a bug: it compares the address object references (using "!=") instead
    of comparing the objects themselves (using a negated "equals"). In certain edge cases (e.g.
    when the last proposed and last committed addresses are the same, but the address is
    unreachable) this bug can lead to unnecessary connection retries. The correct behaviour is
    to mark the connection attempt as failed and wait, e.g. for the next election round or for
    the other server to come online and initiate a connection to us.
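
    The difference between the two comparisons can be illustrated with a minimal, standalone
    sketch. It is not taken from QuorumCnxManager: the class name, the method names, and the host
    name are hypothetical, and java.net.InetSocketAddress is used only as a stand-in for the
    election address type, which is an assumption made for illustration.

```java
import java.net.InetSocketAddress;

// Hypothetical sketch, not ZooKeeper code: shows why "!=" on two address
// objects can report a difference even when both describe the same endpoint.
public class AddressComparisonSketch {

    // Mirrors the buggy check: true whenever the two references are distinct objects.
    static boolean shouldRetryByReference(InetSocketAddress proposed, InetSocketAddress committed) {
        return proposed != committed;
    }

    // Mirrors the fixed check: true only when the addresses really differ.
    static boolean shouldRetryByEquals(InetSocketAddress proposed, InetSocketAddress committed) {
        return !proposed.equals(committed);
    }

    public static void main(String[] args) {
        // Two separate objects for the same (made-up) host and port.
        // createUnresolved avoids any DNS lookup.
        InetSocketAddress committed = InetSocketAddress.createUnresolved("zk2.example.invalid", 3888);
        InetSocketAddress proposed = InetSocketAddress.createUnresolved("zk2.example.invalid", 3888);

        System.out.println("retry (reference comparison): " + shouldRetryByReference(proposed, committed));
        System.out.println("retry (equals comparison):    " + shouldRetryByEquals(proposed, committed));
    }
}
```

    With the reference comparison, the two distinct objects count as different, so the same
    unreachable address would be tried a second time. With equals, they count as the same
    address and the redundant attempt is skipped.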
    
    Author: Mate Szalay-Beko <sy...@apache.org>
    
    Reviewers: Andor Molnar <an...@apache.org>, Damien Diederen <dd...@crosstwine.com>
    
    Closes #1630 from symat/ZOOKEEPER-4220-branch-3.6
---
 .../main/java/org/apache/zookeeper/server/quorum/QuorumCnxManager.java  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/QuorumCnxManager.java b/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/QuorumCnxManager.java
index fc6ed5f..793c2ee 100644
--- a/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/QuorumCnxManager.java
+++ b/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/QuorumCnxManager.java
@@ -767,7 +767,7 @@ public class QuorumCnxManager {
             if (lastSeenQV != null
                 && lastProposedView.containsKey(sid)
                 && (!knownId
-                    || (lastProposedView.get(sid).electionAddr != lastCommittedView.get(sid).electionAddr))) {
+                    || !lastProposedView.get(sid).electionAddr.equals(lastCommittedView.get(sid).electionAddr))) {
                 knownId = true;
                 LOG.debug("Server {} knows {} already, it is in the lastProposedView", self.getId(), sid);