Posted to notifications@zookeeper.apache.org by GitBox <gi...@apache.org> on 2021/03/01 09:46:14 UTC

[GitHub] [zookeeper] symat commented on pull request #1615: ZOOKEEPER-4220: Redundant connection attempts during leader election if quorum members changed

symat commented on pull request #1615:
URL: https://github.com/apache/zookeeper/pull/1615#issuecomment-787812974


   Unfortunately, it is not really possible to write a clean, non-flaky unit test for this. The difficulty is the asynchronous connection initiation, which makes the problem hard to reproduce. Since https://issues.apache.org/jira/browse/ZOOKEEPER-3756, we always initiate leader election connections asynchronously. Before submitting a new connection initiation thread to the executor, we check whether a thread has already been submitted for the given address. Depending on how the JVM / CPU schedules the threads, we may or may not submit the redundant connection attempt that we are trying to fix here.
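
To illustrate why the outcome depends on timing, here is a minimal, hypothetical sketch of the check-before-submit pattern. It is not the actual QuorumCnxManager code: the class, the `inProgress` set, the method names and the latch are all assumptions made for the example. The latch artificially keeps the first attempt in flight, so the second call is reliably rejected. Without that kind of control, the first attempt may already have finished when the second call arrives, and the redundant attempt gets submitted.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch only; names and structure are illustrative,
// not taken from the ZooKeeper code base.
public class ConnectionGuardSketch {
    // Server ids that currently have a connection attempt in flight.
    private final Set<Long> inProgress = ConcurrentHashMap.newKeySet();
    private final ExecutorService executor = Executors.newCachedThreadPool();
    private final AtomicInteger attempts = new AtomicInteger();

    // Returns true if a new attempt was submitted,
    // false if one was already in flight for this server id.
    public boolean initiateConnectionAsync(long serverId, CountDownLatch release) {
        // Set.add is atomic here: only one caller per id gets "true".
        if (!inProgress.add(serverId)) {
            return false;
        }
        executor.submit(() -> {
            try {
                attempts.incrementAndGet();
                release.await(); // stands in for a slow connect
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                // Once this runs, a later call for the same id is accepted again.
                inProgress.remove(serverId);
            }
        });
        return true;
    }

    public int attempts() {
        return attempts.get();
    }

    public void shutdown() throws InterruptedException {
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        ConnectionGuardSketch sketch = new ConnectionGuardSketch();
        CountDownLatch release = new CountDownLatch(1);
        boolean first = sketch.initiateConnectionAsync(1L, release);
        boolean second = sketch.initiateConnectionAsync(1L, release);
        release.countDown(); // let the pending attempt finish
        sketch.shutdown();
        // prints "true false 1"
        System.out.println(first + " " + second + " " + sketch.attempts());
    }
}
```

The sketch is deterministic only because the latch is passed in from outside. The production code has no such hook, which is why a test of the real behaviour ends up depending on thread scheduling.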
   
   We could introduce a configurable (visible-for-tests-only) sleep at a certain point inside the QuorumCnxManager to make sure we actually hit this problem. But I'm not in favour of complicating the production code this way.
   
   I spent a few hours trying to write a nice test, but I have more or less given up. I think this is a trivial fix, and I can live without testing this edge case. What do you think?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org