Posted to dev@zookeeper.apache.org by anmolnar <gi...@git.apache.org> on 2017/12/01 15:45:56 UTC

[GitHub] zookeeper issue #300: ZOOKEEPER-2807: Flaky test: org.apache.zookeeper.test....

Github user anmolnar commented on the issue:

    https://github.com/apache/zookeeper/pull/300
  
    @afine Generally speaking, I like the idea of using LinkedBlockingQueue's intrinsic lock to wait until the queue becomes empty, but in this particular case I think it's possible that committedRequests will never be empty if the leader is constantly sending commit requests.
    
    Correct me if I'm wrong please (there's a very good chance that I completely misunderstand something), but my feeling is that the following situation is possible:
    1. Learner starts syncing with the leader in the syncWithLeader() method,
    2. Learner blocks and waits for all commits to be processed before finishing the sync,
    3. FollowerZookeeperServer is already running and keeps receiving commits from the Leader, including non-syncing ones,
    4. Learner will never be notified, or only at some point much later than when the sync actually completes, or well before that.
    
    To address this, if we could get the number of commits that we must wait for before proceeding, we could implement a CountDownLatch in CommitProcessor and wait for the number of commits expected during the sync process. However, that does not guarantee that we have received all sync-related commits either.
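
    A rough sketch of the latch idea, only to make the discussion concrete. SyncCommitLatch and its method names are hypothetical and not existing ZooKeeper code, and the sketch assumes the expected commit count is somehow known up front:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not part of ZooKeeper): lets the syncing thread wait
// until a known number of commits has been processed.
class SyncCommitLatch {
    private final CountDownLatch latch;

    // expectedCommits is an assumption: the learner would have to learn
    // this number from somewhere before the sync starts.
    SyncCommitLatch(int expectedCommits) {
        this.latch = new CountDownLatch(expectedCommits);
    }

    // Would be called by the commit-processing thread after each commit.
    // Once the count reaches zero, further calls are no-ops.
    void commitProcessed() {
        latch.countDown();
    }

    // Would be called by the syncing thread. Returns false on timeout, so
    // the caller cannot block forever the way an "until empty" wait could.
    boolean awaitSyncCommits(long timeout, TimeUnit unit) throws InterruptedException {
        return latch.await(timeout, unit);
    }

    long remaining() {
        return latch.getCount();
    }
}
```

    Note that the latch counts every commit the same way. It cannot tell sync-related commits from the others that arrive in the meantime, which is exactly the weakness mentioned above.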
    
    Otherwise I could also agree with @shralex in the Jira saying: "Intuitively this may not be the right place for such a fix - this probably should be higher level - **making sure that follower does not even accept local ops before properly completing the sync.** Even if you drain the committedRequests, I'm not sure that guarantees that there are no more that will arrive."
    
    That would be the best solution here in my opinion.
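
    To illustrate what "higher level" could mean, here is a minimal sketch of a gate that refuses local ops until the sync is marked complete. SyncGate, markSyncComplete() and tryAcceptLocalOp() are made-up names for illustration, not how the follower is actually structured:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical gate (not existing ZooKeeper code): local operations are
// rejected until the sync with the leader has been marked complete.
class SyncGate {
    private final AtomicBoolean syncComplete = new AtomicBoolean(false);

    // Would be called once, at the end of the sync with the leader.
    void markSyncComplete() {
        syncComplete.set(true);
    }

    // Runs the op and returns true only if the sync has completed;
    // otherwise returns false without running it.
    boolean tryAcceptLocalOp(Runnable op) {
        if (!syncComplete.get()) {
            return false;
        }
        op.run();
        return true;
    }
}
```

    With something like this there is no need to drain committedRequests at all, because nothing local is accepted while the sync is still in progress.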


---