You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@zookeeper.apache.org by GitBox <gi...@apache.org> on 2020/06/30 12:53:07 UTC

[GitHub] [zookeeper] Hinterwaeldlers commented on pull request #1391: ZOOKEEPER-3526: data inconsistency due to mistaken TRUNC caused by maxCommittedLog is much less than minCommittedLog when in readonly mode

Hinterwaeldlers commented on pull request #1391:
URL: https://github.com/apache/zookeeper/pull/1391#issuecomment-651771189


   Hi thats correct, I am the same person.
   
   The problem within the regular processing is, that when the quorum is lost and the server switches into the readonly mode, a new ReadOnlyZooKeeper is created, where the zxid is never set and using the initial value of 0.
   Whenever a clients connect and the connection is closed with a timeout, an entry to the log is created with the incremented zxid. As the value between the regular ZooKeeperServer and the ReadOnlyServer are not shared an entry with 0x01 is created and stored in the common zkDatabase.
   Within the further leader election, the commit difference is done based on the maxCommitLog Id. As result, the first leader election will use the previous used commitLogId of the ZooKeeperServer. Assuming no further change happened on both sides, a diff with 0 elements is exchanged and a new epoche with a new commitLogId is started.
   Any further server, joining the quorum will receive the new commitLogId and the diff is done based on the maxCommitLogId. As the value is < 0x100000000 a truncate is performed.
   So far this fits the description done in Jira.
   
   My initial thought was to use a static zxid value, so it is shared between all ZooKeeperServer instances. As multiple servers are running within the unit tests, using a shared zxid is not and option so it must be defined in a common place and forwarded into all instances of the ZooKeeperServer. As far as I've checked the code the QuorumPeer is this point.
   
   As result, all tests using the ZooKeeperServer must provide the zxid.
   One possible solution would be the definition of a constructor, which provide the AtomicLong for the tests cases and removes the need to define it there. The drawback of this solution is the presence of a constructor, which is meant to be used only for testing purpose.
   Therefore I preferred a solution, which keep the current behaviour (using a none shared AtomicLong) and do not introduce test only constructors. As I've found the usage of the same constructor with same parameters in different tests, I didn't thought this has any drawbacks.
   What I've noticed was that some tests make use QuorumPeer so maybe change the visibility from private to protected and make use of it might be an option


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org