Posted to dev@zookeeper.apache.org by "Fangmin Lv (Jira)" <ji...@apache.org> on 2019/12/20 08:23:00 UTC

[jira] [Created] (ZOOKEEPER-3658) Potential data inconsistency due to txns gap in committedLog when ZkDB not fully shutdown

Fangmin Lv created ZOOKEEPER-3658:
-------------------------------------

             Summary: Potential data inconsistency due to txns gap in committedLog when ZkDB not fully shutdown
                 Key: ZOOKEEPER-3658
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3658
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.5.6, 3.6.0
            Reporter: Fangmin Lv
            Assignee: Fangmin Lv


During a DIFF sync, the txns are applied to the learner's DataTree but are not added to the in-memory committed txns cache (committedLog) in ZkDatabase. If this server later becomes the new leader and other servers try to sync with it, this may cause data inconsistency because part of the txns are missing from that cache.
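
To make the gap concrete, here is a minimal toy model of the scenario described above. It is only a sketch: ToyDb, commit, applyFromDiffSync, diffFor and missingFromDiff are hypothetical names invented for illustration, not ZooKeeper's actual classes or methods, and the leader-side DIFF logic is heavily simplified.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only: hypothetical classes, not ZooKeeper's ZKDatabase or DataTree.
class CommittedLogGapSketch {

    static class ToyDb {
        // Stand-in for the DataTree: zxid -> payload of every applied txn.
        final TreeMap<Long, String> dataTree = new TreeMap<>();
        // Stand-in for the in-memory committed txns cache.
        final TreeMap<Long, String> committedLog = new TreeMap<>();

        // Normal commit path: apply to the tree and record in the cache.
        void commit(long zxid, String txn) {
            dataTree.put(zxid, txn);
            committedLog.put(zxid, txn);
        }

        // DIFF sync path as described in the report: the tree is updated, the cache is not.
        void applyFromDiffSync(long zxid, String txn) {
            dataTree.put(zxid, txn);
        }

        // Simplified leader-side DIFF: every cached txn newer than the peer's last zxid.
        List<Long> diffFor(long peerLastZxid) {
            return new ArrayList<>(committedLog.tailMap(peerLastZxid, false).keySet());
        }

        // Zxids applied to the tree after peerLastZxid that the cache cannot serve.
        List<Long> missingFromDiff(long peerLastZxid) {
            List<Long> missing = new ArrayList<>(dataTree.tailMap(peerLastZxid, false).keySet());
            missing.removeAll(diffFor(peerLastZxid));
            return missing;
        }
    }

    // Txns 3 and 4 arrive through DIFF sync while this server is a learner;
    // txn 5 is committed normally afterwards.
    static ToyDb scenario() {
        ToyDb db = new ToyDb();
        db.commit(1, "create /a");
        db.commit(2, "create /b");
        db.applyFromDiffSync(3, "set /a");
        db.applyFromDiffSync(4, "delete /b");
        db.commit(5, "create /c");
        return db;
    }

    public static void main(String[] args) {
        ToyDb db = scenario();
        // A peer whose last zxid is 2 asks this server (now leader) for a DIFF.
        System.out.println("diff sent to peer at zxid 2: " + db.diffFor(2));           // [5]
        System.out.println("txns the peer never receives: " + db.missingFromDiff(2));  // [3, 4]
    }
}
```

In this toy run the peer ends up with txns 1, 2 and 5 but never sees 3 and 4, which is the kind of divergence the report describes. Rebuilding the cache from disk, or recording DIFF-synced txns in the cache, would close the gap in the model; the actual fix is whatever the forthcoming PR implements.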

This is not a problem if we fully shut down the ZkDB and reload it from disk, but the current behavior in 3.5 and 3.6 does not fully shut down the DB, which is a nice optimization that reduces the unavailable time when the snapshot is large.

Internally, we have another version of the 'Retain DB' implementation. We caught this issue with the digest feature we just upstreamed and have fixed it internally, but just realized we haven't upstreamed that fix. This is the Jira for that issue; I will send a PR for it soon.


