Posted to commits@zookeeper.apache.org by nk...@apache.org on 2019/09/10 07:53:51 UTC

[zookeeper] branch master updated: ZOOKEEPER-3124: Add the correct comment to show why we need the special logic to handle cversion and pzxid

This is an automated email from the ASF dual-hosted git repository.

nkalmar pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/zookeeper.git


The following commit(s) were added to refs/heads/master by this push:
     new 692ea8b  ZOOKEEPER-3124: Add the correct comment to show why we need the special logic to handle cversion and pzxid
692ea8b is described below

commit 692ea8bd681d741845c77040c1d991e34d2e91fe
Author: Fangmin Lyu <al...@fb.com>
AuthorDate: Tue Sep 10 09:53:43 2019 +0200

    ZOOKEEPER-3124: Add the correct comment to show why we need the special logic to handle cversion and pzxid
    
    DataTree.processTxn has special logic to handle NODEEXISTS errors from createNode. It exists to cover the case where the parent's cversion and pzxid were not updated because of a fuzzy snapshot:
    
    https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/DataTree.java#L962-L994.
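
    To make the idea concrete, here is a minimal sketch of that kind of replay handling. It is only an illustration under stated assumptions: ReplayNode, ReplaySketch and replayCreate are hypothetical names, not the real DataTree code, and the sketch assumes the parent node always exists.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a znode's child-related stat fields.
class ReplayNode {
    int cversion;
    long pzxid;
}

// Hypothetical, simplified tree keyed by path; "" is the root.
class ReplaySketch {
    final Map<String, ReplayNode> nodes = new HashMap<>();

    ReplaySketch() {
        nodes.put("", new ReplayNode());
    }

    // Replays a create txn and returns false if the node already existed
    // (the NODEEXISTS case). Assumes the parent is present in the tree.
    boolean replayCreate(String path, int parentCVersion, long zxid) {
        String parentPath = path.substring(0, path.lastIndexOf('/'));
        ReplayNode parent = nodes.get(parentPath);
        boolean created = !nodes.containsKey(path);
        if (created) {
            nodes.put(path, new ReplayNode());
        }
        // Whether or not the child was created, bring the parent's stat up
        // to the values recorded in the txn, never moving it backwards.
        if (parentCVersion > parent.cversion) {
            parent.cversion = parentCVersion;
            parent.pzxid = zxid;
        }
        return created;
    }
}
```

    If the snapshot already contains /a but the root's stat is stale, replaying the create for /a reports that the node exists and still advances the root's cversion and pzxid.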
    
    However, this does not seem to be a real issue. In the current code, when serializing a parent node we lock it and take a snapshot of its children at that time. If a child is added after the parent has been serialized to disk, the child won't be written out, so we shouldn't hit the case where the child is in the snapshot but the parent's cversion and pzxid are unchanged.
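
    The locking argument can be sketched as follows. This is a simplified model, not the actual serialization code: SketchNode and SketchSnapshot are hypothetical classes, and the real server writes each child out as its own record rather than returning an in-memory copy.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical immutable copy of one node's child list and stat fields.
final class SketchSnapshot {
    final List<String> children;
    final int cversion;
    final long pzxid;

    SketchSnapshot(List<String> children, int cversion, long pzxid) {
        this.children = children;
        this.cversion = cversion;
        this.pzxid = pzxid;
    }
}

// Hypothetical, simplified stand-in for a parent znode.
class SketchNode {
    private final Set<String> children = new HashSet<>();
    private int cversion = 0;
    private long pzxid = 0L;

    // Adds a child and updates the parent's stat under the node's lock.
    synchronized void addChild(String name, long zxid) {
        children.add(name);
        cversion++;
        pzxid = zxid;
    }

    // Copies the child list and stat under the same lock, so the copy is
    // consistent: it never holds a child without the matching stat update.
    synchronized SketchSnapshot serialize() {
        return new SketchSnapshot(new ArrayList<>(children), cversion, pzxid);
    }
}
```

    A child added after serialize() returns does not appear in the copy, and the copied cversion and pzxid match the children that were copied.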
    
    I may be missing something, though. There is not much discussion in the Jira, so I created this PR to get more attention on it.
    
    Author: Fangmin Lyu <al...@fb.com>
    
    Reviewers: Norbert Kalmar <nk...@apache.org>
    
    Closes #610 from lvfangmin/ZOOKEEPER-3124
---
 .../java/org/apache/zookeeper/server/DataTree.java   | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java b/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java
index 09a8f4b..0c4f223 100644
--- a/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java
+++ b/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java
@@ -1027,15 +1027,19 @@ public class DataTree {
         }
 
         /*
-         * Snapshots are taken lazily. It can happen that the child
-         * znodes of a parent are created after the parent
-         * is serialized. Therefore, while replaying logs during restore, a
-         * create might fail because the node was already
-         * created.
+         * Snapshots are taken lazily. When serializing a node, its data
+         * and children are copied in a synchronized block on that node,
+         * which means a newly created node won't be in the snapshot, so
+         * we won't have mismatched cversion and pzxid when replaying the
+         * createNode txn.
          *
-         * After seeing this failure, we should increment
-         * the cversion of the parent znode since the parent was serialized
-         * before its children.
+         * But there is a tricky scenario: if the child is deleted due
+         * to session close and re-created in a different global session
+         * after the parent is serialized, then during replay the
+         * closeSession txn won't delete it, because the node now belongs
+         * to a different session, and we'll get a NODEEXISTS error when
+         * replaying the createNode txn. In this case, we need to
+         * update the cversion and pzxid to the new value.
          *
          * Note, such failures on DT should be seen only during
          * restore.