Posted to dev@zookeeper.apache.org by "Kezhu Wang (Jira)" <ji...@apache.org> on 2023/05/23 04:05:00 UTC

[jira] [Created] (ZOOKEEPER-4698) Persistent watch events lost after reconnection

Kezhu Wang created ZOOKEEPER-4698:
-------------------------------------

             Summary: Persistent watch events lost after reconnection
                 Key: ZOOKEEPER-4698
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4698
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.8.1, 3.7.1
            Reporter: Kezhu Wang


I found this while replying to [apache#1950 (comment)|https://github.com/apache/zookeeper/pull/1950#issuecomment-1553742525], but it turns out to be a known issue: [apache#1106 (comment)|https://github.com/apache/zookeeper/pull/1106#issuecomment-543860329].

I think it is worth noting separately in Jira for future discussion and a potential fix. I have pushed a [test case|https://github.com/kezhuw/zookeeper/commit/31d89e9829380559066fc2b83e3d38462380c5d4] for this. It fails as expected.

{noformat}
[ERROR] Failures: 
[ERROR]   WatchEventWhenAutoResetTest.testPersistentRecursiveWatch:237 do not receive a NodeDataChanged ==> expected: not <null>
[ERROR]   WatchEventWhenAutoResetTest.testPersistentWatch:211 do not receive a NodeDataChanged ==> expected: not <null>
{noformat}

It is hard to fix this with {{DataTree}} alone. Two independent comments ([first|https://github.com/apache/zookeeper/pull/1106#issuecomment-1366449561], [second|https://github.com/kezhuw/zookeeper/commit/31d89e9829380559066fc2b83e3d38462380c5d4#diff-cfd09b7021c88da6631872e8a4a271f830162f7c5a63a140839ba029048493fdR227-R230]) pointed this out. I guess we have to walk through the txn log to deliver a correct fix.
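
To illustrate why the final tree state may not be enough, here is a toy Java model. It is a sketch under assumptions, not ZooKeeper code: {{WatchReplaySketch}}, {{Txn}}, {{changedPathsFromTree}} and {{missedEventsFromLog}} are hypothetical names, and the "tree" keeps only a last-modified zxid per path. It compares what a reconnecting client could learn from the final state with what a walk over a transaction log would yield.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are ZooKeeper classes.
final class WatchReplaySketch {

    /** One committed change, as a transaction log might record it. */
    static final class Txn {
        final long zxid;
        final String type;
        final String path;

        Txn(long zxid, String type, String path) {
            this.zxid = zxid;
            this.type = type;
            this.path = path;
        }

        @Override
        public String toString() {
            return zxid + ":" + type + ":" + path;
        }
    }

    // Final state only: path -> zxid of its last modification.
    private final Map<String, Long> tree = new HashMap<>();
    // Full history in commit order.
    private final List<Txn> log = new ArrayList<>();
    private long nextZxid = 1;

    private long commit(String type, String path) {
        long zxid = nextZxid++;
        log.add(new Txn(zxid, type, path));
        return zxid;
    }

    long create(String path) {
        long zxid = commit("NodeCreated", path);
        tree.put(path, zxid);
        return zxid;
    }

    long setData(String path) {
        if (!tree.containsKey(path)) throw new IllegalStateException("no node: " + path);
        long zxid = commit("NodeDataChanged", path);
        tree.put(path, zxid);
        return zxid;
    }

    long delete(String path) {
        if (!tree.containsKey(path)) throw new IllegalStateException("no node: " + path);
        tree.remove(path);
        return commit("NodeDeleted", path);
    }

    /** Paths under basePath that the final tree shows as modified after lastSeenZxid. */
    List<String> changedPathsFromTree(String basePath, long lastSeenZxid) {
        List<String> paths = new ArrayList<>();
        for (Map.Entry<String, Long> e : tree.entrySet()) {
            if (isUnder(e.getKey(), basePath) && e.getValue() > lastSeenZxid) {
                paths.add(e.getKey());
            }
        }
        Collections.sort(paths);
        return paths;
    }

    /** Every change under basePath committed after lastSeenZxid, in order. */
    List<Txn> missedEventsFromLog(String basePath, long lastSeenZxid) {
        List<Txn> missed = new ArrayList<>();
        for (Txn t : log) {
            if (t.zxid > lastSeenZxid && isUnder(t.path, basePath)) {
                missed.add(t);
            }
        }
        return missed;
    }

    private static boolean isUnder(String path, String basePath) {
        return path.equals(basePath) || path.startsWith(basePath + "/");
    }

    public static void main(String[] args) {
        WatchReplaySketch server = new WatchReplaySketch();
        long lastSeen = server.create("/a"); // the client saw everything up to here
        // The client is disconnected while these changes are committed.
        server.setData("/a");
        server.setData("/a");
        server.create("/a/tmp");
        server.delete("/a/tmp");
        System.out.println("from tree: " + server.changedPathsFromTree("/a", lastSeen));
        System.out.println("from log:  " + server.missedEventsFromLog("/a", lastSeen));
    }
}
```

In this model the final state only says that {{/a}} changed at some point; the second data change and the short-lived {{/a/tmp}} child leave no trace there, while the log walk returns all four changes in order.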

{quote}
Watches will not be received while disconnected from a server. When a client reconnects, any previously registered watches will be reregistered and triggered if needed. In general this all occurs transparently. There is one case where a watch may be missed: a watch for the existence of a znode not yet created will be missed if the znode is created and deleted while disconnected.
{quote}

This is [what our programmer's guide says|https://zookeeper.apache.org/doc/r3.8.1/zookeeperProgrammers.html#ch_zkWatches]. It is well known, at least to me, that we can lose some transient intermediate events across a reconnection. But with persistent watches we can lose more: in the test case above, a NodeDataChanged event is never received. This forces clients to rebuild their knowledge on reconnection.


