Posted to user@zookeeper.apache.org by Paul Chesnais <pa...@gmail.com> on 2022/11/30 21:34:45 UTC

Zxid in watches

Hello Zookeeper maintainers,

With the recent addition of persistent watches, many doors have opened up
to significantly more performant and intuitive local caches of remote
state. I implemented the new APIs in the golang ZK client
<https://github.com/go-zookeeper/zk/pull/89>, but found it very difficult to
implement a local cache without knowing which Zxid caused the watch event to
fire. This is because caching data locally requires the following steps:

1. Set the watch
2. Bootstrap the watched subtree
3. Catch up on the events that fired during the bootstrap

The issue is that it is very difficult to deduplicate events and sanely
resolve the remote state during step 3, because the client cannot tell whether
an event fired during the bootstrap or after it. For example, imagine that
between steps 1
and 2, a node /a was deleted then re-created. By the time step 3 is
executed, there will be a NodeDeleted event queued up followed by a
NodeCreated, causing at best a double read (one from the bootstrap, one
from the NodeCreated) or at worst some data inconsistencies in the local
cache.

Because there's already a Zxid field in the response header, I propose to
set it to the Zxid that triggered the watch to fire. This will allow
clients to correctly handle changes to local state and is wire-backwards
compatible with older versions of ZK clients.
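
To illustrate how a client might use that Zxid, here is a minimal sketch in
Go. It is not the go-zookeeper API: Event, Cache, Put, Handle and the Action
values are all hypothetical names, and it assumes each cached entry records
the Zxid its data reflects (for example, the larger of czxid and mzxid from
the node's Stat at read time). Events at or below that Zxid are treated as
already applied.

```go
package main

// All types below are hypothetical and exist only for this sketch.

// EventType mirrors the kinds of watch events discussed in this message.
type EventType int

const (
	NodeCreated EventType = iota
	NodeDeleted
	NodeDataChanged
)

// Event is a watch event that, as proposed, carries the Zxid of the
// transaction that caused the watch to fire.
type Event struct {
	Type EventType
	Path string
	Zxid int64
}

// Action tells the caller what the cache did or needs.
type Action int

const (
	Ignore  Action = iota // change is already reflected in the cache
	Remove                // node was dropped from the cache
	Refetch               // caller should re-read the node and call Put
)

// entry holds cached data and the Zxid that data is known to reflect.
type entry struct {
	data []byte
	zxid int64
}

// Cache is a local copy of the watched subtree.
type Cache struct {
	nodes map[string]entry
}

func NewCache() *Cache {
	return &Cache{nodes: make(map[string]entry)}
}

// Put stores data read from the server (during bootstrap or a refetch)
// together with the Zxid it reflects.
func (c *Cache) Put(path string, data []byte, zxid int64) {
	c.nodes[path] = entry{data: data, zxid: zxid}
}

// Handle applies one watch event, skipping any event whose Zxid is not
// newer than what the cached entry already reflects.
func (c *Cache) Handle(ev Event) Action {
	e, ok := c.nodes[ev.Path]
	if ok && ev.Zxid <= e.zxid {
		return Ignore // stale: fired before the cached read
	}
	if ev.Type == NodeDeleted {
		if !ok {
			return Ignore // nothing cached, nothing to remove
		}
		delete(c.nodes, ev.Path)
		return Remove
	}
	return Refetch
}
```

In the /a example above, if the bootstrap read /a after it was re-created,
the queued NodeDeleted and NodeCreated events would both carry a Zxid no
greater than the cached one, and both would be ignored. Nodes that were
absent at bootstrap time are not deduplicated by this sketch; handling them
would need additional bookkeeping.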

I have implemented an initial version of this
<https://github.com/PapaCharlie/zookeeper/compare/apache:zookeeper:master...master>.
Please let me know what you think! Once an issue is opened, I can open the
PR and get a proper review.

Cheers,

Paul Chesnais