Posted to user@zookeeper.apache.org by Jeff Jenkins <jj...@foursquare.com> on 2015/06/09 15:38:06 UTC

Overhead of Having Lots of Watches

I've got a few thousand clients connected to a ZK cluster, each retrieving a ~100KB JSON blob with some configuration. When any configuration value is updated, it causes a significant network spike as every client tries to get the new JSON blob. Multiple updates at the same time can also cause a bit of client thrashing: clients try to push their updated JSON and get rejected because some other client was pushing a different change.

There are a variety of ways to work around these issues, but it got me curious about how expensive it is to keep watches. Let's say I have ~10000 configuration values and ~1000 clients. Could I create one ZK node for each configuration value and have each client place a watch on every node (for a total of ~10M watches)? The docs[1] weren't super clear on how watches are implemented, so I can't tell how expensive they are.
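To make the scale concrete, here is the back-of-the-envelope arithmetic behind that ~10M figure. This is only a rough sketch: the 5-server ensemble is a hypothetical number, and the per-server split assumes that clients are spread evenly and that each watch is tracked by the server its client is connected to, neither of which I've verified.

```python
def total_watches(num_values: int, num_clients: int) -> int:
    """Every client watches every configuration node."""
    return num_values * num_clients


def watches_per_server(total: int, num_servers: int) -> int:
    """Ceiling division: assumes clients are spread evenly across servers."""
    return -(-total // num_servers)


NUM_VALUES = 10_000   # ~10000 configuration values (from the scenario above)
NUM_CLIENTS = 1_000   # ~1000 clients (from the scenario above)
NUM_SERVERS = 5       # hypothetical ensemble size, an assumption

total = total_watches(NUM_VALUES, NUM_CLIENTS)
per_server = watches_per_server(total, NUM_SERVERS)

print(f"total watches: {total:,}")
print(f"watches per server: {per_server:,}")
```

The memory cost per watch is exactly the part I don't know, so I've left it out rather than guess.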

Using this method would make my updates MUCH cheaper because I'd be updating a trivially small node and sending that tiny amount of data to the same number of clients. What I'm not sure about is whether the steady state of so many watches is going to overload the ZK cluster. I'm also curious whether adding observer nodes to the cluster would be a good idea in this scenario.
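As a rough illustration of why the per-value updates should be cheaper, this compares the data fanned out per update under the two layouts. It's a sketch only: the 100-byte size for a single configuration node is a hypothetical figure, and it ignores protocol overhead and the watch notifications themselves.

```python
def fanout_bytes(payload_bytes: int, num_clients: int) -> int:
    """Bytes sent in total when every client re-reads one updated node."""
    return payload_bytes * num_clients


BLOB_BYTES = 100 * 1000    # the ~100KB JSON blob
SMALL_NODE_BYTES = 100     # hypothetical size of one configuration value
NUM_CLIENTS = 1_000        # ~1000 clients

blob_fanout = fanout_bytes(BLOB_BYTES, NUM_CLIENTS)
small_fanout = fanout_bytes(SMALL_NODE_BYTES, NUM_CLIENTS)

print(f"single blob:    {blob_fanout / 1e6:.1f} MB per update")
print(f"per-value node: {small_fanout / 1e3:.1f} KB per update")
print(f"reduction:      {blob_fanout // small_fanout}x")
```

Under those assumptions each update moves roughly 100 MB with the single blob versus roughly 100 KB with per-value nodes.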

Any help would be appreciated!

-Jeff

[1] http://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#ch_zkWatches