Posted to user@zookeeper.apache.org by Eugene Dzhurinsky <jd...@gmail.com> on 2013/09/25 02:38:19 UTC

Holding hundreds of thousands of nodes in a cluster with high throughput

Hello!

I am thinking of implementing a distributed queue and a distributed map using
ZooKeeper.

I need to store roughly 50-100K nodes in ZooKeeper while they are being
processed in a cluster. Most operations would be:

- check if node exists
- create node if it does not exist
- update node data (atomically)
- rename node from one subtree to another one

The nodes would be queried and updated concurrently by 5 processes with 10
threads each, so 50-100 connected clients and an estimated 100-500 requests
per second.
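
The operations listed above can be sketched as a small in-memory model. This is
only a toy stand-in, not a ZooKeeper client: the class ToyTree, the exception
BadVersion, the helper increment and the paths /pending/job-1 and /done/job-1
are all hypothetical names made up for illustration. The model assumes
ZooKeeper-like semantics: every node carries a version, an update succeeds only
when the caller's expected version still matches, and a "rename" is emulated as
a delete plus a create performed in one atomic step, since ZooKeeper has no
rename primitive.

```python
import threading


class BadVersion(Exception):
    """Raised when the caller's expected version is stale."""


class ToyTree:
    """In-memory stand-in for a tree of versioned nodes (not a ZooKeeper client)."""

    def __init__(self):
        self._nodes = {}  # path -> (data, version)
        self._lock = threading.Lock()

    def exists(self, path):
        with self._lock:
            return path in self._nodes

    def create(self, path, data):
        """Create the node; return False if it already exists."""
        with self._lock:
            if path in self._nodes:
                return False
            self._nodes[path] = (data, 0)
            return True

    def get(self, path):
        """Return the (data, version) pair stored at path."""
        with self._lock:
            return self._nodes[path]

    def set_data(self, path, data, expected_version):
        """Version-checked update: succeeds only if nobody wrote in between."""
        with self._lock:
            _, version = self._nodes[path]
            if version != expected_version:
                raise BadVersion(path)
            self._nodes[path] = (data, version + 1)
            return version + 1

    def move(self, src, dst):
        """Emulated rename: remove src and create dst as one atomic step."""
        with self._lock:
            if src not in self._nodes or dst in self._nodes:
                return False
            data, _ = self._nodes.pop(src)
            self._nodes[dst] = (data, 0)
            return True


def increment(tree, path):
    """Atomic read-modify-write built from get + version-checked set_data."""
    while True:
        data, version = tree.get(path)
        try:
            tree.set_data(path, data + 1, version)
            return
        except BadVersion:
            continue  # another client won the race; re-read and retry


def worker(tree, path, times):
    for _ in range(times):
        increment(tree, path)


tree = ToyTree()
tree.create("/pending/job-1", 0)

# 10 threads, each performing 100 increments on the same node.
threads = [
    threading.Thread(target=worker, args=(tree, "/pending/job-1", 100))
    for _ in range(10)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

tree.move("/pending/job-1", "/done/job-1")
```

The retry loop in increment is the part that matters under contention: with
many clients updating the same node, stale-version failures and retries become
more frequent, so hot nodes cost more requests than the nominal rate suggests.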

I am concerned about network traffic in the cluster and memory usage on each
server (since each one will hold an entire replica of the data).
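
A rough back-of-the-envelope estimate for both concerns might look like the
sketch below. The node count (100K) and request rate (500 per second) are the
upper bounds given above, with every request pessimistically treated as a
write. Everything else is an assumption made for illustration: a 1 KiB payload
per node, 256 bytes of per-node bookkeeping overhead, and a 3-server ensemble
in which the leader forwards each write to the other two servers. The function
name estimate is hypothetical, and protocol overhead, snapshots and transaction
logs are ignored.

```python
def estimate(nodes, payload_bytes, overhead_bytes, writes_per_sec, servers):
    """Return (RAM bytes per server, bytes/s the leader sends to followers)."""
    # Every server holds the full data set in memory.
    ram_per_server = nodes * (payload_bytes + overhead_bytes)
    # Each write payload is forwarded by the leader to every other server.
    replication_bytes_per_sec = writes_per_sec * payload_bytes * (servers - 1)
    return ram_per_server, replication_bytes_per_sec


ram, traffic = estimate(
    nodes=100_000,        # upper bound from the post
    payload_bytes=1024,   # assumed average payload
    overhead_bytes=256,   # assumed per-node overhead
    writes_per_sec=500,   # upper bound from the post, all treated as writes
    servers=3,            # assumed ensemble size
)
print(f"{ram / 2**20:.0f} MiB per server, {traffic / 2**20:.2f} MiB/s replication")
```

Under these assumptions both figures are small for current hardware; the result
scales linearly with the assumed payload size, so larger payloads change the
picture proportionally.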

Please advise whether ZooKeeper is the right library for that.

Thanks!

-- 
Eugene N Dzhurinsky