Posted to user@curator.apache.org by Rodrigo Nogueira <ra...@gmail.com> on 2015/01/12 19:58:30 UTC

Node removed and added after the leader zk node went down

Hi all,

I am developing an application that uses curator-x-discovery to register
nodes in ZooKeeper and remove them from it.

My application creates a ServiceInstance with ServiceType.PERMANENT and
stores some basic info in the payload of each node. The ServiceDiscovery
then persists the node and its payload.
(pretty much like this example:
https://git-wip-us.apache.org/repos/asf?p=curator.git;a=blob;f=curator-examples/src/main/java/discovery/ExampleServer.java;h=96478f5fb079b2ab80d6f131a2e17d3e5c495b67;hb=HEAD
)

I am using a zk ensemble with 3 nodes. The nodes are on the same machine;
the initLimit is 5, the syncLimit is 2 and the tickTime is 2000.
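
A zoo.cfg matching those settings might look roughly like the sketch below.
Only tickTime, initLimit and syncLimit come from the setup described here;
the dataDir, clientPort and server ports are placeholder assumptions, and
each of the 3 nodes on the same machine would need its own dataDir and
clientPort.

```
# tickTime is in milliseconds; initLimit and syncLimit are counted in ticks
tickTime=2000
# 5 ticks * 2000 ms = 10 s for followers to connect and sync to the leader
initLimit=5
# 2 ticks * 2000 ms = 4 s a follower may lag behind the leader
syncLimit=2

# Placeholder values (assumptions), different for each local node
dataDir=/tmp/zookeeper/node1
clientPort=2181

# Placeholder quorum and election ports (assumptions); all nodes on localhost
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
```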

The problem occurs when the leader zk node fails.
When I kill the leader zk node, all the nodes registered by the
ServiceDiscovery are removed from ZooKeeper and then added again. The nodes
are able to reestablish the connection to ZooKeeper without losing any data,
but the creation time (cTime) of each node changes.

Is this the expected behavior?

I expected that if the leader zk node failed, a leader election would start
and elect a new leader for the quorum (now with only 2 nodes), without
removing and re-adding the previously registered nodes (and therefore
without changing their cTime).

I also have some PathChildrenCacheListener instances registered on the paths
where the nodes are created, and when I kill the leader zk node a
CHILD_REMOVED event is triggered. I think this happens because the nodes are
removed from ZooKeeper and then added again.

I would like to prevent the nodes from being removed from ZooKeeper when
the leader zk node fails. That way, I think the PathChildrenCacheListener
will not trigger any undesired events.

I hope I have been clear enough describing my problem.

Thanks,
Rodrigo Nogueira