Posted to dev@kafka.apache.org by 涂扬 <tu...@meituan.com> on 2016/10/11 10:24:25 UTC

Can a broker recover to alive when a zk callback is missed?

Hi,
	We hit an issue where a broker's ephemeral (temporary) node in ZooKeeper was lost when the network between the broker and the zk cluster was unreliable, while the broker process itself was still running. As we know, the Kafka controller then considers that broker to be offline. After enabling the zkClient log, we can see the connection state between the broker and the zk cluster change from disconnected to connected, but the newSession callback is never called. So this broker cannot become alive again unless it is restarted.
	So we have decided to add a heartbeat mechanism at the application layer between the client and the broker, separate from the zkClient heartbeat. When we detect that the broker's ephemeral node is no longer at its zk path, can we immediately re-register the broker in zk? Or is there another way to solve this?
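
The check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: BrokerRegistrationWatchdog, EphemeralStore and InMemoryStore are hypothetical names, not Kafka or zkClient APIs, and the in-memory store stands in for a real ZooKeeper client so that the sketch runs on its own.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: every name here is hypothetical; none of it is Kafka or zkClient API.
class BrokerRegistrationWatchdog {

    /** Minimal view of the coordination service; a real version would wrap a ZooKeeper client. */
    interface EphemeralStore {
        boolean exists(String path);

        /** Returns false if the node already exists, e.g. the old session has not expired yet. */
        boolean createEphemeral(String path, String data);
    }

    /** In-memory stand-in so the sketch runs without a ZooKeeper cluster. */
    static class InMemoryStore implements EphemeralStore {
        private final Map<String, String> nodes = new ConcurrentHashMap<>();

        @Override
        public boolean exists(String path) {
            return nodes.containsKey(path);
        }

        @Override
        public boolean createEphemeral(String path, String data) {
            // putIfAbsent returns null only when no node was there before.
            return nodes.putIfAbsent(path, data) == null;
        }

        /** Simulates the ephemeral node disappearing after a session expiry. */
        void expire(String path) {
            nodes.remove(path);
        }
    }

    private final EphemeralStore store;
    private final String path;
    private final String data;

    BrokerRegistrationWatchdog(EphemeralStore store, String path, String data) {
        this.store = store;
        this.path = path;
        this.data = data;
    }

    /** One heartbeat tick: returns true if the node was missing and has been re-created. */
    boolean checkAndRepair() {
        if (store.exists(path)) {
            return false; // still registered, nothing to do
        }
        return store.createEphemeral(path, data);
    }
}
```

A periodic task (for example one scheduled with a ScheduledExecutorService) would call checkAndRepair() at a fixed interval. Because the check polls instead of relying on a callback, a missed callback only delays recovery by at most one interval. If the create fails because a stale node is still present, the next tick simply tries again.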
	The main problem is that a watch callback can be missed. How can we solve that?
Thanks.