Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2018/11/20 18:39:03 UTC

[GitHub] kaijianding opened a new issue #6647: huge number of watch in zookeeper cause zookeeper full gc

kaijianding opened a new issue #6647: huge number of watch in zookeeper cause zookeeper full gc
URL: https://github.com/apache/incubator-druid/issues/6647
 
 
   I noticed there are about 100M watches in ZooKeeper in my production environment. When I restart some of the realtime tasks, ZooKeeper can go into a full GC, even though it is configured with a 32GB heap and Druid is configured to use the HTTP server view.
   
   After investigation, I found that this issue is caused by the Announcer.java implementation.
   The Announcer tries its best to cope with temporary disconnections from the ZooKeeper server, and it watches every child path under the specified base path. But for base paths like /druid/prod/announcements and /druid/prod/listeners/lookups/__default/, there are many hosts as child paths under them.
   
   If there are 10000 realtime tasks (not 10000 separate realtime jobs, but something like 50 jobs), then each task creates 2 * 10000 watches, and in total we end up with 200M watches!
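
   The arithmetic above can be written out as a small sketch. It assumes that every task has exactly one child node under each of the two base paths and that every task watches every child. The class and method names are made up for illustration and are not part of Druid or Curator.

```java
// Back-of-the-envelope estimate of ZooKeeper watch counts.
// Assumption: each task owns one child node under every watched base path,
// and every task watches all children of every base path.
final class WatchEstimate {

    // Watches registered by a single task: one per child under each base path.
    static long watchesPerTask(long tasks, long basePaths) {
        return basePaths * tasks;
    }

    // Watches across the whole cluster when every task watches every child.
    static long totalWatches(long tasks, long basePaths) {
        return tasks * watchesPerTask(tasks, basePaths);
    }

    // Hypothetical alternative: each task watches only its own leaf
    // under each base path.
    static long totalWatchesLeafOnly(long tasks, long basePaths) {
        return tasks * basePaths;
    }

    public static void main(String[] args) {
        long tasks = 10_000;   // realtime tasks
        long basePaths = 2;    // announcements and lookup listeners

        System.out.println(watchesPerTask(tasks, basePaths));       // 20000
        System.out.println(totalWatches(tasks, basePaths));         // 200000000
        System.out.println(totalWatchesLeafOnly(tasks, basePaths)); // 20000
    }
}
```

   Under these assumptions the watch count grows with the square of the number of tasks, while a leaf-only scheme would grow linearly.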
   
   I'm not familiar with Curator. Can Curator watch only a particular leaf path? If so, we could implement a simpler Announcer for these two paths to reduce the number of watches in ZooKeeper and avoid a ZooKeeper OOM. @gianm

