Posted to user@storm.apache.org by Eric Allen <er...@adroll.com> on 2014/11/23 04:39:21 UTC

Are there adverse effects from increasing task.heartbeat.frequency.secs?

I'm responsible for a topology of 12 EC2 instances, running a total of
~2500 executors across 81 workers. Recently we increased the number of
executors, and the Zookeeper instance dedicated to this Storm cluster has
started falling over because its small disk is exhausted by logs. This is,
of course, tractable by increasing the disk space available to Zookeeper,
but I'd like to see if we can find a cleaner solution. We're already
cleaning logs hourly to the standard minimum of 3 snapshots, but it's not
enough.

What are the adverse effects, if any, of increasing
task.heartbeat.frequency.secs from the default value of 3? Based on my
reading of the Storm source, increasing it should linearly reduce the rate
of setData events to Zookeeper, and in turn the rate of accumulation of
logs on disk. Are there timeouts we need to be careful of violating by
reducing the frequency of heartbeats from executors?
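For concreteness, here's a sketch of the storm.yaml change we're weighing; the 10-second value is only an example, not a tested recommendation, and the timeout named in the comment is our assumption about which settings interact with it:

```yaml
# storm.yaml (cluster-wide). Default is 3 seconds; raising it should
# proportionally cut the rate of heartbeat setData calls to ZooKeeper.
task.heartbeat.frequency.secs: 10

# Assumption: timeouts that depend on seeing fresh executor heartbeats,
# e.g. nimbus.task.timeout.secs (default 30), presumably need to stay
# comfortably above the new heartbeat interval or Nimbus may treat
# healthy executors as dead and reassign them.
```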


-- 
Eric Allen
Software Engineer | www.adroll.com | 408.228.7180
