Posted to user@storm.apache.org by Eric Ruel <er...@wantedanalytics.com> on 2015/07/22 16:43:07 UTC

worker dies after a few minutes

Hello


The workers in my topology die after one or two minutes.


I tried changing the heartbeat configuration, and running in both cluster and local mode, but they always die.


Any ideas?


10:38:38.019 ERROR backtype.storm.daemon.worker - Error when processing event

java.lang.RuntimeException: org.apache.storm.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /workerbeats/testeric-1-1437575782/259f61ae-02a5-4a75-be50-68f27054a7b2-1024

at backtype.storm.util$wrap_in_runtime.invoke(util.clj:44) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.zookeeper$set_data.invoke(zookeeper.clj:173) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.cluster$mk_distributed_cluster_state$reify__1919.set_data(cluster.clj:92) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.cluster$mk_storm_cluster_state$reify__2376.worker_heartbeat_BANG_(cluster.clj:332) ~[storm-core-0.9.6.jar:0.9.6]

at sun.reflect.GeneratedMethodAccessor135.invoke(Unknown Source) ~[na:na]

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_71]

at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_71]

at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) ~[clojure-1.5.1.jar:na]

at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28) ~[clojure-1.5.1.jar:na]

at backtype.storm.daemon.worker$do_executor_heartbeats.doInvoke(worker.clj:56) ~[storm-core-0.9.6.jar:0.9.6]

at clojure.lang.RestFn.invoke(RestFn.java:439) ~[clojure-1.5.1.jar:na]

at backtype.storm.daemon.worker$fn__3757$exec_fn__1163__auto____3758$fn__3761.invoke(worker.clj:413) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.timer$schedule_recurring$this__1704.invoke(timer.clj:99) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.timer$mk_timer$fn__1687$fn__1688.invoke(timer.clj:50) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.timer$mk_timer$fn__1687.invoke(timer.clj:42) [storm-core-0.9.6.jar:0.9.6]

at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]

at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]

Caused by: org.apache.storm.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /workerbeats/testeric-1-1437575782/259f61ae-02a5-4a75-be50-68f27054a7b2-1024

at org.apache.storm.zookeeper.KeeperException.create(KeeperException.java:99) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.zookeeper.KeeperException.create(KeeperException.java:51) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.zookeeper.ZooKeeper.setData(ZooKeeper.java:1270) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl$4.call(SetDataBuilderImpl.java:260) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl$4.call(SetDataBuilderImpl.java:256) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.RetryLoop.callWithRetry(RetryLoop.java:107) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:252) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:239) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:39) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.zookeeper$set_data.invoke(zookeeper.clj:172) ~[storm-core-0.9.6.jar:0.9.6]

... 15 common frames omitted

10:38:38.023 ERROR backtype.storm.util - Halting process: ("Error when processing an event")

java.lang.RuntimeException: ("Error when processing an event")

at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:325) [storm-core-0.9.6.jar:0.9.6]

at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.5.1.jar:na]

at backtype.storm.daemon.worker$mk_halting_timer$fn__3572.invoke(worker.clj:177) [storm-core-0.9.6.jar:0.9.6]

at backtype.storm.timer$mk_timer$fn__1687$fn__1688.invoke(timer.clj:68) [storm-core-0.9.6.jar:0.9.6]

at backtype.storm.timer$mk_timer$fn__1687.invoke(timer.clj:42) [storm-core-0.9.6.jar:0.9.6]

at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]

at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]


RE: worker dies after a few minutes

Posted by Eric Ruel <er...@wantedanalytics.com>.
Finally, I am able to increase the parallelism and keep my workers alive.

In fact, the problem was not the parallelism but the number of tasks...

I had to set the following JVM parameter on both ZooKeeper and Nimbus:
-Djute.maxbuffer=33554432

A lower value would probably work too.

Nimbus was disallowing my worker because of an "unreasonable length" error, but with -Djute.maxbuffer it works well.
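For reference, a sketch of where that flag can live. The 32 MB value mirrors the one above; note that jute.maxbuffer generally has to be raised on both the ZooKeeper servers and the client JVMs, so the exact mechanism depends on your deployment:

```yaml
# storm.yaml on the Nimbus host: raise the client-side ZooKeeper
# packet limit so large heartbeat znodes are not rejected.
nimbus.childopts: "-Xmx1024m -Djute.maxbuffer=33554432"

# On the ZooKeeper servers, the same flag goes into the server JVM,
# e.g. via conf/zookeeper-env.sh in a stock ZooKeeper install:
#   export JVMFLAGS="-Djute.maxbuffer=33554432"
```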

________________________________
From: Harsha <st...@harsha.io>
Sent: July 24, 2015, 22:11
To: user@storm.apache.org
Subject: Re: worker dies after a few minutes

You can try increasing supervisor.worker.timeout.secs. At a basic level, your parallelism depends on the number of CPUs, since you are increasing the number of threads executing in the JVM. You probably want to increase the JVM memory as well.


On Fri, Jul 24, 2015, at 05:58 AM, Eric Ruel wrote:



Is there a limit on the number of bolts/threads we can have within a single worker?

Our topology has almost 140 bolts, including those created by Trident, and if I increase the parallelism, the worker dies.

Is it caused by the process that checks whether every task is alive taking too much time to complete the whole loop, or something like that?

Are there any values in storm.yaml I should modify to support that number of threads?
________________________________

From: Eric Ruel <er...@wantedanalytics.com>
Sent: July 23, 2015, 13:57
To: user@storm.apache.org
Subject: RE: worker dies after a few minutes




Originally, we had multiple single-worker topologies with a maxSpoutPending of 3, a parallelismHint between 1 and 20 depending on the bolt, a batch size of 300, and a maxTask of 90.


By reducing the batch size to 2 records, I saw that my workers always died after about ~45 seconds... so I did many trials, changing the values of all the parameters.


I still don't totally understand the difference between maxTask and the parallelism hint.



I only remember that originally, we preferred to reduce the parallelismHint to a minimum to avoid the problem caused by https://issues.apache.org/jira/browse/STORM-503


Note that we use Trident, so if I count all the bolts I can see under the "Bolt (All time)" section in the Storm UI, I have 137 bolts (including all merges, joins, projects...).



Eric
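An aside on the maxTask / parallelism-hint question above: in Storm, the parallelism hint sets the number of executors (threads), while setNumTasks fixes the number of task instances, which Storm then spreads as evenly as possible across those executors. A minimal sketch of that split, outside Storm itself (the helper name and the exact even-split grouping are mine, for illustration):

```java
import java.util.ArrayList;
import java.util.List;

public class ExecutorsVsTasks {
    // Split `numTasks` task ids into `numExecutors` contiguous groups
    // whose sizes differ by at most one, roughly how Storm spreads
    // a component's tasks over its executors.
    static List<List<Integer>> assign(int numTasks, int numExecutors) {
        List<List<Integer>> executors = new ArrayList<>();
        int base = numTasks / numExecutors;
        int extra = numTasks % numExecutors;
        int t = 0;
        for (int e = 0; e < numExecutors; e++) {
            int size = base + (e < extra ? 1 : 0);
            List<Integer> tasks = new ArrayList<>();
            for (int i = 0; i < size; i++) tasks.add(t++);
            executors.add(tasks);
        }
        return executors;
    }

    public static void main(String[] args) {
        // e.g. setBolt("b", bolt, 20).setNumTasks(90):
        // 20 threads, each running 4 or 5 of the 90 task instances.
        List<List<Integer>> exec = assign(90, 20);
        System.out.println(exec.size());          // 20 executors
        System.out.println(exec.get(0).size());   // 5 tasks on the first
        System.out.println(exec.get(19).size());  // 4 tasks on the last
    }
}
```

So the parallelism hint controls thread count, while numTasks is fixed at submission time; a rebalance can later change the number of executors, but never the number of tasks.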



________________________________

From: Harsha <st...@harsha.io>
Sent: July 23, 2015, 11:14
To: user@storm.apache.org
Subject: Re: worker dies after a few minutes

Thanks for the update, Eric. Could you describe how you found that a too-high maxTask was causing this issue? We are trying to improve the debugging of Storm topologies, and this will be helpful for us.
Thanks,
Harsha

On Thu, Jul 23, 2015, at 07:36 AM, Eric Ruel wrote:


Finally, the problem was caused by a maxTask that was too high.


________________________________

From: Harsha <st...@harsha.io>
Sent: July 22, 2015, 10:56
To: user@storm.apache.org
Subject: Re: worker dies after a few minutes

What does your topology code look like? Are you throwing any errors from a bolt's execute method? It does look like there is a RuntimeException happening:
"
Error when processing event
java.lang.RuntimeException:
"
It's up to the user to catch any exception and log it or do something with it, instead of throwing it back to the worker JVM.

-Harsha


RE: worker dies after a few minutes

Posted by Eric Ruel <er...@wantedanalytics.com>.
It does not work.

What I see during an strace is:


[pid 35275] open("/data/storm/workers/06f5e71e-ea3f-4ffc-93e5-5263dd12920d/pids/35274", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 ENOENT (No such file or directory)


[pid 35275] write(59, "2015-07-30T14:17:26.297-0400 b.s.d.worker [ERROR] Error on initialization of server mk-worker\njava.io.IOException: No such file or directory\n\tat java.io.UnixFileSystem.createFileExclusively(Native Method) ~[na:1.7.0_67]\n\tat java.io.File.createNewFile(File.java:1006) ~[na:1.7.0_67]\n\tat backtype.storm.util$touch.invoke(util.clj:525) ~[storm-core-0.9.6.jar:0.9.6]\n\tat backtype.storm.daemon.worker$fn__3757$exec_fn__1163__auto____3758.invoke(worker.clj:401) ~[storm-core-0.9.6.jar:0.9.6]\n\tat clojure.lang.AFn.applyToHelper(AFn.java:185) [clojure-1.5.1.jar:na]\n\tat clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]\n\tat clojure.core$apply.invoke(core.clj:617) ~[clojure-1.5.1.jar:na]\n\tat backtype.storm.daemon.worker$fn__3757$mk_worker__3813.doInvoke(worker.clj:393) [storm-core-0.9.6.jar:0.9.6]\n\tat clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.5.1.jar:na]\n\tat backtype.storm.daemon.worker$_main.invoke(worker.clj:504) [storm-core-0.9.6.jar:0.9.6]\n\tat clojure.lang.AFn.applyToHelper"..., 1187 <unfinished ...>


[pid 35275] write(59, "2015-07-30T14:17:26.305-0400 b.s.util [ERROR] Halting process: (\"Error on initialization\")\njava.lang.RuntimeException: (\"Error on initialization\")\n\tat backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:325) [storm-core-0.9.6.jar:0.9.6]\n\tat clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.5.1.jar:na]\n\tat backtype.storm.daemon.worker$fn__3757$mk_worker__3813.doInvoke(worker.clj:393) [storm-core-0.9.6.jar:0.9.6]\n\tat clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.5.1.jar:na]\n\tat backtype.storm.daemon.worker$_main.invoke(worker.clj:504) [storm-core-0.9.6.jar:0.9.6]\n\tat clojure.lang.AFn.applyToHelper(AFn.java:172) [clojure-1.5.1.jar:na]\n\tat clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]\n\tat backtype.storm.daemon.worker.main(Unknown Source) [storm-core-0.9.6.jar:0.9.6]\n", 808) = 808



I have a single topology with 1 worker, 368 executors, and 386 tasks, with Trident.


Is my problem similar to https://issues.apache.org/jira/browse/STORM-885?
Do we have an idea of when it will be completed?





Re: worker dies after a few minutes

Posted by Harsha <st...@harsha.io>.
You can try increasing supervisor.worker.timeout.secs. At a basic level, your parallelism depends on the number of CPUs, since you are increasing the number of threads executing in the JVM. You probably want to increase the JVM memory as well.
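For what it's worth, both knobs mentioned here are plain storm.yaml settings; a sketch with illustrative values (not recommendations):

```yaml
# Give workers longer to heartbeat before the supervisor restarts them
# (the default is 30 seconds).
supervisor.worker.timeout.secs: 60

# Raise the heap of every worker JVM.
worker.childopts: "-Xmx2048m"
```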



RE: worker dies after a few minutes

Posted by Eric Ruel <er...@wantedanalytics.com>.

Is there a limit on the number of bolts/threads we can have within a single worker?

Our topology has almost 140 bolts, including those created by Trident, and if I increase the parallelism, the worker dies.

Is it caused by the process that checks whether every task is alive taking too much time to complete the whole loop, or something like that?

Are there any values in storm.yaml I should modify to support that number of threads?
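One pointer, given how this thread was eventually resolved: the worker's heartbeat znode under /workerbeats carries stats for every executor, so its size grows with executor count, and ZooKeeper rejects any packet bigger than jute.maxbuffer (1048575 bytes by default) with an unreasonable-length error. A back-of-the-envelope sketch; the ~3 KB of serialized stats per executor is an invented illustrative figure, not a measured one:

```java
public class HeartbeatSize {
    // ZooKeeper's default jute.maxbuffer: 0xfffff bytes (~1 MB).
    static final int JUTE_MAX_BUFFER = 0xfffff;

    // Rough heartbeat payload: fixed overhead plus per-executor stats.
    static int estimateBytes(int executors, int bytesPerExecutor) {
        return 1024 + executors * bytesPerExecutor;
    }

    public static void main(String[] args) {
        // With 368 executors in one worker (as later in this thread), even
        // a few kilobytes of stats per executor overruns the default limit.
        int est = estimateBytes(368, 3000);
        System.out.println(est > JUTE_MAX_BUFFER); // true: 1105024 > 1048575
    }
}
```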

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl$4.call(SetDataBuilderImpl.java:260) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl$4.call(SetDataBuilderImpl.java:256) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.RetryLoop.callWithRetry(RetryLoop.java:107) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:252) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:239) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:39) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.zookeeper$set_data.invoke(zookeeper.clj:172) ~[storm-core-0.9.6.jar:0.9.6]

... 15 common frames omitted

10:38:38.023 ERROR backtype.storm.util - Halting process: ("Error when processing an event")

java.lang.RuntimeException: ("Error when processing an event")

at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:325) [storm-core-0.9.6.jar:0.9.6]

at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.5.1.jar:na]

at backtype.storm.daemon.worker$mk_halting_timer$fn__3572.invoke(worker.clj:177) [storm-core-0.9.6.jar:0.9.6]

at backtype.storm.timer$mk_timer$fn__1687$fn__1688.invoke(timer.clj:68) [storm-core-0.9.6.jar:0.9.6]

at backtype.storm.timer$mk_timer$fn__1687.invoke(timer.clj:42) [storm-core-0.9.6.jar:0.9.6]

at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]

at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]





RE: worker dies after view minutes

Posted by Eric Ruel <er...@wantedanalytics.com>.

Originally, we had multiple single-worker topologies with a maxSpoutPending of 3, a parallelismHint between 1 and 20 depending on the bolt, a batch size of 300, and a maxTask of 90


I started by reducing the batch size to 2 records and saw that my workers always died after about ~45 seconds... so I just made many attempts, changing the values of all the parameters


I still don't totally understand the difference between maxTask and the parallelism hint
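If I read the docs correctly, the relationship is roughly this: parallelismHint asks for a number of executors for a component, while topology.max.task.parallelism (what we set through setMaxTaskParallelism) caps that number for every component. An illustrative sketch (plain Python with made-up names, not Storm's actual scheduler code):

```python
# Illustrative only: how a parallelism hint and a max-task cap might combine.
# Names here are made up; Storm's real scheduling is more involved.
def effective_parallelism(parallelism_hint, max_task_parallelism=None):
    """Cap a component's requested parallelism at the topology-wide maximum."""
    if max_task_parallelism is None:
        return parallelism_hint
    return min(parallelism_hint, max_task_parallelism)

# With our settings (hints of 1-20, cap of 90) the cap never bites:
print(effective_parallelism(20, 90))  # -> 20
print(effective_parallelism(20, 4))   # -> 4
```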



I only remember that originally we preferred to reduce the parallelismHint to a minimum, to avoid the problem caused by https://issues.apache.org/jira/browse/STORM-503


note that we use Trident, so if I count all the bolts I can see under the "Bolt (All time)" section in the Storm UI, I have 137 bolts (including all merges, joins, projects...)
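One plausible mechanism for the crash, as a back-of-the-envelope calculation (the per-task byte count is purely an assumption): each worker heartbeat writes per-task stats to the /workerbeats znode, and with enough tasks that payload can exceed ZooKeeper's default ~1 MB packet limit (jute.maxbuffer), which surfaces as ConnectionLoss:

```python
# Back-of-the-envelope: worst-case heartbeat payload vs. ZooKeeper's default limit.
# bytes_per_task is an assumed figure, purely for illustration.
components = 137          # bolts shown under "Bolt (All time)" in the UI
tasks_per_component = 90  # worst case if task counts approach the maxTask cap
bytes_per_task = 200      # assumed serialized stats per task

heartbeat_bytes = components * tasks_per_component * bytes_per_task
jute_maxbuffer = 1 * 1024 * 1024  # ZooKeeper's default packet limit (~1 MB)

print(heartbeat_bytes)                  # 2466000
print(heartbeat_bytes > jute_maxbuffer) # True
```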



Eric


________________________________
From: Harsha <st...@harsha.io>
Sent: 23 July 2015 11:14
To: user@storm.apache.org
Subject: Re: worker dies after view minutes

Thanks for the update, Eric. Could you describe how you found that a too-high maxTask was causing this issue? We are trying to improve the debugging of Storm topologies, and this will be helpful for us.
Thanks,
Harsha

On Thu, Jul 23, 2015, at 07:36 AM, Eric Ruel wrote:


in the end, the problem was caused by a maxTask value that was too high


________________________________

From: Harsha <st...@harsha.io>
Sent: 22 July 2015 10:56
To: user@storm.apache.org
Subject: Re: worker dies after view minutes

What does your topology code look like? Are you throwing any errors from the bolt's execute method? It does look like there is a RuntimeException happening:
"
Error when processing event
java.lang.RuntimeException:
"
It is up to the user to catch any exception and log it or handle it, instead of throwing it back to the worker JVM

-Harsha


On Wed, Jul 22, 2015, at 07:43 AM, Eric Ruel wrote:

Hello


the workers in my topology dies after 1,2 minutes


I tried to change the config about the heartbeat, cluster or local mode, but they always die


any idea?


10:38:38.019 ERROR backtype.storm.daemon.worker - Error when processing event

java.lang.RuntimeException: org.apache.storm.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /workerbeats/testeric-1-1437575782/259f61ae-02a5-4a75-be50-68f27054a7b2-1024

at backtype.storm.util$wrap_in_runtime.invoke(util.clj:44) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.zookeeper$set_data.invoke(zookeeper.clj:173) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.cluster$mk_distributed_cluster_state$reify__1919.set_data(cluster.clj:92) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.cluster$mk_storm_cluster_state$reify__2376.worker_heartbeat_BANG_(cluster.clj:332) ~[storm-core-0.9.6.jar:0.9.6]

at sun.reflect.GeneratedMethodAccessor135.invoke(Unknown Source) ~[na:na]

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_71]

at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_71]

at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) ~[clojure-1.5.1.jar:na]

at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28) ~[clojure-1.5.1.jar:na]

at backtype.storm.daemon.worker$do_executor_heartbeats.doInvoke(worker.clj:56) ~[storm-core-0.9.6.jar:0.9.6]

at clojure.lang.RestFn.invoke(RestFn.java:439) ~[clojure-1.5.1.jar:na]

at backtype.storm.daemon.worker$fn__3757$exec_fn__1163__auto____3758$fn__3761.invoke(worker.clj:413) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.timer$schedule_recurring$this__1704.invoke(timer.clj:99) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.timer$mk_timer$fn__1687$fn__1688.invoke(timer.clj:50) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.timer$mk_timer$fn__1687.invoke(timer.clj:42) [storm-core-0.9.6.jar:0.9.6]

at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]

at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]

Caused by: org.apache.storm.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /workerbeats/testeric-1-1437575782/259f61ae-02a5-4a75-be50-68f27054a7b2-1024

at org.apache.storm.zookeeper.KeeperException.create(KeeperException.java:99) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.zookeeper.KeeperException.create(KeeperException.java:51) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.zookeeper.ZooKeeper.setData(ZooKeeper.java:1270) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl$4.call(SetDataBuilderImpl.java:260) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl$4.call(SetDataBuilderImpl.java:256) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.RetryLoop.callWithRetry(RetryLoop.java:107) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:252) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:239) ~[storm-core-0.9.6.jar:0.9.6]

at org.apache.storm.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:39) ~[storm-core-0.9.6.jar:0.9.6]

at backtype.storm.zookeeper$set_data.invoke(zookeeper.clj:172) ~[storm-core-0.9.6.jar:0.9.6]

... 15 common frames omitted

10:38:38.023 ERROR backtype.storm.util - Halting process: ("Error when processing an event")

java.lang.RuntimeException: ("Error when processing an event")

at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:325) [storm-core-0.9.6.jar:0.9.6]

at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.5.1.jar:na]

at backtype.storm.daemon.worker$mk_halting_timer$fn__3572.invoke(worker.clj:177) [storm-core-0.9.6.jar:0.9.6]

at backtype.storm.timer$mk_timer$fn__1687$fn__1688.invoke(timer.clj:68) [storm-core-0.9.6.jar:0.9.6]

at backtype.storm.timer$mk_timer$fn__1687.invoke(timer.clj:42) [storm-core-0.9.6.jar:0.9.6]

at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]

at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]





Re: worker dies after view minutes

Posted by Harsha <st...@harsha.io>.
Thanks for the update, Eric. Could you describe how you found that a
too-high maxTask was causing this issue? We are trying to improve the
debugging of Storm topologies, and this will be helpful for us. Thanks, Harsha

On Thu, Jul 23, 2015, at 07:36 AM, Eric Ruel wrote:
>


> in the end, the problem was caused by a maxTask value that was too high
>
>
>
> *From:* Harsha <st...@harsha.io> *Sent:* 22 July 2015 10:56 *To:*
> user@storm.apache.org *Subject:* Re: worker dies after view minutes
>
> What does your topology code look like? Are you throwing any errors from
> the bolt's execute method? It does look like there is a RuntimeException
> happening: "Error when processing event
> java.lang.RuntimeException:"
> It is up to the user to catch any exception and log it or handle it,
> instead of throwing it back to the worker JVM
>
> -Harsha
>
>

RE: worker dies after view minutes

Posted by Eric Ruel <er...@wantedanalytics.com>.
in the end, the problem was caused by a maxTask value that was too high


________________________________
From: Harsha <st...@harsha.io>
Sent: 22 July 2015 10:56
To: user@storm.apache.org
Subject: Re: worker dies after view minutes

What does your topology code look like? Are you throwing any errors from the bolt's execute method? It does look like there is a RuntimeException happening:
"
Error when processing event
java.lang.RuntimeException:
"
It is up to the user to catch any exception and log it or handle it, instead of throwing it back to the worker JVM

-Harsha


On Wed, Jul 22, 2015, at 07:43 AM, Eric Ruel wrote:

Hello


the workers in my topology dies after 1,2 minutes


I tried to change the config about the heartbeat, cluster or local mode, but they always die


any idea?






Re: worker dies after view minutes

Posted by Harsha <st...@harsha.io>.
What does your topology code look like? Are you throwing any errors from
the bolt's execute method? It does look like there is a RuntimeException
happening: "Error when processing event
java.lang.RuntimeException:"
It is up to the user to catch any exception and log it or handle it,
instead of throwing it back to the worker JVM

-Harsha
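
The pattern I mean, sketched generically (plain Python, not the Storm bolt
API; `process` and `report_error` are hypothetical stand-ins): catch failures
inside the tuple-processing method and report them, instead of letting the
exception escape and take down the worker JVM.

```python
import logging

log = logging.getLogger("bolt")

def safe_execute(tup, process, report_error):
    """Process one tuple; on failure, log and report instead of re-raising."""
    try:
        return process(tup)
    except Exception as exc:
        # The worker stays alive; the error is recorded for later inspection.
        log.exception("failed to process tuple %r", tup)
        report_error(exc)
        return None
```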



RE: worker dies after view minutes

Posted by Eric Ruel <er...@wantedanalytics.com>.
note:

I use Storm 0.9.4, and I tried trunk for 0.9.6 and still have the problem

________________________________
From: Eric Ruel
Sent: 22 July 2015 10:43
To: user@storm.apache.org
Subject: worker dies after view minutes


Hello


the workers in my topology dies after 1,2 minutes


I tried to change the config about the heartbeat, cluster or local mode, but they always die


any idea?

