Posted to dev@storm.apache.org by "James Xu (JIRA)" <ji...@apache.org> on 2013/12/15 06:14:06 UTC

[jira] [Created] (STORM-131) Intermittent Zookeeper errors when shutting down local Topology

James Xu created STORM-131:
------------------------------

             Summary: Intermittent Zookeeper errors when shutting down local Topology
                 Key: STORM-131
                 URL: https://issues.apache.org/jira/browse/STORM-131
             Project: Apache Storm (Incubating)
          Issue Type: Bug
            Reporter: James Xu
            Priority: Minor


https://github.com/nathanmarz/storm/issues/259

We have a large number of Storm integration tests in our project (Storm version 0.7.3) that use a local topology. Only one topology is operational at any moment in time. The tests are organized into groups, and each group works within the boundaries of a single topology. When a group of tests finishes executing, it shuts down its local cluster, and the next group then launches its own cluster.
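
For context, the pattern each test group follows is roughly the minimal sketch below, assuming the tests drive backtype.storm.LocalCluster directly; run-test-group, conf and topology are illustrative names, not our actual code:

(import 'backtype.storm.LocalCluster)

;; Illustrative only: each group builds its own in-process cluster, submits a
;; single topology, runs its assertions, and shuts the cluster down before the
;; next group starts.
(defn run-test-group [topology-name conf topology]
  (let [cluster (LocalCluster.)]
    (try
      (.submitTopology cluster topology-name conf topology)
      ;; ... run the group's assertions against the running topology here ...
      (finally
        ;; this shutdown is where the intermittent Zookeeper errors appear
        (.shutdown cluster)))))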

We see, with remarkable regularity, failures related to what looks like an incorrect Zookeeper shutdown, which leads to a JVM exit (a disaster, since no test information is recorded at the end). Here is what we see in the main error log (log level: WARN and higher):

2012-07-07 00:22:58,420 WARN [ConnectionStateManager-0|]@jenkins com.netflix.curator.framework.state.ConnectionStateManager
=> There are no ConnectionStateListeners registered.

2012-07-07 00:22:58,534 WARN [Thread-23-EventThread|]@jenkins backtype.storm.cluster
=> Received event :disconnected::none: with disconnected Zookeeper.

2012-07-07 00:23:00,013 WARN [Thread-23-SendThread(localhost:2000)|]@jenkins org.apache.zookeeper.ClientCnxn
=> Session 0x1385ece8f1b0017 for server null, unexpected error, closing socket connection and attempting reconnect

java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
2012-07-07 00:23:01,527 WARN [Thread-23-SendThread(localhost:2000)|]@jenkins org.apache.zookeeper.ClientCnxn
=> Session 0x1385ece8f1b0017 for server null, unexpected error, closing socket connection and attempting reconnect

java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
2012-07-07 00:23:03,510 WARN [Thread-23-SendThread(localhost:2000)|]@jenkins org.apache.zookeeper.ClientCnxn
=> Session 0x1385ece8f1b0017 for server null, unexpected error, closing socket connection and attempting reconnect

java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
2012-07-07 00:23:04,687 WARN [Thread-23-SendThread(localhost:2000)|]@jenkins org.apache.zookeeper.ClientCnxn
=> Session 0x1385ece8f1b0017 for server null, unexpected error, closing socket connection and attempting reconnect

java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
2012-07-07 00:23:05,961 WARN [Thread-23-SendThread(localhost:2000)|]@jenkins org.apache.zookeeper.ClientCnxn
=> Session 0x1385ece8f1b0017 for server null, unexpected error, closing socket connection and attempting reconnect

java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
2012-07-07 00:23:07,588 WARN [Thread-23-SendThread(localhost:2000)|]@jenkins org.apache.zookeeper.ClientCnxn
=> Session 0x1385ece8f1b0017 for server null, unexpected error, closing socket connection and attempting reconnect

java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1119)
2012-07-07 00:23:07,691 ERROR [Thread-23-EventThread|]@jenkins com.netflix.curator.framework.imps.CuratorFrameworkImpl
=> Background operation retry gave up

org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
at com.netflix.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:380)
at com.netflix.curator.framework.imps.BackgroundSyncImpl$1.processResult(BackgroundSyncImpl.java:49)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:617)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
2012-07-07 00:23:07,697 WARN [ConnectionStateManager-0|]@jenkins com.netflix.curator.framework.state.ConnectionStateManager
=> There are no ConnectionStateListeners registered.

2012-07-07 00:23:07,699 ERROR [Thread-23-EventThread|]@jenkins backtype.storm.zookeeper
=> Unrecoverable Zookeeper error Background operation retry gave up

org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
at com.netflix.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:380)
at com.netflix.curator.framework.imps.BackgroundSyncImpl$1.processResult(BackgroundSyncImpl.java:49)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:617)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
And here is what we see in the Storm-dedicated log file (log level: DEBUG):

2012-07-07 00:22:58,306 INFO [main|]@jenkins backtype.storm.daemon.task
=> Shut down task TLTopology-1-1341620393:31

2012-07-07 00:22:58,306 INFO [main|]@jenkins backtype.storm.messaging.loader
=> Shutting down receiving-thread: [TLTopology-1-1341620393, 5]

2012-07-07 00:22:58,307 INFO [main|]@jenkins backtype.storm.messaging.loader
=> Waiting for receiving-thread:[TLTopology-1-1341620393, 5] to die

2012-07-07 00:22:58,307 INFO [Thread-319|]@jenkins backtype.storm.messaging.loader
=> Receiving-thread:[TLTopology-1-1341620393, 5] received shutdown notice

2012-07-07 00:22:58,307 INFO [main|]@jenkins backtype.storm.messaging.loader
=> Shutdown receiving-thread: [TLTopology-1-1341620393, 5]

2012-07-07 00:22:58,307 INFO [main|]@jenkins backtype.storm.daemon.worker
=> Terminating zmq context

2012-07-07 00:22:58,307 INFO [main|]@jenkins backtype.storm.daemon.worker
=> Waiting for threads to die

2012-07-07 00:22:58,307 INFO [Thread-318|]@jenkins backtype.storm.util
=> Async loop interrupted!

2012-07-07 00:22:58,309 INFO [main|]@jenkins backtype.storm.daemon.worker
=> Disconnecting from storm cluster state context

2012-07-07 00:22:58,311 INFO [main|]@jenkins backtype.storm.daemon.worker
=> Shut down worker TLTopology-1-1341620393 96e12303-4c22-4821-9f3b-3bce2230bf08 5

2012-07-07 00:22:58,311 DEBUG [main|]@jenkins backtype.storm.util
=> Rmr path /tmp/f308eb0e-2e72-4221-9620-43e15a9c1bdc/workers/16966f32-d0d4-4ee1-a0fe-1d85fc4a478e/heartbeats

2012-07-07 00:22:58,313 DEBUG [main|]@jenkins backtype.storm.util
=> Removing path /tmp/f308eb0e-2e72-4221-9620-43e15a9c1bdc/workers/16966f32-d0d4-4ee1-a0fe-1d85fc4a478e/pids

2012-07-07 00:22:58,313 DEBUG [main|]@jenkins backtype.storm.util
=> Removing path /tmp/f308eb0e-2e72-4221-9620-43e15a9c1bdc/workers/16966f32-d0d4-4ee1-a0fe-1d85fc4a478e

2012-07-07 00:22:58,313 INFO [main|]@jenkins backtype.storm.daemon.supervisor
=> Shut down 96e12303-4c22-4821-9f3b-3bce2230bf08:16966f32-d0d4-4ee1-a0fe-1d85fc4a478e

2012-07-07 00:22:58,314 INFO [main|]@jenkins backtype.storm.daemon.supervisor
=> Shutting down supervisor 96e12303-4c22-4821-9f3b-3bce2230bf08

2012-07-07 00:22:58,314 INFO [Thread-25|]@jenkins backtype.storm.event
=> Event manager interrupted

2012-07-07 00:22:58,315 INFO [Thread-26|]@jenkins backtype.storm.event
=> Event manager interrupted

2012-07-07 00:22:58,318 INFO [main|]@jenkins backtype.storm.testing
=> Shutting down in process zookeeper

2012-07-07 00:22:58,321 INFO [main|]@jenkins backtype.storm.testing
=> Done shutting down in process zookeeper

2012-07-07 00:22:58,321 INFO [main|]@jenkins backtype.storm.testing
=> Deleting temporary path /tmp/0202cf11-6ad7-4dda-94d6-622a63c9f6b6

2012-07-07 00:22:58,321 DEBUG [main|]@jenkins backtype.storm.util
=> Rmr path /tmp/0202cf11-6ad7-4dda-94d6-622a63c9f6b6

2012-07-07 00:22:58,322 INFO [main|]@jenkins backtype.storm.testing
=> Deleting temporary path /tmp/ee47e3e3-752f-40a8-b6a9-a197a9dda3de

2012-07-07 00:22:58,323 DEBUG [main|]@jenkins backtype.storm.util
=> Rmr path /tmp/ee47e3e3-752f-40a8-b6a9-a197a9dda3de

2012-07-07 00:22:58,323 INFO [main|]@jenkins backtype.storm.testing
=> Deleting temporary path /tmp/ece72b84-357e-4183-aeb5-e0d2dc5d6eca

2012-07-07 00:22:58,323 DEBUG [main|]@jenkins backtype.storm.util
=> Rmr path /tmp/ece72b84-357e-4183-aeb5-e0d2dc5d6eca

2012-07-07 00:22:58,326 INFO [main|]@jenkins backtype.storm.testing
=> Deleting temporary path /tmp/f308eb0e-2e72-4221-9620-43e15a9c1bdc

2012-07-07 00:22:58,326 DEBUG [main|]@jenkins backtype.storm.util
=> Rmr path /tmp/f308eb0e-2e72-4221-9620-43e15a9c1bdc

2012-07-07 00:22:58,534 WARN [Thread-23-EventThread|]@jenkins backtype.storm.cluster
=> Received event :disconnected::none: with disconnected Zookeeper.

2012-07-07 00:23:07,699 ERROR [Thread-23-EventThread|]@jenkins backtype.storm.zookeeper
=> Unrecoverable Zookeeper error Background operation retry gave up

org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
at com.netflix.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:380)
at com.netflix.curator.framework.imps.BackgroundSyncImpl$1.processResult(BackgroundSyncImpl.java:49)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:617)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
2012-07-07 00:23:07,702 INFO [Thread-23-EventThread|]@jenkins backtype.storm.util

=> Halting process: ("Unrecoverable Zookeeper error")
It seems like a threading issue to me personally. I wonder if there is some form of workaround. I also understand that, since this is a "local" topology issue, it might not receive due attention... However, a local topology is fundamentally what new users start with when they begin to play with Storm, and I think it is important to make that experience positive.

Nathan, thank you very much for everything that you're doing.

-Kyrill

----------
dkincaid: Looking through the shutdown code for local clusters, I noticed a comment in the code about a possible race condition. I'm wondering if we could be running into this on our Jenkins server (which we know runs pretty slowly). Is a worker getting restarted before the supervisor can be shut down?

Here is the function with the comment:

(defn kill-local-storm-cluster [cluster-map]
  (.shutdown (:nimbus cluster-map))
  (.close (:state cluster-map))
  (.disconnect (:storm-cluster-state cluster-map))
  (doseq [s @(:supervisors cluster-map)]
    (.shutdown-all-workers s)
    ;; race condition here? will it launch the workers again?
    (supervisor/kill-supervisor s))
  (psim/kill-all-processes)
  (log-message "Shutting down in process zookeeper")
  (zk/shutdown-inprocess-zookeeper (:zookeeper cluster-map))
  (log-message "Done shutting down in process zookeeper")
  (doseq [t @(:tmp-dirs cluster-map)]
    (log-message "Deleting temporary path " t)
    (rmr t)
    ))
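
If that is what is happening, one way it could in principle be guarded against is a shutdown flag that the supervisor's sync loop checks before launching workers. A rough sketch, illustrative only: shutting-down?, launch-worker, sync-processes-guarded and begin-cluster-shutdown! are made-up names, not Storm's actual internals, and this is not a proposed patch:

;; Illustrative sketch only -- not Storm's real supervisor code.
(def shutting-down? (atom false))

(defn launch-worker [port assignment]
  ;; stub standing in for the real worker launch
  (println "launching worker on port" port "for assignment" assignment))

(defn sync-processes-guarded [assigned-tasks]
  ;; skip (re)launching workers once cluster teardown has begun
  (when-not @shutting-down?
    (doseq [[port assignment] assigned-tasks]
      (launch-worker port assignment))))

(defn begin-cluster-shutdown! []
  ;; flip the flag before shutting down workers, closing the window in which
  ;; the sync loop could still schedule replacement workers
  (reset! shutting-down? true))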


--------
kyrill007: Fantastic catch, Dave!!! This is exactly what is happening: the supervisor begins launching new workers while the other ones are still being shut down. Here is the proof from the logs:

The shutdown process is initiated at 04:37:05,136.

2012-07-11 04:37:05,136 INFO [main|]@jenkins backtype.storm.daemon.nimbus
  => Shutting down master

2012-07-11 04:37:05,145 INFO [main|]@jenkins backtype.storm.daemon.nimbus
  => Shut down master

2012-07-11 04:37:05,151 INFO [main|]@jenkins backtype.storm.daemon.supervisor
  => Shutting down 5c48d4fc-769f-41ef-abd6-f92df60fa543:12eba15d-fb17-4a3c-8e25-1c0266eed04d

2012-07-11 04:37:05,152 INFO [main|]@jenkins backtype.storm.process-simulator
  => Killing process ea132b37-dc6a-447c-b1de-ac6727c82cef

2012-07-11 04:37:05,152 INFO [main|]@jenkins backtype.storm.daemon.worker
  => Shutting down worker TLTopology-1-1341981237 5c48d4fc-769f-41ef-abd6-f92df60fa543 1

2012-07-11 04:37:05,152 INFO [main|]@jenkins backtype.storm.daemon.task
  => Shutting down task TLTopology-1-1341981237:64

2012-07-11 04:37:05,153 INFO [Thread-129|]@jenkins backtype.storm.util
  => Async loop interrupted!

2012-07-11 04:37:05,180 INFO [main|]@jenkins backtype.storm.daemon.task
  => Shut down task TLTopology-1-1341981237:64

2012-07-11 04:37:05,180 INFO [main|]@jenkins backtype.storm.daemon.task
  => Shutting down task TLTopology-1-1341981237:34
This continues for a while (we have a lot of workers). Then, at 04:37:05,665, we start seeing this:

2012-07-11 04:37:05,665 DEBUG [Thread-19|]@jenkins backtype.storm.daemon.supervisor
  => Assigned tasks: {2 #backtype.storm.daemon.supervisor.LocalAssignment{:storm-id "TLTopology-1-1341981237", :task-ids (96 66 36 6 102 72 42 12 108 78 48 18 114 84 54 24 120 90 60 30 126)}, 1 #backtype.storm.daemon.supervisor.LocalAssignment{:storm-id "TLTopology-1-1341981237", :task-ids (64 34 4 100 70 40 10 106 76 46 16 112 82 52 22 118 88 58 28 124 94)}, 3 #backtype.storm.daemon.supervisor.LocalAssignment{:storm-id "TLTopology-1-1341981237", :task-ids (32 2 98 68 38 8 104 74 44 14 110 80 50 20 116 86 56 26 122 92 62)}}

2012-07-11 04:37:05,665 DEBUG [Thread-19|]@jenkins backtype.storm.daemon.supervisor
  => Allocated: {"a724dc19-84ec-46dc-9768-afb73df94237" [:valid #backtype.storm.daemon.common.WorkerHeartbeat{:time-secs 1341981425, :storm-id "TLTopology-1-1341981237", :task-ids #{96 66 36 6 102 72 42 12 108 78 48 18 114 84 54 24 120 90 60 30 126}, :port 2}]}

2012-07-11 04:37:05,665 DEBUG [Thread-19|]@jenkins backtype.storm.util
  => Making dirs at /tmp/4884ffb5-c6c7-43a9-ac72-e0a5426eea3c/workers/a7f81ea0-a5f6-47de-9a89-47998b1e1639/pids

2012-07-11 04:37:05,666 DEBUG [Thread-19|]@jenkins backtype.storm.util
  => Making dirs at /tmp/4884ffb5-c6c7-43a9-ac72-e0a5426eea3c/workers/1b5c4c87-4e05-4cab-a580-ae1dabb3fd2e/pids

2012-07-11 04:37:05,666 INFO [main|]@jenkins backtype.storm.daemon.worker
  => Shut down worker TLTopology-1-1341981237 5c48d4fc-769f-41ef-abd6-f92df60fa543 2

2012-07-11 04:37:05,667 DEBUG [main|]@jenkins backtype.storm.util
  => Rmr path /tmp/4884ffb5-c6c7-43a9-ac72-e0a5426eea3c/workers/a724dc19-84ec-46dc-9768-afb73df94237/heartbeats

2012-07-11 04:37:05,669 DEBUG [main|]@jenkins backtype.storm.util
  => Removing path /tmp/4884ffb5-c6c7-43a9-ac72-e0a5426eea3c/workers/a724dc19-84ec-46dc-9768-afb73df94237/pids

2012-07-11 04:37:05,669 DEBUG [main|]@jenkins backtype.storm.util
  => Removing path /tmp/4884ffb5-c6c7-43a9-ac72-e0a5426eea3c/workers/a724dc19-84ec-46dc-9768-afb73df94237

2012-07-11 04:37:05,669 INFO [main|]@jenkins backtype.storm.daemon.supervisor
  => Shut down 5c48d4fc-769f-41ef-abd6-f92df60fa543:a724dc19-84ec-46dc-9768-afb73df94237

2012-07-11 04:37:05,669 INFO [main|]@jenkins backtype.storm.daemon.supervisor
  => Shutting down supervisor 5c48d4fc-769f-41ef-abd6-f92df60fa543

2012-07-11 04:37:05,670 INFO [Thread-18|]@jenkins backtype.storm.event
  => Event manager interrupted

2012-07-11 04:37:05,670 INFO [Thread-19|]@jenkins backtype.storm.daemon.supervisor
  => Launching worker with assignment #backtype.storm.daemon.supervisor.LocalAssignment{:storm-id "TLTopology-1-1341981237", :task-ids (64 34 4 100 70 40 10 106 76 46 16 112 82 52 22 118 88 58 28 124 94)} for this supervisor 5c48d4fc-769f-41ef-abd6-f92df60fa543 on port 1 with id a7f81ea0-a5f6-47de-9a89-47998b1e1639

2012-07-11 04:37:05,672 INFO [Thread-19|]@jenkins backtype.storm.daemon.worker
  => Launching worker for TLTopology-1-1341981237 on 5c48d4fc-769f-41ef-abd6-f92df60fa543:1 with id a7f81ea0-a5f6-47de-9a89-47998b1e1639 and conf {"dev.zookeeper.path" "/tmp/dev-storm-zookeeper", "topology.fall.back.on.java.serialization" true, "zmq.linger.millis" 0, "topology.skip.missing.kryo.registrations" true, "ui.childopts" "-Xmx768m", "storm.zookeeper.session.timeout" 20000, "nimbus.reassign" true, "nimbus.monitor.freq.secs" 10, "java.library.path" "/usr/local/lib:/opt/local/lib:/usr/lib", "storm.local.dir" "/tmp/4884ffb5-c6c7-43a9-ac72-e0a5426eea3c", "supervisor.worker.start.timeout.secs" 120, "nimbus.cleanup.inbox.freq.secs" 600, "nimbus.inbox.jar.expiration.secs" 3600, "nimbus.host" "localhost", "storm.zookeeper.port" 2000, "transactional.zookeeper.port" nil, "transactional.zookeeper.servers" nil, "storm.zookeeper.root" "/storm", "supervisor.enable" true, "storm.zookeeper.servers" ["localhost"], "transactional.zookeeper.root" "/transactional", "topology.worker.childopts" nil, "worker.childopts" "-Xmx768m", "supervisor.heartbeat.frequency.secs" 5, "drpc.port" 3772, "supervisor.monitor.frequency.secs" 3, "task.heartbeat.frequency.secs" 3, "topology.max.spout.pending" nil, "storm.zookeeper.retry.interval" 1000, "supervisor.slots.ports" (1 2 3), "topology.debug" false, "nimbus.task.launch.secs" 120, "nimbus.supervisor.timeout.secs" 60, "topology.message.timeout.secs" 30, "task.refresh.poll.secs" 10, "topology.workers" 1, "supervisor.childopts" "-Xmx1024m", "nimbus.thrift.port" 6627, "topology.stats.sample.rate" 0.05, "worker.heartbeat.frequency.secs" 1, "nimbus.task.timeout.secs" 30, "drpc.invocations.port" 3773, "zmq.threads" 1, "storm.zookeeper.retry.times" 5, "topology.state.synchronization.timeout.secs" 60, "supervisor.worker.timeout.secs" 30, "nimbus.file.copy.expiration.secs" 600, "drpc.request.timeout.secs" 600, "storm.local.mode.zmq" false, "ui.port" 8080, "nimbus.childopts" "-Xmx1024m", "topology.ackers" 1, "storm.cluster.mode" "local", "topology.optimize" true, "topology.max.task.parallelism" nil}

2012-07-11 04:37:05,675 INFO [Thread-19|]@jenkins backtype.storm.event
  => Event manager interrupted

2012-07-11 04:37:05,677 INFO [Thread-19-EventThread|]@jenkins backtype.storm.zookeeper
  => Zookeeper state update: :connected:none
which in the end results in this:

2012-07-11 04:37:06,175 INFO [Thread-19-EventThread|]@jenkins backtype.storm.zookeeper
  => Zookeeper state update: :disconnected:none

2012-07-11 04:37:06,175 WARN [Thread-22-EventThread|]@jenkins backtype.storm.cluster
  => Received event :disconnected::none: with disconnected Zookeeper.

2012-07-11 04:37:15,923 ERROR [Thread-22-EventThread|]@jenkins backtype.storm.zookeeper
  => Unrecoverable Zookeeper error Background operation retry gave up

org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
    at com.netflix.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:380)
    at com.netflix.curator.framework.imps.BackgroundSyncImpl$1.processResult(BackgroundSyncImpl.java:49)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:613)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)
2012-07-11 04:37:15,926 INFO [Thread-22-EventThread|]@jenkins backtype.storm.util
  => Halting process: ("Unrecoverable Zookeeper error")
Dear Nathan,

If this race condition could somehow be fixed (presumably it is not that hard since we know what the problem is), it would be so much appreciated!!!



