Posted to issues@storm.apache.org by "Bijan Fahimi Shemrani (JIRA)" <ji...@apache.org> on 2017/08/25 10:55:00 UTC

[jira] [Created] (STORM-2706) Nimbus stuck in exception and does not fail fast

Bijan Fahimi Shemrani created STORM-2706:
--------------------------------------------

             Summary: Nimbus stuck in exception and does not fail fast
                 Key: STORM-2706
                 URL: https://issues.apache.org/jira/browse/STORM-2706
             Project: Apache Storm
          Issue Type: Bug
    Affects Versions: 1.1.1
            Reporter: Bijan Fahimi Shemrani


We are seeing a problem in nimbus that leaves it stuck in a retry-and-fail loop. When I restart nimbus manually, it works again as expected. However, it would be better if nimbus shut down in this situation (failed fast) so that our monitoring could restart it automatically.

The nimbus log:

{noformat}
24.8.2017 15:39:1913:39:19.804 [pool-13-thread-51] ERROR org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer - Unexpected throwable while invoking!
24.8.2017 15:39:19org.apache.storm.shade.org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /storm/leader-lock
24.8.2017 15:39:19	at org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:111) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:51) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1590) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:230) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:219) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.shade.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:216) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:207) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:40) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.shade.org.apache.curator.framework.recipes.locks.LockInternals.getSortedChildren(LockInternals.java:151) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.shade.org.apache.curator.framework.recipes.locks.LockInternals.getParticipantNodes(LockInternals.java:133) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.shade.org.apache.curator.framework.recipes.leader.LeaderLatch.getLeader(LeaderLatch.java:453) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source) ~[?:?]
24.8.2017 15:39:19	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131]
24.8.2017 15:39:19	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131]
24.8.2017 15:39:19	at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) ~[clojure-1.7.0.jar:?]
24.8.2017 15:39:19	at clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:313) ~[clojure-1.7.0.jar:?]
24.8.2017 15:39:19	at org.apache.storm.zookeeper$zk_leader_elector$reify__1043.getLeader(zookeeper.clj:296) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source) ~[?:?]
24.8.2017 15:39:19	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131]
24.8.2017 15:39:19	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131]
24.8.2017 15:39:19	at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) ~[clojure-1.7.0.jar:?]
24.8.2017 15:39:19	at clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:313) ~[clojure-1.7.0.jar:?]
24.8.2017 15:39:19	at org.apache.storm.daemon.nimbus$mk_reified_nimbus$reify__10780.getLeader(nimbus.clj:2412) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.generated.Nimbus$Processor$getLeader.getResult(Nimbus.java:3944) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.generated.Nimbus$Processor$getLeader.getResult(Nimbus.java:3928) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:162) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
24.8.2017 15:39:19	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
24.8.2017 15:39:19	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
24.8.2017 15:39:2713:39:27.205 [pool-13-thread-52] ERROR org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer - Unexpected throwable while invoking!
24.8.2017 15:39:27org.apache.storm.shade.org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /storm/leader-lock
24.8.2017 15:39:27	at org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:111) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:51) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1590) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:230) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:219) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.shade.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:216) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:207) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:40) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.shade.org.apache.curator.framework.recipes.locks.LockInternals.getSortedChildren(LockInternals.java:151) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.shade.org.apache.curator.framework.recipes.locks.LockInternals.getParticipantNodes(LockInternals.java:133) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.shade.org.apache.curator.framework.recipes.leader.LeaderLatch.getLeader(LeaderLatch.java:453) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source) ~[?:?]
24.8.2017 15:39:27	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131]
24.8.2017 15:39:27	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131]
24.8.2017 15:39:27	at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) ~[clojure-1.7.0.jar:?]
24.8.2017 15:39:27	at clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:313) ~[clojure-1.7.0.jar:?]
24.8.2017 15:39:27	at org.apache.storm.zookeeper$zk_leader_elector$reify__1043.getLeader(zookeeper.clj:296) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source) ~[?:?]
24.8.2017 15:39:27	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131]
24.8.2017 15:39:27	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131]
24.8.2017 15:39:27	at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) ~[clojure-1.7.0.jar:?]
24.8.2017 15:39:27	at clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:313) ~[clojure-1.7.0.jar:?]
24.8.2017 15:39:27	at org.apache.storm.daemon.nimbus$get_cluster_info.invoke(nimbus.clj:1544) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.daemon.nimbus$mk_reified_nimbus$reify__10780.getClusterInfo(nimbus.clj:2006) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.generated.Nimbus$Processor$getClusterInfo.getResult(Nimbus.java:3920) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.generated.Nimbus$Processor$getClusterInfo.getResult(Nimbus.java:3904) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:162) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
24.8.2017 15:39:27	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
24.8.2017 15:39:27	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
24.8.2017 15:39:2913:39:29.270 [timer] INFO  org.apache.storm.daemon.nimbus - not a leader, skipping assignments
24.8.2017 15:39:2913:39:29.270 [timer] INFO  org.apache.storm.daemon.nimbus - not a leader, skipping cleanup
24.8.2017 15:39:3913:39:39.270 [timer] INFO  org.apache.storm.daemon.nimbus - not a leader, skipping assignments
24.8.2017 15:39:3913:39:39.270 [timer] INFO  org.apache.storm.daemon.nimbus - not a leader, skipping cleanup
24.8.2017 15:39:4913:39:49.271 [timer] INFO  org.apache.storm.daemon.nimbus - not a leader, skipping assignments
24.8.2017 15:39:4913:39:49.272 [timer] INFO  org.apache.storm.daemon.nimbus - not a leader, skipping cleanup
24.8.2017 15:39:5913:39:59.272 [timer] INFO  org.apache.storm.daemon.nimbus - not a leader, skipping assignments
24.8.2017 15:39:5913:39:59.272 [timer] INFO  org.apache.storm.daemon.nimbus - not a leader, skipping cleanup
24.8.2017 15:40:0913:40:09.272 [timer] INFO  org.apache.storm.daemon.nimbus - not a leader, skipping assignments
24.8.2017 15:40:0913:40:09.272 [timer] INFO  org.apache.storm.daemon.nimbus - not a leader, skipping cleanup
24.8.2017 15:40:1313:40:13.806 [timer] INFO  org.apache.storm.shade.org.apache.curator.framework.imps.CuratorFrameworkImpl - Starting
24.8.2017 15:40:1313:40:13.807 [timer] INFO  org.apache.storm.shade.org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=zookeeper:2181/storm sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState@f90354
24.8.2017 15:40:1313:40:13.808 [timer-SendThread(10.42.174.214:2181)] INFO  org.apache.storm.shade.org.apache.zookeeper.ClientCnxn - Opening socket connection to server 10.42.174.214/10.42.174.214:2181. Will not attempt to authenticate using SASL (unknown error)
24.8.2017 15:40:1313:40:13.862 [timer-SendThread(10.42.174.214:2181)] INFO  org.apache.storm.shade.org.apache.zookeeper.ClientCnxn - Socket connection established to 10.42.174.214/10.42.174.214:2181, initiating session
24.8.2017 15:40:1313:40:13.865 [timer-SendThread(10.42.174.214:2181)] INFO  org.apache.storm.shade.org.apache.zookeeper.ClientCnxn - Session establishment complete on server 10.42.174.214/10.42.174.214:2181, sessionid = 0x15e14456dc70045, negotiated timeout = 20000
24.8.2017 15:40:1313:40:13.910 [timer] INFO  org.apache.storm.shade.org.apache.zookeeper.ZooKeeper - Session: 0x15e14456dc70045 closed
24.8.2017 15:40:1313:40:13.910 [timer-EventThread] INFO  org.apache.storm.shade.org.apache.zookeeper.ClientCnxn - EventThread shut down
{noformat}
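
To make the requested behaviour concrete, here is a minimal sketch of a fail-fast guard. It is not Storm code: the class name FailFastGuard, its threshold parameter, and the give-up callback are all hypothetical and chosen only for illustration. The idea is to wrap a call that can fail repeatedly (in the log above, the leader lookup that throws NoNodeException for /storm/leader-lock), count consecutive failures, and trigger a shutdown action once a limit is reached instead of retrying forever.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Storm: counts consecutive failures of a
// wrapped call and runs a give-up action once a threshold is reached.
final class FailFastGuard {
    private final int maxConsecutiveFailures;
    private final Runnable onGiveUp;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    FailFastGuard(int maxConsecutiveFailures, Runnable onGiveUp) {
        if (maxConsecutiveFailures < 1) {
            throw new IllegalArgumentException("maxConsecutiveFailures must be >= 1");
        }
        this.maxConsecutiveFailures = maxConsecutiveFailures;
        this.onGiveUp = onGiveUp;
    }

    // Runs the action. A success resets the counter; a failure increments it,
    // triggers onGiveUp once the threshold is reached, and is rethrown.
    <T> T call(Callable<T> action) throws Exception {
        try {
            T result = action.call();
            consecutiveFailures.set(0);
            return result;
        } catch (Exception e) {
            if (consecutiveFailures.incrementAndGet() >= maxConsecutiveFailures) {
                onGiveUp.run();
            }
            throw e;
        }
    }

    // Current number of failures since the last success.
    int consecutiveFailures() {
        return consecutiveFailures.get();
    }
}
```

In a daemon, the give-up action could be something like `() -> Runtime.getRuntime().halt(1)`, which leaves the restart to external monitoring. Whether an immediate halt or an orderly shutdown is appropriate, and what threshold makes sense, would depend on the deployment.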



