Posted to issues@metron.apache.org by "Casey Stella (JIRA)" <ji...@apache.org> on 2016/11/02 18:13:58 UTC

[jira] [Commented] (METRON-261) Storm Supervisors Fail to Start

    [ https://issues.apache.org/jira/browse/METRON-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629876#comment-15629876 ] 

Casey Stella commented on METRON-261:
-------------------------------------

Is this still happening, Nick?  I haven't experienced it yet.

> Storm Supervisors Fail to Start
> -------------------------------
>
>                 Key: METRON-261
>                 URL: https://issues.apache.org/jira/browse/METRON-261
>             Project: Metron
>          Issue Type: Bug
>            Reporter: Nick Allen
>            Priority: Minor
>              Labels: platform
>             Fix For: 0.2.1BETA
>
>
> After deployment completes, the Storm Supervisors often fail to start correctly.  This prevents any data from being ingested until the Supervisors are manually started.
> It appears that the Supervisors fail to communicate with ZooKeeper, then time out and die.  ZooKeeper may just not be ready in time.  Not sure if this is something we can fix or if this is an Ambari issue.
> 2016-06-25 12:48:16.448 o.a.s.z.ClientCnxn [WARN] Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_40]
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_40]
>         at org.apache.storm.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
> 2016-06-25 12:48:17.154 o.a.s.c.ConnectionState [ERROR] Connection timed out for connection string (ec2-52-41-178-50.us-west-2.compute.amazonaws.com:2181) and timeout (15000) / elapsed (15053)
> org.apache.storm.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
>         at org.apache.storm.curator.ConnectionState.checkTimeouts(ConnectionState.java:195) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.ConnectionState.getZooKeeper(ConnectionState.java:87) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:487) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.ExistsBuilderImpl$3.call(ExistsBuilderImpl.java:226) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.ExistsBuilderImpl$3.call(ExistsBuilderImpl.java:215) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.RetryLoop.callWithRetry(RetryLoop.java:107) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.ExistsBuilderImpl.pathInForegroundStandard(ExistsBuilderImpl.java:212) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.ExistsBuilderImpl.pathInForeground(ExistsBuilderImpl.java:205) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:168) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at org.apache.storm.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:39) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at backtype.storm.zookeeper$exists_node_QMARK_$fn__3211.invoke(zookeeper.clj:107) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at backtype.storm.zookeeper$exists_node_QMARK_.invoke(zookeeper.clj:104) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at backtype.storm.zookeeper$mkdirs.invoke(zookeeper.clj:120) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at backtype.storm.cluster$mk_distributed_cluster_state.doInvoke(cluster.clj:60) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at clojure.lang.RestFn.invoke(RestFn.java:486) [clojure-1.6.0.jar:?]
>         at backtype.storm.cluster$mk_storm_cluster_state.doInvoke(cluster.clj:314) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at clojure.lang.RestFn.invoke(RestFn.java:439) [clojure-1.6.0.jar:?]
>         at backtype.storm.daemon.supervisor$supervisor_data.invoke(supervisor.clj:296) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at backtype.storm.daemon.supervisor$fn__8449$exec_fn__3614__auto____8450.invoke(supervisor.clj:504) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at clojure.lang.AFn.applyToHelper(AFn.java:160) [clojure-1.6.0.jar:?]
>         at clojure.lang.AFn.applyTo(AFn.java:144) [clojure-1.6.0.jar:?]
>         at clojure.core$apply.invoke(core.clj:624) [clojure-1.6.0.jar:?]
>         at backtype.storm.daemon.supervisor$fn__8449$mk_supervisor__8476.doInvoke(supervisor.clj:500) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at clojure.lang.RestFn.invoke(RestFn.java:436) [clojure-1.6.0.jar:?]
>         at backtype.storm.daemon.supervisor$_launch.invoke(supervisor.clj:792) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at backtype.storm.daemon.supervisor$_main.invoke(supervisor.clj:822) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
>         at clojure.lang.AFn.applyToHelper(AFn.java:152) [clojure-1.6.0.jar:?]
>         at clojure.lang.AFn.applyTo(AFn.java:144) [clojure-1.6.0.jar:?]
>         at backtype.storm.daemon.supervisor.main(Unknown Source) [storm-core-0.10.0.2.3.4.7-4.jar:0.10.0.2.3.4.7-4]
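
If the cause really is ZooKeeper not being ready in time, one possible workaround is to wait for its client port to accept connections before starting the Supervisors. Below is a minimal Python sketch of that idea. The function name wait_for_zookeeper and its defaults are hypothetical and not part of Storm, Ambari, or Metron; the port 2181 is taken from the connection string in the log above.

```python
import socket
import time


def wait_for_zookeeper(host, port=2181, timeout=60.0, interval=2.0):
    """Poll until a TCP connection to host:port succeeds or timeout elapses.

    Hypothetical helper for illustration only. Returns True if the port
    accepted a connection, False if the deadline passed first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only shows that the port is open; it does
            # not prove ZooKeeper is fully serving requests.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers "Connection refused" as well as connect timeouts.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A deployment step could call wait_for_zookeeper with the ZooKeeper host name and start the Supervisors only when it returns True. This checks the port only, so it may still pass slightly before ZooKeeper is ready to serve sessions.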



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)