Posted to dev@ambari.apache.org by "Jonathan Hurley (JIRA)" <ji...@apache.org> on 2016/01/27 22:30:39 UTC

[jira] [Created] (AMBARI-14819) RU : Storm Topologies stopped running while rolling upgrade

Jonathan Hurley created AMBARI-14819:
----------------------------------------

             Summary: RU : Storm Topologies stopped running while rolling upgrade
                 Key: AMBARI-14819
                 URL: https://issues.apache.org/jira/browse/AMBARI-14819
             Project: Ambari
          Issue Type: Bug
    Affects Versions: 2.2.0
            Reporter: Jonathan Hurley
            Assignee: Jonathan Hurley
            Priority: Blocker
             Fix For: 2.2.2


When performing a rolling upgrade from HDP 2.3 to 2.4, Storm topologies are stopped.

1) Start the HDFS topology and the Hive topology before the rolling upgrade starts.
{code:title=HDFS topology}
2016-01-22 16:35:40,011|beaver.component.rollingupgrade.ruCommon|INFO|28499|139893976106752|MainThread|Running long running background jobs for storm.
2016-01-22 16:35:40,015|beaver.machine|INFO|28499|139893976106752|MainThread|RUNNING: /usr/hdp/current/storm-client/bin/storm -c java.security.auth.login.config=/etc/storm/conf/client_jaas.conf -c storm.thrift.transport=backtype.storm.security.auth.kerberos.KerberosSaslTransportPlugin jar /grid/0/hadoopqe/artifacts/storm-hdfs-tests/target/storm-integration-test-1.0-SNAPSHOT.jar org.apache.storm.hdfs.bolt.HdfsFileTopology hdfs://nameservice/tmp /tmp/hdfs-conf.yaml HDFSTopology
{code}
{code:title=Hive Topology}
2016-01-22 16:37:24,486|beaver.machine|INFO|28499|139893976106752|MainThread|RUNNING: /usr/hdp/current/storm-client/bin/storm -c java.security.auth.login.config=/etc/storm/conf/client_jaas.conf -c storm.thrift.transport=backtype.storm.security.auth.kerberos.KerberosSaslTransportPlugin jar /grid/0/hadoopqe/artifacts/storm-hive-tests/target/storm-integration-test-1.0-SNAPSHOT.jar org.apache.storm.hive.bolt.HiveTopologyPartitioned thrift://os-d7-gkzzqs-rudalm10todalnextsecha-1.novalocal:9083,thrift://os-d7-gkzzqs-rudalm10todalnextsecha-10.novalocal:9083,thrift://os-d7-gkzzqs-rudalm10todalnextsecha-10.novalocal:9083,thrift://os-d7-gkzzqs-rudalm10todalnextsecha-11.novalocal:9083 stormdb userdata HiveTopology /home/hrt_qa/hadoopqa/keytabs/hrt_qa.headless.keytab hrt_qa@EXAMPLE.COM
{code}
2) Make sure both topologies keep running throughout the rolling upgrade.
3) Validate that they ran correctly.
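
For the validation in step 3, a minimal sketch of an automated check is shown here. It is not part of the original test harness. It assumes the `storm list` command prints one row per topology in the order name, status, task count, worker count, uptime; the helper names `topology_status`, `not_running` and `check_cluster` are hypothetical. On the Kerberized cluster used above, the same `-c` options shown in the log lines would probably have to be passed as well.

```python
import subprocess

# Client path copied from the log lines above.
STORM_BIN = "/usr/hdp/current/storm-client/bin/storm"


def topology_status(listing):
    """Map topology name -> status from the text printed by `storm list`.

    Assumed row layout (an assumption, not taken from the report):
    name, status, num_tasks, num_workers, uptime_secs.
    Header, separator and log lines are skipped because their third and
    fourth fields are not plain integers.
    """
    status = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[2].isdigit() and fields[3].isdigit():
            status[fields[0]] = fields[1]
    return status


def not_running(listing, expected):
    """Return the expected topologies that are missing or not ACTIVE."""
    status = topology_status(listing)
    return [name for name in expected if status.get(name) != "ACTIVE"]


def check_cluster(expected):
    """Run `storm list` on this host and report topologies that are not ACTIVE."""
    result = subprocess.run(
        [STORM_BIN, "list"], capture_output=True, text=True, check=True
    )
    return not_running(result.stdout, expected)
```

Calling `check_cluster(["HDFSTopology", "HiveTopology"])` at intervals during the upgrade would return an empty list while both topologies are still listed as ACTIVE.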

While upgrading from 2.3.2.0-2950 to 2.4.0.0-128, all Storm topologies stopped.
The worker log for the HDFS topology shows the stack trace below.
http://qelog.hortonworks.com/log/os-d7-gkzzqs-rudalm10todalnextsecha/service-logs/storm/172.22.103.85/HDFSTopology-1-1453480559-worker-6701.log
{code}
2016-01-22 19:41:04.084 b.s.d.executor [ERROR] 
java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /errors/HDFSTopology-1-1453480559/my-bolt-last-error
	at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.daemon.executor$fn__6099$fn__6112$fn__6163.invoke(executor.clj:808) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.util$async_loop$fn__543.invoke(util.clj:475) [storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:?]
	at java.lang.Thread.run(Thread.java:745) [?:1.7.0_67]
Caused by: java.lang.RuntimeException: org.apache.storm.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /errors/HDFSTopology-1-1453480559/my-bolt-last-error
	at backtype.storm.util$wrap_in_runtime.invoke(util.clj:48) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.zookeeper$create_node.invoke(zookeeper.clj:97) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.cluster$mk_distributed_cluster_state$reify__4937.set_data(cluster.clj:110) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.cluster$mk_storm_cluster_state$reify__5557.report_error(cluster.clj:537) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.daemon.executor$throttled_report_error_fn$fn__5878.invoke(executor.clj:193) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.daemon.executor$fn__6099$fn$reify__6147.reportError(executor.clj:798) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.task.OutputCollector.reportError(OutputCollector.java:223) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at org.apache.storm.hdfs.bolt.HdfsBolt.execute(HdfsBolt.java:115) ~[stormjar.jar:?]
	at backtype.storm.daemon.executor$fn__6099$tuple_action_fn__6101.invoke(executor.clj:670) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.daemon.executor$mk_task_receiver$fn__6022.invoke(executor.clj:426) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.disruptor$clojure_handler$reify__912.onEvent(disruptor.clj:58) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	... 6 more
Caused by: org.apache.storm.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /errors/HDFSTopology-1-1453480559/my-bolt-last-error
	at org.apache.storm.zookeeper.KeeperException.create(KeeperException.java:119) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at org.apache.storm.zookeeper.KeeperException.create(KeeperException.java:51) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at org.apache.storm.zookeeper.ZooKeeper.create(ZooKeeper.java:783) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at org.apache.storm.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:676) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at org.apache.storm.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:660) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at org.apache.storm.curator.RetryLoop.callWithRetry(RetryLoop.java:107) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at org.apache.storm.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:656) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at org.apache.storm.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:441) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at org.apache.storm.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:431) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at org.apache.storm.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:239) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at org.apache.storm.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:193) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_67]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_67]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_67]
	at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_67]
	at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) ~[clojure-1.6.0.jar:?]
	at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28) ~[clojure-1.6.0.jar:?]
	at backtype.storm.zookeeper$create_node.invoke(zookeeper.clj:96) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.cluster$mk_distributed_cluster_state$reify__4937.set_data(cluster.clj:110) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.cluster$mk_storm_cluster_state$reify__5557.report_error(cluster.clj:537) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.daemon.executor$throttled_report_error_fn$fn__5878.invoke(executor.clj:193) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.daemon.executor$fn__6099$fn$reify__6147.reportError(executor.clj:798) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.task.OutputCollector.reportError(OutputCollector.java:223) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at org.apache.storm.hdfs.bolt.HdfsBolt.execute(HdfsBolt.java:115) ~[stormjar.jar:?]
	at backtype.storm.daemon.executor$fn__6099$tuple_action_fn__6101.invoke(executor.clj:670) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.daemon.executor$mk_task_receiver$fn__6022.invoke(executor.clj:426) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.disruptor$clojure_handler$reify__912.onEvent(disruptor.clj:58) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) ~[storm-core-0.10.0.2.3.2.0-2950.jar:0.10.0.2.3.2.0-2950]
	... 6 more
{code}
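
When triaging worker logs like the one above, the innermost `Caused by:` line is usually the most informative one. The following is a small sketch of pulling it out of a trace; the function name `root_cause` is hypothetical, and the pattern only assumes the standard Java stack trace layout seen in this log.

```python
import re

# Matches lines such as "Caused by: some.pkg.SomeException: message".
CAUSED_BY = re.compile(r"^Caused by: (?P<cls>[\w.$]+)(?:: (?P<msg>.*))?$")


def root_cause(trace):
    """Return (exception class, message) from the last 'Caused by:' line.

    Returns None when the trace contains no 'Caused by:' line.
    """
    last = None
    for line in trace.splitlines():
        match = CAUSED_BY.match(line.strip())
        if match:
            last = (match.group("cls"), match.group("msg") or "")
    return last
```

Applied to the trace above, this should report `org.apache.storm.zookeeper.KeeperException$NodeExistsException` together with the `/errors/HDFSTopology-1-1453480559/my-bolt-last-error` path.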



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)