Posted to issues@storm.apache.org by "Ethan Li (JIRA)" <ji...@apache.org> on 2017/08/02 16:32:01 UTC

[jira] [Updated] (STORM-2674) NoNodeException when ZooKeeper tries to delete nodes

     [ https://issues.apache.org/jira/browse/STORM-2674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Li updated STORM-2674:
----------------------------
    Description: 
When the [StormClusterStateImpl reportError function|https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/cluster/StormClusterStateImpl.java#L650-L660] is called, it lists all the children of
{code:java}
/storm/errors/<topo-id>/count/
{code}
and deletes znodes so that only the latest 10 errors are kept. A NoNodeException can occur when another executor has already deleted one of those znodes.

{code:java}
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /errors/fastwc-halferrors-1-1501689263/count/e0000000562
    at org.apache.storm.utils.Utils$2.run(Utils.java:345)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /errors/fastwc-halferrors-1-1501689263/count/e0000000562
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:489)
    at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:455)
    at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:98)
    at org.apache.storm.utils.Utils$2.run(Utils.java:335)
    ... 1 more
Caused by: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /errors/fastwc-halferrors-1-1501689263/count/e0000000562
    at org.apache.storm.utils.Utils.wrapInRuntime(Utils.java:413)
    at org.apache.storm.zookeeper.ClientZookeeper.deleteNode(ClientZookeeper.java:165)
    at org.apache.storm.cluster.ZKStateStorage.delete_node(ZKStateStorage.java:139)
    at org.apache.storm.cluster.StormClusterStateImpl.reportError(StormClusterStateImpl.java:655)
    at org.apache.storm.executor.error.ReportError.report(ReportError.java:69)
    at org.apache.storm.executor.bolt.BoltOutputCollectorImpl.reportError(BoltOutputCollectorImpl.java:154)
    at org.apache.storm.task.OutputCollector.reportError(OutputCollector.java:234)
    at org.apache.storm.topology.BasicOutputCollector.reportError(BasicOutputCollector.java:70)
    at org.apache.storm.starter.FastWordCountTopology$WordCount.execute(FastWordCountTopology.java:113)
    at org.apache.storm.topology.BasicBoltExecutor.execute(BasicBoltExecutor.java:50)
    at org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:125)
    at org.apache.storm.executor.Executor.onEvent(Executor.java:255)
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:476)
    ... 4 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /errors/fastwc-halferrors-1-1501689263/count/e0000000562
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
    at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:250)
    at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:244)
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:241)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:225)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:35)
    at org.apache.storm.zookeeper.ClientZookeeper.deleteNode(ClientZookeeper.java:158)
    ... 15 more
{code}
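
The race described above can be reproduced without a ZooKeeper cluster. The following is only a minimal sketch under stated assumptions, not the Storm code and not necessarily how this issue should be fixed: FakeStore, NoNodeException, deleteIfExists and trimToLatest are hypothetical names, and the store is an in-memory stand-in for the children of one errors path. It shows two executors taking the same snapshot of children and both trying to trim to the latest 10, with an "already deleted" result treated as success instead of being rethrown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

public class TolerantErrorCleanup {

    /** Hypothetical stand-in for ZooKeeper's KeeperException.NoNodeException. */
    static class NoNodeException extends Exception {
        NoNodeException(String path) {
            super("NoNode for " + path);
        }
    }

    /** Hypothetical in-memory stand-in for the children of one errors path. */
    static class FakeStore {
        private final Set<String> children = new ConcurrentSkipListSet<>();

        void create(String name) {
            children.add(name);
        }

        /** Returns a snapshot; it can be stale by the time the caller acts on it. */
        List<String> getChildren() {
            return new ArrayList<>(children);
        }

        /** Like a strict delete: fails if the node is already gone. */
        void delete(String name) throws NoNodeException {
            if (!children.remove(name)) {
                throw new NoNodeException(name);
            }
        }
    }

    /** Deletes a node; returns false instead of failing if someone else already removed it. */
    static boolean deleteIfExists(FakeStore store, String name) {
        try {
            store.delete(name);
            return true;
        } catch (NoNodeException e) {
            // Another executor won the race; the node is gone, which is the desired end state.
            return false;
        }
    }

    /** Trims the store so that at most `keep` of the newest names in the snapshot remain. */
    static int trimToLatest(FakeStore store, List<String> snapshot, int keep) {
        List<String> sorted = new ArrayList<>(snapshot);
        Collections.sort(sorted); // zero-padded sequential names sort oldest first
        int deleted = 0;
        for (int i = 0; i < sorted.size() - keep; i++) {
            if (deleteIfExists(store, sorted.get(i))) {
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        for (int i = 0; i < 12; i++) {
            store.create(String.format("e%010d", i));
        }

        // Two executors read the same children before either one deletes anything.
        List<String> snapshotA = store.getChildren();
        List<String> snapshotB = store.getChildren();

        int deletedByA = trimToLatest(store, snapshotA, 10);
        int deletedByB = trimToLatest(store, snapshotB, 10); // would throw with a strict delete

        System.out.println(deletedByA + " " + deletedByB + " " + store.getChildren().size());
    }
}
```

With a strict delete, the second trimToLatest call would fail on its first znode, which matches the failure in the stack trace above. Swallowing only the "no node" case keeps other errors visible.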


  was:
When [StormClusterStateImpl reportError function|https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/cluster/StormClusterStateImpl.java#L652-L660] is called, it will get all the children of 
{code:java}
/storm/errors/<topo-id>/count/
{code}
 and delete some znodes to keep latest 10 errors. NoNodeException could happen when any znode is already deleted by other executors.

{code:java}
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /errors/fastwc-halferrors-1-1501689263/count/e0000000562
    at org.apache.storm.utils.Utils$2.run(Utils.java:345)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /errors/fastwc-halferrors-1-1501689263/count/e0000000562
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:489)
    at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:455)
    at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:98)
    at org.apache.storm.utils.Utils$2.run(Utils.java:335)
    ... 1 more
Caused by: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /errors/fastwc-halferrors-1-1501689263/count/e0000000562
    at org.apache.storm.utils.Utils.wrapInRuntime(Utils.java:413)
    at org.apache.storm.zookeeper.ClientZookeeper.deleteNode(ClientZookeeper.java:165)
    at org.apache.storm.cluster.ZKStateStorage.delete_node(ZKStateStorage.java:139)
    at org.apache.storm.cluster.StormClusterStateImpl.reportError(StormClusterStateImpl.java:655)
    at org.apache.storm.executor.error.ReportError.report(ReportError.java:69)
    at org.apache.storm.executor.bolt.BoltOutputCollectorImpl.reportError(BoltOutputCollectorImpl.java:154)
    at org.apache.storm.task.OutputCollector.reportError(OutputCollector.java:234)
    at org.apache.storm.topology.BasicOutputCollector.reportError(BasicOutputCollector.java:70)
    at org.apache.storm.starter.FastWordCountTopology$WordCount.execute(FastWordCountTopology.java:113)
    at org.apache.storm.topology.BasicBoltExecutor.execute(BasicBoltExecutor.java:50)
    at org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:125)
    at org.apache.storm.executor.Executor.onEvent(Executor.java:255)
    at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:476)
    ... 4 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /errors/fastwc-halferrors-1-1501689263/count/e0000000562
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
    at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:250)
    at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:244)
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:241)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:225)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:35)
    at org.apache.storm.zookeeper.ClientZookeeper.deleteNode(ClientZookeeper.java:158)
    ... 15 more
{code}



> NoNodeException when ZooKeeper tries to delete nodes
> ----------------------------------------------------
>
>                 Key: STORM-2674
>                 URL: https://issues.apache.org/jira/browse/STORM-2674
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Ethan Li



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)