You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by "Rick Kellogg (JIRA)" <ji...@apache.org> on 2015/10/09 02:08:26 UTC

[jira] [Updated] (STORM-80) NPE caused by TridentBoltExecutor reusing TrackedBatches between batch groups

     [ https://issues.apache.org/jira/browse/STORM-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kellogg updated STORM-80:
------------------------------
    Component/s: storm-core

> NPE caused by TridentBoltExecutor reusing TrackedBatches between batch groups
> -----------------------------------------------------------------------------
>
>                 Key: STORM-80
>                 URL: https://issues.apache.org/jira/browse/STORM-80
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>            Reporter: James Xu
>
> https://github.com/nathanmarz/storm/issues/421
> I'm seeing intermittent errors caused by SubtopologyBolt.execute being called with a BatchInfo whose ProcessorContext is set up for a different Batch Group. In particular I'm seeing null pointer exceptions from PartitionPersistProcessor because its state fields were never set up correctly.
> The best I can tell the id key (IBatchID) being used for the _batches map in TridentBoltExecutor is not unique between batch groups. As a result the tracked batch will have been initialized for a different Batch Group and set of processors.
> I hoped to be able to track down the source of this issue but can't determine where the BatchIDs are being added to the tuples.
> If it matters, my topology has two streams each reading from their own OpaqueTransactionalKafka spout w/different topics.
> Backtrace:
> 65108 [Thread-25] ERROR backtype.storm.daemon.executor - 
> java.lang.RuntimeException: java.lang.NullPointerException
>         at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:87) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:58) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.daemon.executor$fn__3551$fn__3563$fn__3610.invoke(executor.clj:712) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.util$async_loop$fn__436.invoke(util.clj:377) ~[storm-0.9.0-wip4.jar:na]
>         at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na]
>         at java.lang.Thread.run(Thread.java:722) [na:1.7.0_09]
> Caused by: java.lang.NullPointerException: null
>         at storm.trident.planner.processor.PartitionPersistProcessor.execute(PartitionPersistProcessor.java:59) ~[storm-0.9.0-wip4.jar:na]
>         at storm.trident.planner.SubtopologyBolt$InitialReceiver.receive(SubtopologyBolt.java:189) ~[storm-0.9.0-wip4.jar:na]
>         at storm.trident.planner.SubtopologyBolt.execute(SubtopologyBolt.java:129) ~[storm-0.9.0-wip4.jar:na]
>         at storm.trident.topology.TridentBoltExecutor.execute(TridentBoltExecutor.java:352) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.daemon.executor$fn__3551$tuple_action_fn__3553.invoke(executor.clj:607) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.daemon.executor$mk_task_receiver$fn__3474.invoke(executor.clj:379) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.disruptor$clojure_handler$reify__3011.onEvent(disruptor.clj:43) ~[storm-0.9.0-wip4.jar:na]
>         at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:84) ~[storm-0.9.0-wip4.jar:na]
>         ... 6 common frames omitted
> Also, I'm only seeing this in LocalCluster mode, not in production.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)