You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by "James Xu (JIRA)" <ji...@apache.org> on 2013/12/14 08:35:07 UTC
[jira] [Created] (STORM-80) NPE caused by TridentBoltExecutor
reusing TrackedBatches between batch groups
James Xu created STORM-80:
-----------------------------
Summary: NPE caused by TridentBoltExecutor reusing TrackedBatches between batch groups
Key: STORM-80
URL: https://issues.apache.org/jira/browse/STORM-80
Project: Apache Storm (Incubating)
Issue Type: Bug
Reporter: James Xu
https://github.com/nathanmarz/storm/issues/421
I'm seeing intermittent errors caused by SubtopologyBolt.execute being called with a BatchInfo whose ProcessorContext is set up for a different Batch Group. In particular I'm seeing null pointer exceptions from PartitionPersistProcessor because its state fields were never set up correctly.
The best I can tell the id key (IBatchID) being used for the _batches map in TridentBoltExecutor is not unique between batch groups. As a result the tracked batch will have been initialized for a different Batch Group and set of processors.
I hoped to be able to track down the source of this issue but can't determine where the BatchIDs are being added to the tuples.
If it matters, my topology has two streams each reading from their own OpaqueTransactionalKafka spout w/different topics.
Backtrace:
65108 [Thread-25] ERROR backtype.storm.daemon.executor -
java.lang.RuntimeException: java.lang.NullPointerException
at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:87) ~[storm-0.9.0-wip4.jar:na]
at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:58) ~[storm-0.9.0-wip4.jar:na]
at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62) ~[storm-0.9.0-wip4.jar:na]
at backtype.storm.daemon.executor$fn__3551$fn__3563$fn__3610.invoke(executor.clj:712) ~[storm-0.9.0-wip4.jar:na]
at backtype.storm.util$async_loop$fn__436.invoke(util.clj:377) ~[storm-0.9.0-wip4.jar:na]
at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na]
at java.lang.Thread.run(Thread.java:722) [na:1.7.0_09]
Caused by: java.lang.NullPointerException: null
at storm.trident.planner.processor.PartitionPersistProcessor.execute(PartitionPersistProcessor.java:59) ~[storm-0.9.0-wip4.jar:na]
at storm.trident.planner.SubtopologyBolt$InitialReceiver.receive(SubtopologyBolt.java:189) ~[storm-0.9.0-wip4.jar:na]
at storm.trident.planner.SubtopologyBolt.execute(SubtopologyBolt.java:129) ~[storm-0.9.0-wip4.jar:na]
at storm.trident.topology.TridentBoltExecutor.execute(TridentBoltExecutor.java:352) ~[storm-0.9.0-wip4.jar:na]
at backtype.storm.daemon.executor$fn__3551$tuple_action_fn__3553.invoke(executor.clj:607) ~[storm-0.9.0-wip4.jar:na]
at backtype.storm.daemon.executor$mk_task_receiver$fn__3474.invoke(executor.clj:379) ~[storm-0.9.0-wip4.jar:na]
at backtype.storm.disruptor$clojure_handler$reify__3011.onEvent(disruptor.clj:43) ~[storm-0.9.0-wip4.jar:na]
at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:84) ~[storm-0.9.0-wip4.jar:na]
... 6 common frames omitted
Also, I'm only seeing this in LocalCluster mode, not in production.
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)