You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@storm.apache.org by Sam Mati <sm...@appnexus.com> on 2014/10/22 00:16:45 UTC

What happens when a tuple times out?

Hi all.  First time playing around with Storm.

I've set up a dummy topology to see how timeouts, ack, and fail work.  I have a spout that emits random words (it chooses a word that has previously failed, and if there are none, a random word). I have a bolt that, depending on Random.nextBoolean(), either acknowledges the tuple, or sleeps for 5 seconds.  I've set a timeout to 10 seconds, and max pending to 5.  Thus, it is very likely that tuples will timeout.

I'm seeing that when tuples timeout, the Spout gets notified via fail() being called — however my Bolt still ends up executing those tuples long after the timeout.  Additionally, if the Bolt acks the tuple, there is no effect.  Eventually, my Bolt ends up exclusively processing already-timed-out tuples, and everything fails.

I'm assuming that each Bolt/Worker/Task/Executor (I'm not sure which) has a queue, and that timeouts do not clear tuples from these queues.

Is there a way to prevent already-failed tuples from being processed by the Bolt?

Thanks,
-Sam