Posted to user@storm.apache.org by Soumi <so...@gmail.com> on 2015/07/16 02:22:22 UTC

storm bolt stops processing after about 2500 tuples

Hello storm users,

We have a simple topology with one kafka-spout and a series of 5 bolts.

The first bolt receives ~3K tuples/second from the kafka-spout and emits only
~500/second. The last bolt processes very few tuples (~100 an hour) and
sends its output to kafka. The execute latencies in all bolts are very low.


The issue we are seeing:

After the last bolt processes about 2500 tuples, it stops receiving
anything. So all the tuples emitted by the previous bolt do not
get acked, and they show up as failed messages in the kafka-spout. The
kafka-spout then starts replaying these failed tuples, and they keep failing
every time because the last bolt is not processing anything at all.


If we restart the topology, it starts processing normally until the last
bolt again hits around 2500 tuples. This has happened 2-3 times, and every
time the number of messages processed before it stops is around 2500.


In all the bolts we are acking all tuples (catching Throwable everywhere,
logging it, and acking the tuple). We don’t see any failures in any of
the bolts.
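
The ack-everything pattern described above looks roughly like this (a sketch
against the pre-1.0 `backtype.storm` API; `AlwaysAckBolt` and `process` are
placeholder names, not code from the actual topology):

```java
import java.util.Map;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

public class AlwaysAckBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple tuple) {
        try {
            process(tuple); // application logic
        } catch (Throwable t) {
            // log and swallow, so the tuple is never explicitly failed
            System.err.println("processing error: " + t);
        } finally {
            collector.ack(tuple); // ack unconditionally, as described above
        }
    }

    private void process(Tuple tuple) { /* ... */ }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) { }
}
```

Note that even with this pattern, a tuple that never reaches `execute()` is
still failed by timeout, which matches the replay symptom described here.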

Has anyone ever seen anything similar? What could be causing this last bolt
to not receive any tuples at all after a certain number of messages?


Thanks,

Soumi

Re: storm bolt stops processing after about 2500 tuples

Posted by Soumi <so...@gmail.com>.
Hi Harsha,

It's not that the last bolt is slow. After it processes around 2500 tuples,
it does not receive any more. The 2nd-last bolt is still emitting, but the
last bolt is not getting any more tuples. The question is: what happens to the
tuples emitted by the 2nd-last bolt? Why do they not reach the last one?
And why does that happen after around 2500 tuples?





On Wed, Jul 15, 2015 at 10:00 PM Harsha <st...@harsha.io> wrote:

>  soumi,
>        if your downstream bolt doesn't ack before the tuple timeout (by
> default it's 30 secs), storm will consider it a failed tuple and the kafka
> spout will replay those. Since your last bolt is slower in acking, maybe
> you shouldn't anchor the tuple to the last bolt.
>
> -harsha
> On Wed, Jul 15, 2015, at 05:22 PM, Soumi wrote:
>
> Hello storm users,
>
> We have a simple topology with one kafka-spout and a series of 5 bolts.
>
> The first bolt receives ~3K tuples/second from the kafka-spout and emits only
> ~500/second. The last bolt processes very few tuples (~100 an hour) and
> sends its output to kafka. The execute latencies in all bolts are very low.
>
>
> The issue we are seeing:
>
> After the last bolt processes about 2500 tuples, it stops receiving
> anything. So all the tuples emitted by the previous bolt do not get acked,
> and they show up as failed messages in the kafka-spout. The kafka-spout
> then starts replaying these failed tuples, and they keep failing every
> time because the last bolt is not processing anything at all.
>
>
> If we restart the topology, it starts processing normally until the last
> bolt again hits around 2500 tuples. This has happened 2-3 times, and every
> time the number of messages processed before it stops is around 2500.
>
>
> In all the bolts we are acking all tuples (catching Throwable everywhere,
> logging it, and acking the tuple). We don’t see any failures in any of
> the bolts.
>
> Has anyone ever seen anything similar? What could be causing this last
> bolt to not receive any tuples at all after a certain number of messages?
>
>
> Thanks,
>
> Soumi
>
>
>

Re: storm bolt stops processing after about 2500 tuples

Posted by Harsha <st...@harsha.io>.
soumi,
       if your downstream bolt doesn't ack before the tuple timeout (by
default it's 30 secs), storm will consider it a failed tuple and the kafka
spout will replay those. Since your last bolt is slower in acking, maybe
you shouldn't anchor the tuple to the last bolt.

-harsha
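
The two levers Harsha mentions, anchoring and the tuple timeout, can be
sketched as follows (pre-1.0 `backtype.storm` API; `EmitExamples` and
`result` are placeholder names, not code from this topology):

```java
import backtype.storm.Config;
import backtype.storm.task.OutputCollector;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

class EmitExamples {
    void emitFrom(OutputCollector collector, Tuple input, Object result) {
        // Anchored: the emitted tuple joins the input's tuple tree, so if it
        // isn't acked within the timeout, the spout replays the root tuple.
        collector.emit(input, new Values(result));

        // Unanchored: the emitted tuple is not tracked at all, so a stuck
        // downstream bolt cannot cause kafka-spout replays (at the cost of
        // losing the delivery guarantee for that branch).
        collector.emit(new Values(result));
    }

    Config withLongerTimeout() {
        Config conf = new Config();
        conf.setMessageTimeoutSecs(60); // default is 30 seconds
        return conf;
    }
}
```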