Posted to user@storm.apache.org by Fang Chen <fc...@gmail.com> on 2015/04/07 01:31:44 UTC

how is storm complete latency spent?

Hi experts,

I ran a simple experiment to understand how to tune a Storm topology for
production, but was puzzled by the results for complete latency (the
average time a tuple tree takes to finish).

I used a simple bolt that does nothing but sleep for a period of time and
then ack the input tuple, which comes from a Kafka spout. For simplicity, I
limited the worker count to 1 (the acker, spout, and bolt counts are all 1
too) to make sure the spout, bolt, and acker all sit in the same worker.
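
To make the setup concrete, here is a minimal sketch of the settings
expressed as a plain map of Storm config keys. The class and method names
(ExperimentConf, build) are made up for illustration and are not my actual
topology code; the three key strings are the standard Storm config names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: collects the experiment's topology settings.
class ExperimentConf {
    // maxSpoutPending is the knob that is varied between runs.
    static Map<String, Object> build(int maxSpoutPending) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("topology.workers", 1);          // one worker process
        conf.put("topology.acker.executors", 1);  // one acker
        conf.put("topology.max.spout.pending", maxSpoutPending);
        return conf;
    }

    public static void main(String[] args) {
        // Print the settings for the first run (max spout pending = 300).
        System.out.println(build(300));
    }
}
```

In a real topology the same entries would go into the Storm Config object
(itself a map) that is passed in when the topology is submitted.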

First, I set the sleep time to 0 (no sleep) and max spout pending to 300;
the complete latency was around 10 ms. All seemed good and reasonable so
far. After I changed max spout pending to 1, however, the latency suddenly
spiked to 900 ms.

I then changed the sleep time to 1 ms. With max spout pending at 300, the
latency also spiked, to around 600 ms, while with max spout pending at 1
the latency hovered over 900 ms.
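
As a rough sanity check on these numbers, the throughput implied by each
run can be estimated with Little's law (throughput = tuples in flight /
average time in the system). This is only a sketch: it assumes the
topology is in steady state and that the number of in-flight tuples stays
at the max spout pending limit, which I have not verified. The class and
method names are made up for illustration.

```java
// Hypothetical helper: implied throughput from pending count and latency.
class ImpliedThroughput {
    // Little's law, with latency given in milliseconds.
    static double tuplesPerSecond(int pending, double latencyMs) {
        return pending * 1000.0 / latencyMs;
    }

    public static void main(String[] args) {
        // {sleep ms, max spout pending, observed complete latency ms}
        // The 900 ms entries are approximate (latency hovered over 900 ms).
        double[][] runs = {
            {0, 300, 10},
            {0, 1, 900},
            {1, 300, 600},
            {1, 1, 900},
        };
        for (double[] r : runs) {
            System.out.printf("sleep=%.0fms pending=%.0f latency=%.0fms -> %.1f tuples/s%n",
                    r[0], r[1], r[2], tuplesPerSecond((int) r[1], r[2]));
        }
    }
}
```

Under those assumptions, the runs with max spout pending at 1 work out to
only about one tuple per second, versus about 30,000 per second for the
first run.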

It seems as though Storm internally does some tuple batching and waiting,
based on both size and time, when moving tuples from one queue to another.
Is this expected? Or is it related to the disruptor queue handling?

Where should I look if I want to customize this behavior? I am not
familiar with Clojure yet, so any pointers are much appreciated!

Thanks a lot,
Fang