
Posted to user@storm.apache.org by Margus Roo <ma...@roo.ee> on 2015/01/19 17:38:24 UTC

Where to find reason why message fails in topology

Hi

I have a simple topology: a Kafka consumer (spout) and an HBase writer (bolt).

Sometimes some messages get a failed status. Most of the time the topology
works well, but I'd like to know the exact reason why a message fails.
As far as I have read, a message fails when it hits the timeout (default 30s).
But this is not enough; I need to know where the bottleneck is to improve it.
The timeouts appear on the spout. As far as I understand, the spout gets an
ack when a message has been fully processed by the topology.
So the question is: where to dig?

-- 
Margus (margusja) Roo
http://margus.roo.ee
skype: margusja
+372 51 480

Re: Where to find reason why message fails in topology

Posted by Kosala Dissanayake <um...@gmail.com>.

You seem to have a problem with bolt 2.

Complete latency is the time it takes for a tuple to be completely
processed by the topology (you can see this description by hovering for a
moment over the column heading in the Storm UI).

Try increasing the parallelism of the bolt and see if that helps with the
problem. I suspect, though, that this bolt has some kind of memory leak or
growing memory usage as the stream goes on, which would explain why it
slows down only after some time has passed.

Also check the utilization (especially CPU) on the spout's worker machine
to see whether it is being starved of resources, which would make it slow
to process the acks it receives.
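
To illustrate why complete latency on the spout can be large while the
execute latency on each bolt is small, here is a rough back-of-the-envelope
sketch in Python. It is only a model, not anything Storm computes: it
assumes every in-flight tuple is queued in front of one saturated bolt and
that the bolt's tasks share the backlog evenly. The function name, the
pending counts and the 2 ms execute latency are hypothetical; the four
tasks and the 30 s timeout are the figures mentioned in this thread.

```python
def estimated_complete_latency_ms(pending_tuples, tasks, execute_latency_ms):
    """Rough time for a newly emitted tuple to drain through a saturated bolt.

    Assumes all pending tuples wait on this one bolt and that its tasks
    split the backlog evenly (a simplification, not Storm's own metric).
    """
    if tasks <= 0 or execute_latency_ms <= 0:
        raise ValueError("tasks and execute_latency_ms must be positive")
    # Tuples the bolt can finish per millisecond across all of its tasks.
    throughput_per_ms = tasks / execute_latency_ms
    return pending_tuples / throughput_per_ms


TIMEOUT_MS = 30_000  # the 30 s default timeout mentioned in the thread

# Hypothetical backlog sizes; 4 tasks as in the HBase bolt described above.
for pending in (1_000, 20_000, 100_000):
    latency = estimated_complete_latency_ms(pending, tasks=4,
                                            execute_latency_ms=2.0)
    status = "would time out" if latency > TIMEOUT_MS else "ok"
    print(f"{pending:>7} pending -> ~{latency:,.0f} ms ({status})")
```

Under these assumptions the wait grows with the backlog, not with the
per-tuple execute time, so a tuple can fail by timeout even though each
bolt looks fast. Raising the bolt's parallelism, or capping the number of
in-flight tuples (Storm has a topology.max.spout.pending setting for that),
both shrink the estimate.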

On Wed, Jan 21, 2015 at 4:16 AM, Margus Roo <ma...@roo.ee> wrote:

>  When I start the topology it looks fine. This is after 10 minutes of running:
>
> http://margus.roo.ee/wp-content/uploads/2015/01/Screenshot-2015-01-20-19.15.00.png

Re: Where to find reason why message fails in topology

Posted by Margus Roo <ma...@roo.ee>.

When I start the topology it looks fine. This is after 10 minutes of running:
http://margus.roo.ee/wp-content/uploads/2015/01/Screenshot-2015-01-20-19.15.00.png

Margus (margusja) Roo
http://margus.roo.ee
skype: margusja
+372 51 480

On 20/01/15 19:04, Margus Roo wrote:
> Hi
>
> Here we can see that there are some failed messages on the spout, and
> the Capacity of bolt2 is close to 1.
> http://margus.roo.ee/wp-content/uploads/2015/01/Screenshot-2015-01-20-18.54.08.png
>
> Four tasks are writing messages to HBase.
> http://margus.roo.ee/wp-content/uploads/2015/01/Screenshot-2015-01-20-18.54.31.png
>
> Another question: in the first picture I can see quite a big number
> under Complete latency for the spout.
>
> But the numbers for the bolts are quite small. Where does that complete
> latency come from?

Re: Where to find reason why message fails in topology

Posted by Margus Roo <ma...@roo.ee>.

Hi

Here we can see that there are some failed messages on the spout, and
the Capacity of bolt2 is close to 1.
http://margus.roo.ee/wp-content/uploads/2015/01/Screenshot-2015-01-20-18.54.08.png

Four tasks are writing messages to HBase.
http://margus.roo.ee/wp-content/uploads/2015/01/Screenshot-2015-01-20-18.54.31.png

Another question: in the first picture I can see quite a big number under
Complete latency for the spout.

But the numbers for the bolts are quite small. Where does that complete
latency come from?

Margus (margusja) Roo
http://margus.roo.ee
skype: margusja
+372 51 480

On 20/01/15 01:06, Kosala Dissanayake wrote:
> Hi Margus,
>
> See which bolts have high 'Capacity' values in the Storm UI, and 
> whether any of those are close to / above 1 to get a clue about where 
> the bottleneck might be.

Re: Where to find reason why message fails in topology

Posted by Kosala Dissanayake <um...@gmail.com>.

Hi Margus,

See which bolts have high 'Capacity' values in the Storm UI, and whether
any of those are close to / above 1 to get a clue about where the
bottleneck might be.
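
For reference, the Storm UI tooltip describes Capacity as (number executed
* average execute latency) / measurement time. A minimal Python sketch of
that formula follows; the function name and the sample numbers are
hypothetical, and the 600 s window is an assumption matching the UI's
10-minute view.

```python
def bolt_capacity(executed, execute_latency_ms, window_secs=600):
    """Fraction of the measurement window the bolt spent executing tuples."""
    if window_secs <= 0:
        raise ValueError("window_secs must be positive")
    # Total busy time in ms divided by the window length in ms.
    return (executed * execute_latency_ms) / (window_secs * 1000.0)


# Hypothetical figures read off the UI for two bolts over the last 10 minutes.
busy = bolt_capacity(executed=290_000, execute_latency_ms=2.0)
quiet = bolt_capacity(executed=30_000, execute_latency_ms=2.0)
print(f"busy bolt capacity:  {busy:.2f}")
print(f"quiet bolt capacity: {quiet:.2f}")
```

A value near 1 means the bolt was busy for almost the whole window, so it
is a likely bottleneck and a candidate for more parallelism.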



On Tue, Jan 20, 2015 at 3:38 AM, Margus Roo <ma...@roo.ee> wrote:

> Hi
>
> I have a simple topology: a Kafka consumer (spout) and an HBase writer (bolt).
>
> Sometimes some messages get a failed status. Most of the time the topology
> works well, but I'd like to know the exact reason why a message fails. As far
> as I have read, a message fails when it hits the timeout (default 30s). But
> this is not enough; I need to know where the bottleneck is to improve it.
> The timeouts appear on the spout. As far as I understand, the spout gets an
> ack when a message has been fully processed by the topology.
> So the question is: where to dig?