Posted to user@storm.apache.org by Nitin Gupta <Ni...@e-zest.in> on 2016/04/06 09:34:47 UTC

Storm Topology Hanging

Dear All,

I am facing an issue wherein one of my topologies hangs after 14-16 hours.

From the Storm UI I can see the following statistics:

* Number of records emitted: 4000
* Number of records acked: 3180
* Number of records failed: 1140
* The spout latency is 10913 milliseconds
* I have configured Max Spout Pending to the number of executors: 40 in my case
* I have only one spout

From various blogs on this topic, I understand that if the value of Max Spout Pending is not set properly it can result in such issues.
Can someone please guide me on what the correct value should be in my case?

Thanks & Regards,
Nitin Gupta



Re: Storm Topology Hanging

Posted by jinhong lu <lu...@gmail.com>.
Try checking the Kafka logs from the moment the topology began to hang, along with the load on the Kafka cluster at that moment. I once ran into a situation where Kafka was too busy to respond, which caused the Storm topology to hang, especially when using a transactional topology.

lujinhong

> On April 7, 2016, at 13:58, Nitin Gupta <Ni...@e-zest.in> wrote:
> 
> found


RE: Storm Topology Hanging

Posted by Nitin Gupta <Ni...@e-zest.in>.
Hi Alex,

Thanks for the guidance!

Is there any formula to determine what the ideal Max Spout Pending should be, based on the number of tuples per second and the latency? I read some blogs where the same problem occurred because the max spout pending was too high.

The worker log doesn't show any messages about why the KafkaSpout has stopped picking up messages from the queue. Is there any other place where we can get more information on this?

Thanks & Regards,
Nitin Gupta
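
A rough way to reason about the question above is Little's law: the number of tuples in flight at steady state is about the arrival rate multiplied by the time each tuple spends in the topology. The sketch below is only a rule of thumb, not an official Storm formula. The class name, the method name, and the headroom factor are hypothetical, and the 5 tuples per second rate is an invented example value; only the 10913 ms latency comes from this thread.

```java
// Rule-of-thumb sketch based on Little's law; not an official Storm formula.
final class MaxSpoutPendingEstimate {

    /**
     * Estimates a starting value for max spout pending.
     *
     * @param tuplesPerSecond   rate at which the spout is expected to emit (assumed input)
     * @param completeLatencyMs average time from emit to ack, in milliseconds
     * @param headroom          multiplier >= 1.0 that leaves room for bursts (assumption)
     */
    static long estimate(double tuplesPerSecond, double completeLatencyMs, double headroom) {
        if (tuplesPerSecond < 0 || completeLatencyMs < 0 || headroom < 1.0) {
            throw new IllegalArgumentException("rate and latency must be >= 0, headroom >= 1.0");
        }
        // Little's law: in-flight tuples = arrival rate * time in system.
        double inFlight = tuplesPerSecond * (completeLatencyMs / 1000.0);
        // Never return less than 1, otherwise the spout could not emit at all.
        return Math.max(1L, (long) Math.ceil(inFlight * headroom));
    }

    public static void main(String[] args) {
        // 10913 ms is the latency reported in this thread; 5 tuples/s is a made-up rate.
        System.out.println(estimate(5.0, 10913.0, 1.5));
    }
}
```

The setting applies per spout task, so the result is best treated as a starting point to tune against the statistics in the Storm UI rather than as a fixed answer.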




RE: Storm Topology Hanging

Posted by Alexander T <mi...@gmail.com>.
Hi Nitin,

You might get problems with max spout pending if you delay acking of one
message until another arrives. In this case having a low max spout pending
can cause the buffers to fill before the awaited message arrives. But this
type of deadlock should cause a message timeout. If you have disabled
message timeout, however, you could get a deadlock without symptoms.

Another possible problem is auth failure. I don't know how the Kafka spout
works, but the Kinesis spout which we use is completely silent if it fails
to connect. Very annoying.

Best regards,
Alexander
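
The deadlock described in this message can be illustrated with a small simulation. This is a toy model, not Storm code: it assumes a bolt that holds each tuple un-acked until a second tuple with the same key arrives, a spout that may emit only while fewer than maxPending tuples are un-acked, and no message timeout. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of "delay acking one message until another arrives"; not Storm code.
final class PairedAckSimulation {

    /**
     * Returns true if every tuple is eventually acked, and false if the spout
     * stalls (or a tuple is left without a partner). No timeout is modelled.
     */
    static boolean completes(List<String> keys, int maxPending) {
        Set<String> waiting = new HashSet<>(); // keys held by the bolt, not yet acked
        int pending = 0;                       // emitted but un-acked tuples
        for (String key : keys) {
            if (pending >= maxPending) {
                // The spout is blocked, and every pending tuple is waiting for a
                // partner that can no longer be emitted: a silent deadlock.
                return false;
            }
            pending++;                         // spout emits the tuple
            if (waiting.remove(key)) {
                pending -= 2;                  // partner was waiting: ack both tuples
            } else {
                waiting.add(key);              // hold this tuple until its partner shows up
            }
        }
        return pending == 0;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("A", "B", "C", "A", "B", "C");
        System.out.println(completes(keys, 3)); // stalls: A, B and C fill the window
        System.out.println(completes(keys, 4)); // completes: the second A fits and frees two slots
    }
}
```

In a real topology with a message timeout enabled, the stalled tuples would eventually fail and be replayed, which matches the point that the deadlock only stays silent when the timeout is disabled.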

RE: Storm Topology Hanging

Posted by Nitin Gupta <Ni...@e-zest.in>.
I checked the worker log files and found no specific exception. It looks like the KafkaSpout just stops picking up records from the Kafka queue. Any pointers?




RE: Storm Topology Hanging

Posted by Nitin Gupta <Ni...@e-zest.in>.
Thanks for the response!


* The Storm version is 0.9.4.
* I have set the message timeout to 300 seconds.
* Yes, I also think the latency is too high. The bolt is doing too much work there. It is basically requesting documents for a user from multiple sources (Google Drive, Box, Dropbox, etc.). I can split the work into 3 topologies, but I wanted to know whether latency is the root cause of the hang.

I will check if the worker stacktrace can be shared.

Thanks !

Nitin Gupta




Re: Storm Topology Hanging

Posted by Jungtaek Lim <ka...@gmail.com>.
Hi Nitin,

A small max spout pending value decreases throughput, but it should not
cause a hang.
Which version of Storm do you use, and could you share the stack traces of
the workers if you don't mind?

Btw, your spout latency is a bit high (10s). What value did you set for
message timeout secs?

Thanks,
Jungtaek Lim (HeartSaVioR)
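
For reference, the two settings discussed in this message correspond to the configuration keys `topology.max.spout.pending` and `topology.message.timeout.secs`. The sketch below uses a plain `Map` in place of Storm's `Config` class so that it compiles without Storm on the classpath; in an actual topology one would normally call `Config.setMaxSpoutPending(...)` and `Config.setMessageTimeoutSecs(...)`. The class name, the helper methods, and the safety factor are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a plain Map stands in for Storm's Config class.
final class TopologyConfSketch {

    static final String MAX_SPOUT_PENDING = "topology.max.spout.pending";
    static final String MESSAGE_TIMEOUT_SECS = "topology.message.timeout.secs";

    /** Builds a configuration map holding the two settings from this thread. */
    static Map<String, Object> build(int maxSpoutPending, int messageTimeoutSecs) {
        Map<String, Object> conf = new HashMap<>();
        conf.put(MAX_SPOUT_PENDING, maxSpoutPending);
        conf.put(MESSAGE_TIMEOUT_SECS, messageTimeoutSecs);
        return conf;
    }

    /**
     * Returns true when the average latency, multiplied by a safety factor
     * (an assumption, not a Storm setting), reaches the message timeout.
     */
    static boolean latencyTooCloseToTimeout(double completeLatencyMs, int timeoutSecs, double factor) {
        return completeLatencyMs * factor >= timeoutSecs * 1000.0;
    }

    public static void main(String[] args) {
        // Values reported in this thread: max spout pending 40, timeout 300 s, latency 10913 ms.
        Map<String, Object> conf = build(40, 300);
        System.out.println(conf);
        System.out.println(latencyTooCloseToTimeout(10913.0, 300, 3.0));
    }
}
```

With the numbers from this thread, the average latency sits well below the 300-second timeout, so the average alone would not explain the failed tuples; individual slow tuples or explicit failures in the bolt would still need to be checked.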

