Posted to user@storm.apache.org by Kutlu Araslı <ku...@gmail.com> on 2015/05/12 13:22:48 UTC

Storm metrics under heavy load

Hi everyone,

Our topology consumes tuples from a Kestrel MQ and runs a series of bolts
to process each item, including some database connections. The Storm version
is 0.8.3 and the supervisors run on VMs.
When the number of tuples in the queue grows, we observe that the execution
time of a single tuple also rises dramatically in parallel, which ends up
throttling the topology.
In the meantime, CPU and memory usage look comfortable. From the database
side, we have not observed any problem under stress so far.
Is there any configuration trick or advice for handling such a load?
MAX_SPOUT_PENDING is already limited to 32.
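
For reference, the cap is set through the topology configuration. A minimal
sketch of how such a limit is typically wired up (the spout and bolt classes
below are placeholders, not our actual code):

    import backtype.storm.Config;
    import backtype.storm.StormSubmitter;
    import backtype.storm.topology.TopologyBuilder;

    public class KestrelTopologySketch {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            // MyKestrelSpout / MyProcessingBolt stand in for the real spout and bolts.
            builder.setSpout("kestrel-spout", new MyKestrelSpout());
            builder.setBolt("process-bolt", new MyProcessingBolt())
                   .shuffleGrouping("kestrel-spout");

            Config conf = new Config();
            // Caps the number of un-acked tuples per spout task (topology.max.spout.pending).
            conf.setMaxSpoutPending(32);
            StormSubmitter.submitTopology("kestrel-topology", conf, builder.createTopology());
        }
    }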

Thanks,

Re: Storm metrics under heavy load

Posted by Nathan Leung <nc...@gmail.com>.
To elaborate: I would consider something like 50 or 60 to avoid tuple
timeouts, unless you have increased the timeout above the default.
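
In configuration terms, those are the two knobs to keep consistent. A quick
sketch with illustrative values (not a tested recommendation for this topology):

    import backtype.storm.Config;

    public class PendingAndTimeoutSketch {
        // Keep max spout pending small enough that the slowest pending tuple
        // still completes inside the message timeout.
        public static Config makeConf() {
            Config conf = new Config();
            conf.setMaxSpoutPending(60);     // topology.max.spout.pending
            conf.setMessageTimeoutSecs(30);  // topology.message.timeout.secs (30 s is the default)
            return conf;
        }
    }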

Re: Storm metrics under heavy load

Posted by Nathan Leung <nc...@gmail.com>.
If your tuples take that long, 500 may be too high (it depends on your
parallelism), but if you are seeing underutilization then 32 is probably too
low. You can try different settings to find what works best for your
application.
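
As a rough sanity check (the executor count here is an assumption, not a
figure from this thread): at about 0.5 s per tuple with, say, 16 bolt
executors in parallel, a pending limit of 500 means the last queued tuple
waits on the order of 500 x 0.5 s / 16 ≈ 15.6 s before it even starts
executing, so its complete latency is already within reach of the 30 s
default message timeout; once tuples time out they are replayed and make the
backlog worse.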

Re: Storm metrics under heavy load

Posted by Kutlu Araslı <ku...@gmail.com>.
There are 4 supervisor machines in the cluster and 4 spout tasks running. A
usual tuple takes about 0.5 seconds to process, and the duration rises above
15 seconds when the queue is full.
I will try increasing the number to 500 as you suggested.

Thanks,


Re: Storm metrics under heavy load

Posted by Nathan Leung <nc...@gmail.com>.
When the spout output queue is big, you will see the total processing time
increase because it includes the time the tuple spends in the queue. How many
spout tasks do you have? Your max spout pending seems low, and when it's too
low your cluster will be starved for data. Try increasing it to a few hundred
or one thousand.
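
Keep in mind that the pending cap applies per spout task, so spout parallelism
multiplies it. A sketch with placeholder names and illustrative numbers:

    import backtype.storm.Config;
    import backtype.storm.topology.TopologyBuilder;

    public class SpoutPendingSketch {
        public static void main(String[] args) {
            TopologyBuilder builder = new TopologyBuilder();
            // MyKestrelSpout is a placeholder; the third argument is the parallelism hint.
            builder.setSpout("kestrel-spout", new MyKestrelSpout(), 4);

            Config conf = new Config();
            // topology.max.spout.pending is enforced per spout task: 4 tasks x 500
            // allows up to 2000 un-acked tuples in flight across the topology.
            conf.setMaxSpoutPending(500);
        }
    }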

Re: Storm metrics under heavy load

Posted by Kutlu Araslı <ku...@gmail.com>.
Thanks, that was really helpful.
As far as I understand from all of this:
Tuples emitted under the MAX_SPOUT_PENDING cap are buffered at the spout, and
the clock for complete latency starts as soon as they are emitted. As the
complete latency of the pending tuples grows, Storm starts to replay them,
which throttles the cluster because the same items are processed again. So I
will first try decreasing MAX_SPOUT_PENDING at the expense of throughput and
observe the situation. Adding CPUs and increasing MAX_SPOUT_PENDING will be my
next step.
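
To make the replay mechanism concrete, here is a minimal sketch of a reliable
spout (hypothetical, not the actual Kestrel spout): once a pending tuple's
complete latency exceeds topology.message.timeout.secs, Storm calls fail() on
the spout and the same item is emitted again, competing with fresh work.

    import java.util.LinkedList;
    import java.util.Map;
    import java.util.Queue;
    import backtype.storm.spout.SpoutOutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichSpout;
    import backtype.storm.tuple.Fields;
    import backtype.storm.tuple.Values;

    public class ReplayingQueueSpout extends BaseRichSpout {
        private SpoutOutputCollector collector;
        private final Queue<String> pendingReplays = new LinkedList<String>();

        @Override
        public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void nextTuple() {
            // Replays take priority, so timed-out items are re-emitted and
            // compete with fresh tuples for the same bolt capacity.
            String item = pendingReplays.poll();
            if (item == null) {
                item = pollBackingQueue(); // placeholder for the real Kestrel client call
            }
            if (item != null) {
                collector.emit(new Values(item), item); // the item doubles as the message id
            }
        }

        @Override
        public void fail(Object msgId) {
            // Called when a tuple fails or its complete latency exceeds the message timeout.
            pendingReplays.add((String) msgId);
        }

        @Override
        public void ack(Object msgId) {
            // Fully processed; nothing to do in this sketch.
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("item"));
        }

        private String pollBackingQueue() {
            return null; // stand-in: fetch the next item from Kestrel here
        }
    }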





Re: Storm metrics under heavy load

Posted by Jeffery Maass <ma...@gmail.com>.
Ok, I see now.

So, every time Storm asks your spout for another tuple, your spout doesn't
necessarily emit one. Which means that your topology is not necessarily being
"maxed out". Or, better said, you are not experiencing the topology behavior
that occurs when MAX_SPOUT_PENDING has been reached and is therefore actively
limiting the number of records in flight within the topology.

When you are seeing large numbers of tuples in Kestrel MQ, your spout is more
likely to be limited by MAX_SPOUT_PENDING.

When you look at your bolts within the Storm UI, what number do you see for
capacity? The number varies from 0 to 1. The closer the number is to 1, the
fewer additional in-flight tuples you can expect to add to the topology and
still get results back.
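
Roughly speaking, that capacity number is the fraction of the measurement
window the bolt spent executing: capacity ≈ executed tuples x execute latency
(ms) / window length (ms). For example, 60,000 executions at 8 ms each over
the UI's 10-minute (600,000 ms) window gives 60,000 x 8 / 600,000 = 0.8, i.e.
the bolt's executors are busy about 80% of the time and close to saturation.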

Note that there are three latency metrics:
* per spout - complete latency (milliseconds)
* per bolt - process latency (milliseconds)
* per bolt - execute latency (milliseconds)

Complete latency - how long it takes a tuple to flow all the way through the
topology and be acked back at the spout
Process latency - how long it takes from the moment a bolt receives a tuple
until the bolt acks it
Execute latency - how long a tuple spends inside a bolt's execute method

Complete latency, therefore, is made up of the process and execute latencies
of every bolt in the topology, plus latency due to something else... I think
of this as the missing latency, or system latency.

I've noticed that as you increase the number of in-flight tuples (via
MAX_SPOUT_PENDING), the complete latency increases much faster than the
execute and process latencies of the individual bolts. In fact, what I have
seen is that past a certain point, adding more in-flight tuples makes the
number of records processed per millisecond start to drop. And this appears
to be driven almost entirely by the missing, a.k.a. system, latency.

It sounds to me like what you are experiencing is this very thing. I think
the solution is to add bolt instances, which may then lead you to adding
CPUs.
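
Concretely, "adding bolt instances" is a larger parallelism hint on the slow
bolt (and, if the existing workers are saturated, more workers). A sketch
with placeholder names and illustrative numbers:

    import backtype.storm.Config;
    import backtype.storm.topology.TopologyBuilder;

    public class BoltScalingSketch {
        public static void main(String[] args) {
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("kestrel-spout", new MyKestrelSpout(), 4);
            // Raise the parallelism hint on the slow bolt, e.g. from 4 to 16 executors.
            builder.setBolt("db-bolt", new MyDbBolt(), 16)
                   .shuffleGrouping("kestrel-spout");

            Config conf = new Config();
            conf.setNumWorkers(4);        // e.g. one worker per supervisor VM
            conf.setMaxSpoutPending(500);
        }
    }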


Thank you for your time!

+++++++++++++++++++++
Jeff Maass <ma...@gmail.com>
linkedin.com/in/jeffmaass
stackoverflow.com/users/373418/maassql
+++++++++++++++++++++



Re: Storm metrics under heavy load

Posted by Kutlu Araslı <ku...@gmail.com>.
I meant our tuple queues in Kestrel MQ, which the spout consumes.



Re: Storm metrics under heavy load

Posted by Jeffery Maass <ma...@gmail.com>.
To what number / metric are you referring when you say "when the number of
tuples in the queue grows"? What you are describing sounds like the beginning
of queue explosion. If so, increasing max spout pending will make the
situation worse.

Thank you for your time!

+++++++++++++++++++++
Jeff Maass <ma...@gmail.com>
linkedin.com/in/jeffmaass
stackoverflow.com/users/373418/maassql
+++++++++++++++++++++

