Posted to dev@storm.apache.org by Walid Aljoby <wa...@yahoo.com.INVALID> on 2016/11/23 12:26:58 UTC

Storm sending rate

Hi everyone,
Could anyone with experience explain the factors that affect the sending rate in Storm?

Thank you
--
Regards
WA

Re: Storm sending rate

Posted by Navin Ipe <na...@searchlighthealth.com>.

1. I noticed that in my topologies, once the latency reaches 3 seconds,
tuples are more likely to fail even if they are re-emitted. This may be
specific to my topology because of the number of tuples being re-emitted.
You might have a different experience. It's up to you to experiment.
2. The default is 4 slots. You can increase it
<https://www.google.co.in/search?client=ubuntu&channel=fs&q=storm+how+to+increase+the+number+of+slots+per+node&ie=utf-8&oe=utf-8&gfe_rd=cr&ei=kPg6WKylO-Xx8Afg3Jb4Ag>.
See the "Storm's slots" section on this page
<http://nrecursions.blogspot.in/2016/09/concepts-about-storm-you-need-to-know.html>.
3. How you implement it is entirely up to you. I just gave you a design
idea.
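
The latency observation in point 1 can be turned into a simple emit
throttle. What follows is only a minimal sketch under stated assumptions:
the class EmitGate, its method names, the smoothing factor and the in-flight
cap are all hypothetical and are not part of Storm. It tracks how long
tuples take to be acked and stops emitting while the smoothed latency is
above a target such as 2.5 seconds.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not a Storm class): decides whether to emit another tuple. */
final class EmitGate {
    private static final double ALPHA = 0.2; // assumed smoothing factor for the moving average

    private final long targetLatencyMs;
    private final int maxInFlight;
    private final Map<Object, Long> emitTimes = new HashMap<>();
    private double avgLatencyMs = 0.0;
    private boolean hasSample = false;

    EmitGate(long targetLatencyMs, int maxInFlight) {
        this.targetLatencyMs = targetLatencyMs;
        this.maxInFlight = maxInFlight;
    }

    /** Record the emit time of a tuple, keyed by its message id. */
    void onEmit(Object msgId, long nowMs) {
        emitTimes.put(msgId, nowMs);
    }

    /** Fold the observed ack latency into an exponential moving average. */
    void onAck(Object msgId, long nowMs) {
        Long start = emitTimes.remove(msgId);
        if (start == null) {
            return; // unknown or already handled id
        }
        double sample = nowMs - start;
        avgLatencyMs = hasSample ? ALPHA * sample + (1 - ALPHA) * avgLatencyMs : sample;
        hasSample = true;
    }

    /** A failed tuple is no longer in flight; its latency is not sampled. */
    void onFail(Object msgId) {
        emitTimes.remove(msgId);
    }

    /** Emit only while under the in-flight cap and the latency target. */
    boolean canEmit() {
        if (emitTimes.isEmpty()) {
            return true; // always allow one probe tuple so the average can recover
        }
        return emitTimes.size() < maxInFlight && avgLatencyMs <= targetLatencyMs;
    }

    double averageLatencyMs() {
        return avgLatencyMs;
    }
}
```

In a spout, nextTuple() would check canEmit() before emitting, and the
spout's ack and fail callbacks would call onAck and onFail with the same
message id. The target and the cap are values to find by experiment.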


-- 
Regards,
Navin

Re: Storm sending rate

Posted by Walid Aljoby <wa...@yahoo.com>.

Hi Navin,
Thank you for the detailed answer. Interesting! I have some questions; could you please answer them inline?
- What is the intuition behind keeping the average latency per tuple within 2.5 seconds?
- Is the number of slots per node limited to 4?
- Actually, I use the WordCount topology, in which some sentences are stored inside the spout itself, and it is possible to use multiple spout instances. However, I do not understand why reading tuples from a queue makes emitting faster.

Sorry for the long questions.
--
Best Regards
WA

Re: Storm sending rate

Posted by Navin Ipe <na...@searchlighthealth.com>.

Ideally, each time nextTuple is called, you should emit only one tuple. Of
course, you can emit more than one, but then it would be better to monitor
the latency and emit only as many tuples as can be acked within a latency
of 2.5 seconds.
- Make sure you have enough workers.
- Increase TOPOLOGY_MESSAGE_TIMEOUT_SECS.
- Increase stormConfig.setNumWorkers(someNumber); and
stormConfig.setNumAckers(someNumber);
- Each Storm node has 4 slots by default, which can handle 4 workers, so
create as many workers as you have slots. Slots = number of nodes * 4. If
you have more workers than slots, then Storm will have to handle more than
one worker on a single slot, which will be a little slower.
- Having number of workers = number of tasks (number of spouts and bolts)
is also helpful to avoid lags.
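
The sizing advice above can be sketched as follows. This is only an
illustration under assumptions: a plain Map stands in for Storm's Config
object so that the example runs without Storm on the classpath, the class
TopologySizing and its methods are hypothetical, and the node count, acker
count and timeout are made-up values. The three keys are, to my knowledge,
the ones that setNumWorkers, setNumAckers and TOPOLOGY_MESSAGE_TIMEOUT_SECS
refer to.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sizing helper; a plain Map stands in for Storm's Config. */
final class TopologySizing {
    static final int DEFAULT_SLOTS_PER_NODE = 4; // default mentioned in the thread

    /** Slots = number of nodes * slots per node. */
    static int totalSlots(int nodes, int slotsPerNode) {
        return nodes * slotsPerNode;
    }

    /** Builds topology settings with one worker per available slot. */
    static Map<String, Object> buildConf(int nodes, int slotsPerNode, int ackers, int timeoutSecs) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("topology.workers", totalSlots(nodes, slotsPerNode));
        conf.put("topology.acker.executors", ackers);
        conf.put("topology.message.timeout.secs", timeoutSecs);
        return conf;
    }

    public static void main(String[] args) {
        // Assumed cluster: 3 nodes with the default 4 slots each.
        System.out.println(buildConf(3, DEFAULT_SLOTS_PER_NODE, 3, 60));
    }
}
```

With a real topology the same values would go through the Config setters
before submitting; the right numbers depend on the cluster and the workload.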

If you really want to increase the number of emits phenomenally, then use a
separate program to put objects into a queue like RabbitMQ or any of the
other queue programs available. Then, create multiple spout instances which
will read from this queue and emit. This way, you'll have multiple spouts
emitting tuples, and you can have multiple bolts which take tuples from
these spouts and process the data.
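
The queue design described above can be sketched roughly as follows. The
assumptions are significant: an in-memory LinkedBlockingQueue stands in for
an external broker such as RabbitMQ, QueueReader is a hypothetical stand-in
for a spout instance and does not implement Storm's spout interface, and
the round-robin loop only imitates Storm calling nextTuple on each instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class QueueFedSpoutSketch {

    /** Stand-in for one spout instance reading from a shared queue. */
    static final class QueueReader {
        private final BlockingQueue<String> source;
        private final List<String> emitted = new ArrayList<>();

        QueueReader(BlockingQueue<String> source) {
            this.source = source;
        }

        /** Emits at most one item per call and never blocks. */
        boolean nextTuple() {
            String item = source.poll(); // returns null when the queue is empty
            if (item == null) {
                return false;
            }
            emitted.add(item); // a real spout would emit through its collector here
            return true;
        }

        List<String> emitted() {
            return emitted;
        }
    }

    /** Calls the readers in turn until the queue is empty; returns the total emitted. */
    static int drain(List<QueueReader> readers) {
        int total = 0;
        boolean progress = true;
        while (progress) {
            progress = false;
            for (QueueReader reader : readers) {
                if (reader.nextTuple()) {
                    total++;
                    progress = true;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 10; i++) {
            queue.add("sentence-" + i); // the separate producer program would do this
        }
        List<QueueReader> readers = new ArrayList<>();
        readers.add(new QueueReader(queue));
        readers.add(new QueueReader(queue));
        System.out.println("emitted: " + drain(readers));
    }
}
```

Because every reader takes from the same queue, adding reader instances
spreads the emitting work without any item being emitted twice.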

-- 
Regards,
Navin

Re: Storm sending rate

Posted by Walid Aljoby <wa...@yahoo.com>.

Hi Navin,
Yes, by sending rate I meant the rate of outgoing tuples from the spout, which represents the data source, to the computation bolts. My question is about tuning the relevant parameters to increase the rate at which the spout emits tuples. I tried different values for max spout pending, but the application throughput did not improve much. Hence, I asked whether other parameters affect the speed of emitting tuples.
Thank you and regards,
--
WA

Re: Storm sending rate

Posted by Navin Ipe <na...@searchlighthealth.com>.

Please remember that we cannot read your mind. A little more elaboration on
what problem you are facing and what you mean by "sending rate" would help.

-- 
Regards,
Navin