Posted to user@storm.apache.org by Ahmed El Rheddane <ah...@imag.fr> on 2014/10/23 10:52:39 UTC

Generating a constant load of tuples

Hello again,

I am trying to generate a regulated load of tuples using a sleep in the
spout's nextTuple(). However, the observed results show that the number
of tuples periodically drops by about 200 tuples (from 800 to 600).

I thought that this was due to the way I was monitoring Storm, but the
behavior has been confirmed by local spout logs.
I haven't set any limit on pending messages and I'm not using any ackers
anyway.

Any help would be very much appreciated.

Cheers,
Ahmed

Re: Generating a constant load of tuples

Posted by Ahmed El Rheddane <ah...@imag.fr>.

Hello,

The issue was indeed related to the size of the VMs I was using. With better
machines, everything works as expected.

Thanks Taylor and Itai.

Ahmed

On 10/24/2014 09:50 PM, P. Taylor Goetz wrote:
> I would try with a beefier machine. It sounds like you might be running into resource contention.
>
> -Taylor
>
>
> On Oct 24, 2014, at 6:08 AM, Itai Frenkel <It...@forter.com> wrote:
>
>> As far as I understand, the wait strategy holds even without any acking. Try replacing the sleep with the configuration setting, upgrading to a stronger machine, and timing the _collector.emit latency (or looking at Nimbus). In addition, try different throughputs and see whether there is a breaking point or whether this is an inherent problem.
>> ________________________________________
>> From: Ahmed El Rheddane <ah...@imag.fr>
>> Sent: Thursday, October 23, 2014 4:29 PM
>> To: user@storm.apache.org
>> Subject: Re: Generating a constant load of tuples
>>
>> I simply copied the TestWordSpout and changed the sleep period:
>> public void nextTuple() {
>>          Utils.sleep(10);
>>          final String[] words = new String[] {"nathan", "mike", "jackson", "golda", "bertels"};
>>          final Random rand = new Random();
>>          final String word = words[rand.nextInt(words.length)];
>>          _collector.emit(new Values(word));
>> }
>>
>> As nextTuple() always emits a tuple and I don't use ackers, I'm not sure
>> the sleeping wait strategy is ever triggered.
>>
>> Indeed, I'll probably move to bigger sizes, but since the generated
>> throughput is rather low, I figured a small instance would be enough.
>>
>> Ahmed
>>
>> On 10/23/2014 02:41 PM, Itai Frenkel wrote:
>>> Instead of doing sleep in spout, do
>>> conf.put(Config.TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS, 10);
>>> Otherwise there is another sleep(1) hidden in storm.
>>>
>>> 1. What can you tell us about the spout code? Is it a mock that just performs a for loop with emit?
>>> 2. What spout latency does Nimbus report?
>>>
>>> On a side note, I would not do benchmarks on a small virtual machine. Consider scaling up.
>>>
>>>
>>> Itai
>>> ________________________________________
>>> From: Ahmed El Rheddane <ah...@imag.fr>
>>> Sent: Thursday, October 23, 2014 3:19 PM
>>> To: user@storm.apache.org
>>> Subject: Re: Generating a constant load of tuples
>>>
>>> Hi Itai,
>>>
>>> The bolt follows the pace of emission perfectly. I am using different
>>> workers for the components (each running on a separate MS Azure small VM
>>> instance with 1 vCPU).
>>> CPU is nowhere near busy as I am sleeping for 10ms after each emission
>>> (confirmed by Ganglia monitoring).
>>> If it can help, I'm using Storm 0.9.1 with Oracle Java 1.7.
>>> I also have the Zookeeper server and Nimbus on the same VM instance.
>>>
>>> Attached is the observed spout throughput (should be a little less than
>>> 100 tuples/s at all times).
>>>
>>> Ahmed
>>>
>>> On 10/23/2014 01:19 PM, Itai Frenkel wrote:
>>>> I meant 100% CPU on any of the CPU cores?
>>>> ________________________________________
>>>> From: Itai Frenkel<It...@forter.com>
>>>> Sent: Thursday, October 23, 2014 2:11 PM
>>>> To:user@storm.apache.org
>>>> Subject: Re: Generating a constant load of tuples
>>>>
>>>> Conjecture: could the bolt after the spout be the problem? Once the bolt input queue is full (I think the default is 1024 items), it will push back and the spout's emit will block the spout thread.
>>>> You could measure the time it takes to emit for diagnostics.
>>>>
>>>> Also, do you have 2 CPUs?
>>>>
>>>> Itai
>>>> ________________________________________
>>>> From: Ahmed El Rheddane<ah...@imag.fr>
>>>> Sent: Thursday, October 23, 2014 11:52 AM
>>>> To:user@storm.apache.org
>>>> Subject: Generating a constant load of tuples
>>>>
>>>> Hello again,
>>>>
>>>> I am trying to generate a regulated load of tuples using a sleep in the
>>>> spout's nextTuple(). However, the observed results show that the number
>>>> of tuples periodically drops by about 200 tuples (from 800 to 600).
>>>>
>>>> I thought that this was due to the way I was monitoring Storm, but the
>>>> behavior has been confirmed by local spout logs.
>>>> I haven't set any limit on pending messages and I'm not using any ackers
>>>> anyway.
>>>>
>>>> Any help would be very much appreciated.
>>>>
>>>> Cheers,
>>>> Ahmed

Re: Generating a constant load of tuples

Posted by "P. Taylor Goetz" <pt...@gmail.com>.

I would try with a beefier machine. It sounds like you might be running into resource contention.

-Taylor



Re: Generating a constant load of tuples

Posted by Itai Frenkel <It...@forter.com>.

As far as I understand, the wait strategy holds even without any acking. Try replacing the sleep with the configuration setting, upgrading to a stronger machine, and timing the _collector.emit latency (or looking at Nimbus). In addition, try different throughputs and see whether there is a breaking point or whether this is an inherent problem.

Re: Generating a constant load of tuples

Posted by Ahmed El Rheddane <ah...@imag.fr>.

I simply copied the TestWordSpout and changed the sleep period:
public void nextTuple() {
         Utils.sleep(10);
         final String[] words = new String[] {"nathan", "mike", "jackson", "golda", "bertels"};
         final Random rand = new Random();
         final String word = words[rand.nextInt(words.length)];
         _collector.emit(new Values(word));
}

As nextTuple() always emits a tuple and I don't use ackers, I'm not sure 
the sleeping wait strategy is ever triggered.

Indeed, I'll probably move to bigger sizes, but since the generated 
throughput is rather low, I figured a small instance would be enough.

Ahmed
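
Editor's sketch, not part of the original post: sleeping a fixed 10 ms on every call makes the effective period 10 ms plus whatever the rest of nextTuple() and thread scheduling add, which is consistent with the rate sitting a little under 100 tuples/s. One common alternative is to sleep until an absolute deadline instead. The class below is a minimal plain-Java illustration of that idea; FixedRatePacer and its methods are hypothetical names, not Storm APIs, and it does not address the VM resource contention discussed in this thread.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: paces calls against absolute deadlines so per-call overhead does not accumulate. */
final class FixedRatePacer {
    private final long periodNanos;
    private long nextDeadlineNanos;

    /** startNanos is the reference time (for example System.nanoTime() when the spout opens). */
    FixedRatePacer(long periodMillis, long startNanos) {
        this.periodNanos = TimeUnit.MILLISECONDS.toNanos(periodMillis);
        this.nextDeadlineNanos = startNanos + periodNanos;
    }

    /** Returns how long to wait before the next emit (0 if already late), then advances the deadline by one period. */
    long nextDelayNanos(long nowNanos) {
        long delay = Math.max(0L, nextDeadlineNanos - nowNanos);
        nextDeadlineNanos += periodNanos;
        return delay;
    }

    /** Blocks the calling thread until the next deadline, or returns at once if it has already passed. */
    void await() throws InterruptedException {
        long delay = nextDelayNanos(System.nanoTime());
        if (delay > 0) {
            TimeUnit.NANOSECONDS.sleep(delay);
        }
    }
}
```

In a spout, one would presumably create the pacer in open() and call await() at the top of nextTuple() in place of Utils.sleep(10), handling InterruptedException there. Note that after a stall this sketch catches up by returning immediately until it is back on schedule, so a pause shows up as a short burst rather than as lost tuples.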


Re: Generating a constant load of tuples

Posted by Itai Frenkel <It...@forter.com>.

Instead of doing sleep in spout, do
conf.put(Config.TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS, 10); 
Otherwise there is another sleep(1) hidden in storm.

1. What can you tell us about the spout code? Is it a mock that just performs a for loop with emit?
2. What spout latency does Nimbus report?

On a side note, I would not do benchmarks on a small virtual machine. Consider scaling up.

Itai

Re: Generating a constant load of tuples

Posted by Ahmed El Rheddane <ah...@imag.fr>.

Hi Itai,

The bolt follows the pace of emission perfectly. I am using different
workers for the components (each running on a separate MS Azure small VM
instance with 1 vCPU).
CPU is nowhere near busy as I am sleeping for 10ms after each emission
(confirmed by Ganglia monitoring).
If it can help, I'm using Storm 0.9.1 with Oracle Java 1.7.
I also have the Zookeeper server and Nimbus on the same VM instance.

Attached is the observed spout throughput (should be a little less than
100 tuples/s at all times).

Ahmed


Re: Generating a constant load of tuples

Posted by Itai Frenkel <It...@forter.com>.

I meant 100% CPU on any of the CPU cores?

Re: Generating a constant load of tuples

Posted by Itai Frenkel <It...@forter.com>.

Conjecture: could the bolt after the spout be the problem? Once the bolt input queue is full (I think the default is 1024 items), it will push back and the spout's emit will block the spout thread.
You could measure the time it takes to emit for diagnostics.

Also, do you have 2 CPUs?

Itai
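
Editor's sketch, not part of the original post: the suggestion to time the emit call could look roughly like the following. It is plain Java with no Storm dependency; EmitTimer, its threshold parameter, and its accessors are hypothetical names chosen for illustration. The idea is to wrap the emit call, then log the counters periodically to see whether slow emits line up with the throughput drops.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: records how long each wrapped call takes and counts the slow ones. */
final class EmitTimer {
    private final long slowThresholdNanos;
    private long calls;
    private long slowCalls;
    private long maxNanos;

    EmitTimer(long slowThresholdMillis) {
        this.slowThresholdNanos = TimeUnit.MILLISECONDS.toNanos(slowThresholdMillis);
    }

    /** Runs the action and records its wall-clock duration, even if it throws. */
    void time(Runnable action) {
        long start = System.nanoTime();
        try {
            action.run();
        } finally {
            record(System.nanoTime() - start);
        }
    }

    /** Records one measured duration; exposed so durations measured elsewhere can be fed in. */
    void record(long elapsedNanos) {
        calls++;
        if (elapsedNanos > slowThresholdNanos) {
            slowCalls++;
        }
        maxNanos = Math.max(maxNanos, elapsedNanos);
    }

    long calls() { return calls; }

    long slowCalls() { return slowCalls; }

    long maxMillis() { return TimeUnit.NANOSECONDS.toMillis(maxNanos); }
}
```

Inside nextTuple() the emit would be passed in as the action (an anonymous Runnable on Java 7). The class is not thread-safe, which should be acceptable if it is only touched from the single spout thread. A large maxMillis() or a growing slowCalls() would support the back-pressure conjecture, while consistently fast emits would point elsewhere, such as the host itself.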