Posted to users@nifi.apache.org by Joe Witt <jo...@gmail.com> on 2015/11/03 10:41:52 UTC

Re: GetSQS causes high CPU usage

Adam,

Just wanted to follow up on this.  Have you had any better results and
should we put a JIRA in behind what you're seeing?

Thanks
Joe

On Tue, Oct 20, 2015 at 7:58 PM, Adam Lamar <ad...@gmail.com> wrote:
> Adam,
>
> Thanks for the reply!
>
> Amazon supports (and recommends) long polling on SQS queues[1]. The GetSQS
> code doesn't attempt long polling at all, but I wasn't sure if this was
> intentional or if the option had just never been added. With a 20 second
> long poll, the processor would make 3 requests per minute instead of 60,
> assuming the queue was empty during that time.
>
> Another data point - even during high CPU usage, the GetSQS processor was
> only making one request per second to SQS (verified via tcpdump). While not
> ideal from a billing perspective, doesn't it seem wrong that 1 request a
> second is causing such high CPU?
>
> Perhaps to muddy the waters a bit, I played with the run schedule yesterday,
> and even now that I've turned it back to 1 second, CPU usage is remaining
> low. Before, I could start/stop GetSQS repeatedly and observe the high CPU
> usage, but now I can't reproduce it. If I'm able to consistently reproduce
> the issue in the future, I'll be sure to post again.
>
> Cheers,
> Adam
>
>
> [1]
> http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-long-polling.html
>
>
> On 10/20/15 4:37 AM, Adam Estrada wrote:
>>
>> Adam,
>>
>> I suspect that GetSQS is polling Amazon to check for data. It's not
>> exactly like your standard message broker in that you have to force the
>> poll. Anyway, throw a wait time in there and see if that fixes it. This will
>> also help lower your monthly Amazon bill...
>>
>> Adam
>>
>>
>>> On Oct 19, 2015, at 11:41 PM, Adam Lamar <ad...@gmail.com> wrote:
>>>
>>> Hi everybody!
>>>
>>> I've been testing NiFi 0.3.0 with the GetSQS processor to fetch objects
>>> from an S3 bucket as they're created. My flow looks like this:
>>>
>>> GetSQS
>>> SplitJson
>>> ExtractText
>>> FetchS3Object
>>> PutFile
>>>
>>> I noticed that GetSQS causes a high amount of CPU usage - about 90% of
>>> one core. If I turn off GetSQS, CPU usage immediately drops to 2%. If I turn
>>> GetSQS back on with the run schedule at 10 seconds, it stays at 2%.
>>>
>>> Would it be worth using setWaitTimeSeconds [1] to make the SQS receive a
>>> blocking call? Alternatively, should GetSQS default to a longer run
>>> schedule?
>>>
>>> Cheers,
>>> Adam
>>>
>>>
>>> [1]
>>> http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/sqs/model/ReceiveMessageRequest.html#setWaitTimeSeconds(java.lang.Integer)
>
>

Re: GetSQS causes high CPU usage

Posted by Adam Lamar <ad...@gmail.com>.
Mark,

Definitely sounds like a possibility. In my usage, the GetSQS processor
would yield often (because the SQS queue is low volume), but it was only
run once per second (or once every 10 seconds, depending on the setting
I used). I'm unsure if/how this explains why I saw high CPU usage for
several days and then, once I played with the settings, saw it drop to a
normal amount, but hopefully it's another data point.

Cheers,
Adam

On 11/4/15 2:33 PM, Mark Payne wrote:
> Adam,
>
> I wonder if this ticket that I just created [1] is actually the same 
> issue that you're seeing here.
>
> When GetSQS determines there is nothing to do, it will "yield", 
> essentially pausing itself for some amount of time
> (by default 1 second). But the framework wasn't properly pausing the 
> processor. Instead, it continually ran a task that
> simply asks "are you yielded?" Since the answer was yes, it finished 
> that task and ran it again. This can cause the
> CPU usage to be significantly higher when there's nothing to do than 
> when there is actually work to do.
>
> Thanks
> -Mark
>
> [1] https://issues.apache.org/jira/browse/NIFI-1111
>
>
>> On Nov 3, 2015, at 1:05 PM, Adam Lamar <adamonduty@gmail.com> wrote:
>>
>> Hey Joe,
>>
>> I think there are two possible JIRAs.
>>
>> 1) Add long polling support using setWaitTimeSeconds() - should be 
>> really easy. I can take a crack at a pull request. Here's a JIRA: 
>> https://issues.apache.org/jira/browse/NIFI-1103
>>
>> 2) Investigate the high CPU usage. I saw this initially for several 
>> days, but it went away after I adjusted the run schedule (from 1 
>> second to 10 seconds back to 1 second). I have CPU charts showing the 
>> high usage and corresponding drop, but I need to reproduce the issue.
>>
>> I'll circle back in a few days when I get some time to work on it.
>>
>> Cheers,
>> Adam


Re: GetSQS causes high CPU usage

Posted by Mark Payne <ma...@hotmail.com>.
Adam,

I wonder if this ticket that I just created [1] is actually the same issue that you're seeing here.

When GetSQS determines there is nothing to do, it will "yield", essentially pausing itself for some amount of time
(by default 1 second). But the framework wasn't properly pausing the processor. Instead, it continually ran a task that
simply asks "are you yielded?" Since the answer was yes, it finished that task and ran it again. This can cause the
CPU usage to be significantly higher when there's nothing to do than when there is actually work to do.

Thanks
-Mark

[1] https://issues.apache.org/jira/browse/NIFI-1111
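
For readers following along, the behavior described above can be illustrated with a small stand-alone Java sketch. This is not the NiFi framework code and does not claim to show how NIFI-1111 was resolved; the class YieldSketch, the YieldedProcessor stand-in, and both counting methods are hypothetical names made up for this illustration. It only compares how many "are you yielded?" checks happen when the task is rerun immediately versus when the caller waits out the remaining yield time.

```java
import java.util.concurrent.TimeUnit;

public class YieldSketch {

    /** Hypothetical stand-in for a processor that has yielded for a fixed period. */
    static final class YieldedProcessor {
        private final long yieldExpirationNanos;

        YieldedProcessor(long yieldMillis) {
            this.yieldExpirationNanos =
                    System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(yieldMillis);
        }

        long remainingYieldNanos() {
            return yieldExpirationNanos - System.nanoTime();
        }

        boolean isYielded() {
            return remainingYieldNanos() > 0;
        }
    }

    /** Problem pattern: the check is rerun immediately for as long as the answer is "yes". */
    static long countChecksWhenRerunImmediately(YieldedProcessor processor) {
        long checks = 0;
        while (true) {
            checks++;
            if (!processor.isYielded()) {
                return checks;
            }
        }
    }

    /** Alternative: after a "yes", sleep for the remaining yield time before asking again. */
    static long countChecksWhenWaiting(YieldedProcessor processor) {
        long checks = 0;
        while (true) {
            checks++;
            long remaining = processor.remainingYieldNanos();
            if (remaining <= 0) {
                return checks;
            }
            try {
                TimeUnit.NANOSECONDS.sleep(remaining);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop counting.
                Thread.currentThread().interrupt();
                return checks;
            }
        }
    }

    public static void main(String[] args) {
        // A 100 ms yield is used here only to keep the demo short; the thread mentions 1 second.
        long busy = countChecksWhenRerunImmediately(new YieldedProcessor(100));
        long waiting = countChecksWhenWaiting(new YieldedProcessor(100));
        System.out.println("checks when rerun immediately:     " + busy);
        System.out.println("checks when waiting out the yield: " + waiting);
    }
}
```

On a typical machine the first count is very large and the second is a handful at most, which is consistent with an idle processor keeping a core busy while a working one does not.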


> On Nov 3, 2015, at 1:05 PM, Adam Lamar <ad...@gmail.com> wrote:
> 
> Hey Joe,
> 
> I think there are two possible JIRAs.
> 
> 1) Add long polling support using setWaitTimeSeconds() - should be really easy. I can take a crack at a pull request. Here's a JIRA: https://issues.apache.org/jira/browse/NIFI-1103
> 
> 2) Investigate the high CPU usage. I saw this initially for several days, but it went away after I adjusted the run schedule (from 1 second to 10 seconds back to 1 second). I have CPU charts showing the high usage and corresponding drop, but I need to reproduce the issue.
> 
> I'll circle back in a few days when I get some time to work on it.
> 
> Cheers,
> Adam


Re: GetSQS causes high CPU usage

Posted by Adam Lamar <ad...@gmail.com>.
Hey Joe,

I think there are two possible JIRAs.

1) Add long polling support using setWaitTimeSeconds() - should be 
really easy. I can take a crack at a pull request. Here's a JIRA: 
https://issues.apache.org/jira/browse/NIFI-1103

2) Investigate the high CPU usage. I saw this initially for several 
days, but it went away after I adjusted the run schedule (from 1 second 
to 10 seconds back to 1 second). I have CPU charts showing the high 
usage and corresponding drop, but I need to reproduce the issue.

I'll circle back in a few days when I get some time to work on it.

Cheers,
Adam
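
As a rough illustration of the request-count argument behind item 1, here is a small Java sketch of the arithmetic only. It does not call SQS; the class PollingMath and its method are hypothetical names for this example. It assumes the queue stays empty, so each receive request occupies the full interval: 1 second for the current run schedule, or 20 seconds for a long poll (20 seconds being the longest wait SQS accepts). With the AWS SDK for Java, the long-poll wait would be set through ReceiveMessageRequest.setWaitTimeSeconds, the setter referenced earlier in the thread.

```java
public class PollingMath {

    /**
     * Receive requests issued per minute against an empty queue when one
     * request is made every cycleSeconds seconds.
     */
    static int emptyQueueRequestsPerMinute(int cycleSeconds) {
        if (cycleSeconds <= 0) {
            throw new IllegalArgumentException("cycleSeconds must be positive");
        }
        return 60 / cycleSeconds;
    }

    public static void main(String[] args) {
        // One request per second, as observed via tcpdump in the thread.
        System.out.println(emptyQueueRequestsPerMinute(1));  // prints 60
        // One 20-second long poll at a time.
        System.out.println(emptyQueueRequestsPerMinute(20)); // prints 3
    }
}
```

This only covers the billing side of the discussion; it says nothing about the CPU usage in item 2.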
