Posted to users@qpid.apache.org by Fraser Adams <fr...@blueyonder.co.uk> on 2011/09/21 21:24:28 UTC

Strategies for managing and monitoring large Qpid topologies in mission critical systems

Hello all,
I'm seeking thoughts from those in the community who have been using 
Qpid in mission critical systems.

So from my observations Qpid seems pretty stable, but there's always the 
possibility of exciting little gotchas especially as the complexity 
grows and one starts to use fairly complex federated topologies of large 
numbers of brokers. As an example in the early days we got bitten a lot 
by "blown" links when consumers went down and queues filled to capacity. 
We're working past that with queue routes/circular queues/servers with 
largish memory etc.

Despite all that I'm expecting some gotchas, so I want to turn my 
attention to managing/monitoring/system health checks.

So what sort of things are others in similar positions using?

The core tools qpid-config, qpid-route, qpid-stat etc. are very useful, 
but they are quite "mandraulic". So are people manually using those and 
reactively solving problems, or is there much in the way of proactive 
management (fixing things before the clients shout :-)? Are people 
scripting the core tools or writing their own stuff? I've been doing a 
lot with QMF2 lately and there's clearly huge potential there, but I 
don't want to go about reinventing wheels.

Has anyone else integrated Qpid with Enterprise System Management tools 
such as HP OpenView/OperationsManager? If so, are they writing bespoke 
QMF Console apps or scripting things like qpid-stat, qpid-printevents etc.?

I'd be interested to know what the recommended approaches are and 
whether there are any "sister projects" looking into this - I'd like to 
keep things cohesive with best practice and I'd like to avoid going down 
divergent paths.

Hope to hear from you
Cheers
Frase

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org


Re: Flow Control/Performance tuning help required

Posted by Gordon Sim <gs...@redhat.com>.
On 09/28/2011 07:47 AM, Gordon Sim wrote:
> On 09/27/2011 07:29 PM, Fraser Adams wrote:
>> Re "A further question is whether you need acknowledgements at all? "
>> surely if I don't acknowledge at all then the messages just remain on
>> the broker and it potentially attempts to resend them. Certainly if I
>> comment out the session.acknowledge(); line in my ItemConsumer the
>> memory usage goes up and eventually I go into swap.
>>
>> Is there any way to set things up to say that I'm not going to
>> acknowledge receipt? Is your last sentence suggesting that I could
>> configure things in such a way?? (I can tolerate some message loss in
>> the cases where I'm trying to eke maximum throughput)
>
> Yes, that was the question. In the event that the messages are being
> deleted from under you anyway (due to ring policy), your acknowledging
> them is not actually giving you anything.

I should have added that you can turn off the need for acknowledgements 
by adding 'reliability: unreliable' to the link options of your address.
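
For example, as a rough sketch against the C++ qpid::messaging API 
(assuming a session already set up as in your attached ItemConsumer, and 
the same "perftest" queue), the receiver would be created from an 
address along these lines:

    // 'reliability: unreliable' in the link options tells the broker not
    // to expect acknowledgements, so session.acknowledge() can be dropped.
    Receiver receiver = session.createReceiver(
        "perftest; {link: {reliability: unreliable}}");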



Re: Flow Control/Performance tuning help required

Posted by Gordon Sim <gs...@redhat.com>.
On 09/27/2011 07:29 PM, Fraser Adams wrote:
> Hi Gordon,
> This is slightly concerning. This seems a bit of a catch 22: for optimum
> performance it seems best to acknowledge fairly infrequently.

It is a balance. If you acknowledge too infrequently then you build up a 
much larger set of unacknowledged messages, which can affect the 
processing of the acknowledgements.

Batching say 50 or 100 acknowledgements generally gives pretty good 
performance, I find. (On a similar note, you may find that slightly 
lowering the capacity of the receiver from 500 to say 256 also improves 
things a fraction.)
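
In code that looks roughly like the following (a sketch only, against 
the qpid::messaging C++ API; the queue name, batch size and timeout are 
illustrative rather than recommendations):

    #include <qpid/messaging/Connection.h>
    #include <qpid/messaging/Duration.h>
    #include <qpid/messaging/Message.h>
    #include <qpid/messaging/Receiver.h>
    #include <qpid/messaging/Session.h>

    using namespace qpid::messaging;

    int main() {
        Connection connection("localhost:5672");
        connection.open();
        Session session = connection.createSession();

        Receiver receiver = session.createReceiver("perftest");
        receiver.setCapacity(256);          // prefetch window, as above

        Message message;
        unsigned int unacked = 0;
        // fetch until nothing arrives within 5 seconds
        while (receiver.fetch(message, Duration::SECOND * 5)) {
            // ... process the message ...
            if (++unacked >= 100) {         // batch the acknowledgements
                session.acknowledge();      // acks everything fetched so far
                unacked = 0;
            }
        }
        session.acknowledge();              // flush any remaining acks
        connection.close();
        return 0;
    }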

> Re "A further question is whether you need acknowledgements at all? "
> surely if I don't acknowledge at all then the messages just remain on
> the broker and it potentially attempts to resend them. Certainly if I
> comment out the session.acknowledge(); line in my ItemConsumer the
> memory usage goes up and eventually I go into swap.
>
> Is there any way to set things up to say that I'm not going to
> acknowledge receipt? Is your last sentence suggesting that I could
> configure things in such a way?? (I can tolerate some message loss in
> the cases where I'm trying to eke maximum throughput)

Yes, that was the question. In the event that the messages are being 
deleted from under you anyway (due to ring policy), your acknowledging 
them is not actually giving you anything.

A lot depends on whether for your use case having messages dropped due 
to reaching capacity is a common or very rare occurrence.

> As an aside I'm still pretty sure that one of my colleagues who's been
> using qpid::client had similar issues and he sorted it by fiddling with
> the flow control parameters, but I've not seen his code so he might have
> fiddled with other things too.

Perhaps autoAck?

> I'd still really *love* to know how RedHat configured the MRG whitepaper
> tests (preferably in minute detail :-) are you aware of anyone who has
> *actually* reproduced the figures of 380,000 256 octet messages in and
> out on an 8 core box. I know my laptop has only two cores but 17,000 is
> a *long* way off 380,000

For a start I believe they involve multiple pairs of producing and 
subscribing connections. They also don't use ring queues or any similar 
'special' functionality. I'm not that familiar with the whitepaper but I 
believe it will have been using qpid-perftest.
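
For what it's worth, a typical qpid-perftest run looks something like 
the line below; treat the option names as a sketch and check 
qpid-perftest --help, since they can differ between releases:

    qpid-perftest --count 100000 --size 256 --npubs 2 --nsubs 2 --qt 4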




Re: Flow Control/Performance tuning help required

Posted by Fraser Adams <fr...@blueyonder.co.uk>.
Hi Gordon,
This is slightly concerning. This seems a bit of a catch 22: for optimum 
performance it seems best to acknowledge fairly infrequently.

Re "A further question is whether you need acknowledgements at all? " 
surely if I don't acknowledge at all then the messages just remain on 
the broker and it potentially attempts to resend them. Certainly if I 
comment out the session.acknowledge(); line in my ItemConsumer the 
memory usage goes up and eventually I go into swap.

Is there any way to set things up to say that I'm not going to 
acknowledge receipt? Is your last sentence suggesting that I could 
configure things in such a way?? (I can tolerate some message loss in 
the cases where I'm trying to eke maximum throughput)

As an aside I'm still pretty sure that one of my colleagues who's been 
using qpid::client had similar issues and he sorted it by fiddling with 
the flow control parameters, but I've not seen his code so he might have 
fiddled with other things too.



I'd still really *love* to know how RedHat configured the MRG whitepaper 
tests (preferably in minute detail :-). Are you aware of anyone who has 
*actually* reproduced the figures of 380,000 256 octet messages in and 
out on an 8 core box? I know my laptop has only two cores, but 17,000 is 
a *long* way off 380,000.

I don't think I'm completely stupid, but the examples I attached in the 
earlier post are the fastest I've got Qpid to go. Am I missing 
something? Can you see anything obviously wrong with my code (or queue 
config)?

Cheers,
Frase


Gordon Sim wrote:
> On 09/25/2011 07:43 PM, Fraser Adams wrote:
>> This is really freaky why does the consumer performance drop off
>> dramatically when the ring queue is full. Is it a flow control thing?
>
> No it is not a flow control related issue (that is controlled through 
> the receivers capacity by the way).
>
> When a consumer acknowledges a message on a ring queue, the broker 
> first checks to see whether the messages is still enqueued or whether 
> it has already been displaced by a newer message.
>
> That check is far from optimal, particularly in the case where the 
> messages has been displaced already. I suspect this is what you are 
> seeing.
>
> One suggestion is to acknowledge messages more frequently. The theory 
> here is that this reduces the chance that you acknowledge a message 
> that has already been removed from the queue. Whether in practice this 
> will make a significant difference, I can't say for sure.
>
> A further question is whether you need acknowledgements at all?
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
>
>




Re: Flow Control/Performance tuning help required

Posted by Gordon Sim <gs...@redhat.com>.
On 09/25/2011 07:43 PM, Fraser Adams wrote:
> This is really freaky why does the consumer performance drop off
> dramatically when the ring queue is full. Is it a flow control thing?

No, it is not a flow control related issue (that is controlled through 
the receiver's capacity, by the way).

When a consumer acknowledges a message on a ring queue, the broker first 
checks to see whether the message is still enqueued or whether it has 
already been displaced by a newer message.

That check is far from optimal, particularly in the case where the 
message has been displaced already. I suspect this is what you are seeing.

One suggestion is to acknowledge messages more frequently. The theory 
here is that this reduces the chance that you acknowledge a message that 
has already been removed from the queue. Whether in practice this will 
make a significant difference, I can't say for sure.

A further question is whether you need acknowledgements at all?



Re: Flow Control/Performance tuning help required

Posted by Fraser Adams <fr...@blueyonder.co.uk>.
Hi Andy/Gordon et al.
I really could do with some help from performance gurus.....

OK So I think I've reproduced some of the symptoms I described in my 
earlier email. I used the attached demo producer/consumer 
qpid::messaging clients.

So if I run ./ItemConsumer that creates my perftest queue and waits for 
messages, then if I run ./ItemProducer I get messages whizzing through 
at a decent enough rate.

My box at home is a Dell laptop with an Intel(R) Core(TM)2 Duo CPU P7450 
@ 2.13GHz (cpu MHz 800.000, cache size 3072 KB). Running qpidd plus an 
instance of ItemProducer and ItemConsumer I'm getting ~17,000 900 octet 
messages per second, and top shows qpidd only going as far as ~60%.

Now the weirder symptom is that if I then kill ItemConsumer, run 
ItemProducer for 200,000 messages or so to fill the queue, then run 
ItemConsumer again, it acknowledges the first batch of 20,000 or so 
messages, then hangs, and top shows qpidd maxing out at 100%.
 
At that point ItemProducer still seems to be producing, but ItemConsumer 
only periodically acks. If I then kill ItemProducer I get ItemConsumer 
ramping up again.

I've set the receiver capacity to 500, and though I've tried several 
values it doesn't seem to make much difference.

This is really freaky: why does the consumer performance drop off 
dramatically when the ring queue is full? Is it a flow control thing? 
How can I disable it in qpid::messaging?

Another freaky thing is that I've got tcp-nodelay enabled in my clients, 
but I found that also setting --tcp-nodelay on the broker (testing with 
0.10) makes the performance actually GO DOWN!!!! Setting --tcp-nodelay 
on the broker helps if I don't set it in the clients, but not as much as 
setting it in the clients alone, and if I set it on both the performance 
is less than if I just set it on the clients. Weird!
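
For reference, on the client side I'm setting it as a connection option, 
roughly like this (a sketch; assuming a qpid::messaging build where 
tcp-nodelay is accepted as a connection option):

    // client-side Nagle disable; the broker side is qpidd --tcp-nodelay
    Connection connection("localhost:5672", "{tcp-nodelay: true}");
    connection.open();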

So c'mon folks rise to the challenge and suggest some settings to make 
my clients smoke (and stop being weird when the queue fills).

And how do I get qpidd on a multi-core box to show more than 100% CPU? 
Do I have to run multiple qpidd instances? I can't believe that's the 
case.

MTIA
Frase

Andy Goldstein wrote:
> On Sep 23, 2011, at 9:11 AM, Fraser Adams wrote:
>
>   
>> I'll mention that to the guys when I get back to the office. Though it seems a bit counterintuitive to me I'd have thought that having a lower number of worker threads wouldn't utilise the available cores. By "logic" running two (or even eight) worker threads on your 48 core server seems low - any idea what's going on to explain your results??
>>     
>
> I can't say for sure, but I would guess there's maybe more lock contention going on in the broker when you have more threads.
>
>   
>> So have you reproduced the MRG paper results? That paper, which is over three years old now, has figures of 380,000 256 octet messages in plus out on a 2 x 4 core Xeon box. We've not come *close* to that figure and my developers are far from dummies. The paper describes the methodology quite well, but doesn't quite spell out as a tutorial exactly what the setup was.
>>     
>
> What numbers are you getting, and how are you testing?
>
>
>   

Re: Flow Control/Performance tuning help required

Posted by Andy Goldstein <ag...@redhat.com>.
On Sep 23, 2011, at 9:11 AM, Fraser Adams wrote:

> 
> I'll mention that to the guys when I get back to the office. Though it seems a bit counterintuitive to me I'd have thought that having a lower number of worker threads wouldn't utilise the available cores. By "logic" running two (or even eight) worker threads on your 48 core server seems low - any idea what's going on to explain your results??

I can't say for sure, but I would guess there's maybe more lock contention going on in the broker when you have more threads.

> So have you reproduced the MRG paper results? That paper, which is over three years old now, has figures of 380,000 256 octet messages in plus out on a 2 x 4 core Xeon box. We've not come *close* to that figure and my developers are far from dummies. The paper describes the methodology quite well, but doesn't quite spell out as a tutorial exactly what the setup was.

What numbers are you getting, and how are you testing?

> 
> I don't suppose you (or anyone else) has any help on the other part of my question about consumer flow control??

I'm not too familiar with consumer flow control, unless you're talking about using a prefetch capacity on a receiver.

Andy


> Cheers,
> Frase
> 
> Andy Goldstein wrote:
>> As an experiment, try lowering the # of worker threads for the broker.  For example, we saw an order of magnitude increase in performance when we dropped worker threads from 8 to 2 (on a 48-core server).  Our test involved creating a ring queue with a max queue count of 250,000 messages.  We pre-filled the queue with 259 byte messages, and then had a multi-threaded client start at least 3 threads, 1 connection/session/sender per thread, and had them try to send as many 259 byte messages/second as possible.  Decreasing the # of worker threads in the broker gave us better throughput.
>> 
>> Andy
>> 
>> On Sep 23, 2011, at 8:05 AM, Fraser Adams wrote:
>> 
>>  
>>> Hi Andy,
>>> I'm afraid that I can't tell you for sure as I'm doing this a bit by "remote control" (I've tasked some of my developers to try and replicate the MRG whitepaper throughput results to give us a baseline top level performance figure).
>>> 
>>> However when I last spoke to them they had tried sending a load of ~900 octet messages to a ring queue set to 2GB, but to rule out any memory issues (shouldn't be as the box has 24GB) they have also tried with a ring queue of the default size of 100M - they got the same problem, it just happened a lot sooner obviously.
>>> 
>>> Fraser
>>> 
>>> 
>>> Andy Goldstein wrote:
>>>    
>>>> Hi Fraser,
>>>> 
>>>> How many messages can the ring queue hold before it starts dropping old messages to make room for new ones?
>>>> 
>>>> Andy
>>>> 
>>>> On Sep 23, 2011, at 5:21 AM, Fraser Adams wrote:
>>>> 
>>>>       
>>>>> Hello all,
>>>>> I was chatting to some colleagues yesterday who are trying to do some stress testing and have noticed some weird results.
>>>>> 
>>>>> I'm afraid I've not personally reproduced this yet, but I wanted to post on a Friday whilst the list was more active.
>>>>> 
>>>>> The set up is firing off messages of ~900 octets in size into a queue with a ring limit policy and I'm pretty sure they are using Qpid 0.8
>>>>> 
>>>>> As I understand it they have a few producers and a consumers and the "steady state" message rate is OKish, but if they kill off a couple of consumers to force the queue to start filling what seems to happen (as described to me) is that when the (ring) queue fills up to its limit (and I guess starts overwriting) the consumer rate plummets massively.
>>>>> 
>>>>> As I say I've not personally tried this yet, but as it happens another colleague was doing something independently and he reported something similar. He was using the C++ qpid::client API and from what I can gather did a bit of digging and found a command to disable consumer flow control, which seemed to solve his particular issue.
>>>>> 
>>>>> 
>>>>> Do the scenarios above sound like flow control issues? I'm afraid I've not looked much at this and the only documentation I can find relates to the producer flow control feature introduced in 0.10 which isn't applicable here as a) the issues were seen in a 0.8 broker and b) as far as the doc goes producer flow control isn't applied on ring queues.
>>>>> 
>>>>> The colleague who did the tinkering on qpid::client I believe figured it out from the low-level doxygen API documentation, but I've not seen anything in the higher level documents and I've certainly not seen anything in the qpid::messaging or JMS stuff (which is mostly where my own experience comes from). I'd definitely like to be able to disable it from Java and qpid::messaging too.
>>>>> 
>>>>> 
>>>>> I'd appreciate a brain dump of distilled flow control knowledge that I can pass on if that's possible!!!
>>>>> 
>>>>> 
>>>>> As an aside, another thing seemed slightly weird to me. My colleagues are running an a 16 core Linux box and the worker threads are set to 17 as expected however despite running with I think 8 producers and 32 consumers the CPU usage reported by top maxes out at 113% this seems massively low on a 16 core box and I'd have hoped to see a massively higher message rate than they are actually seeing and the CPU usage getting closer to 1600%. Is there something "special" that needs to be done to make best use out of a nice big multicore Xeon box. IIRC the MRG whitepaper mentions "Use taskset to start qpid-daemon on all cpus". This isn't something I'm familiar with but looks like it relates to CPU affinity, but to my mind that doesn't account for maxing out at only a fraction of the available CPU capacity (it's not network bound BTW).
>>>>> 
>>>>> 
>>>>> Are there any tutorials on how to obtain the absolute maximum super turbo message throughput :-) We're not even coming *close* to the figures quoted in the MRG whitepaper despite running of more powerful hardware, so I'm assuming we're doing something wrong unless the MRG figures are massively exaggerated.
>>>>> 
>>>>> 
>>>>> Many thanks
>>>>> Frase
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> ---------------------------------------------------------------------
>>>>> Apache Qpid - AMQP Messaging Implementation
>>>>> Project:      http://qpid.apache.org
>>>>> Use/Interact: mailto:users-subscribe@qpid.apache.org
>>>>> 
>>>>>           
>>>> ---------------------------------------------------------------------
>>>> Apache Qpid - AMQP Messaging Implementation
>>>> Project:      http://qpid.apache.org
>>>> Use/Interact: mailto:users-subscribe@qpid.apache.org
>>>> 
>>>> 
>>>>       
>>> ---------------------------------------------------------------------
>>> Apache Qpid - AMQP Messaging Implementation
>>> Project:      http://qpid.apache.org
>>> Use/Interact: mailto:users-subscribe@qpid.apache.org
>>> 
>>>    
>> 
>> 
>> ---------------------------------------------------------------------
>> Apache Qpid - AMQP Messaging Implementation
>> Project:      http://qpid.apache.org
>> Use/Interact: mailto:users-subscribe@qpid.apache.org
>> 
>> 
>>  
> 
> 
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
> 




Re: Flow Control/Performance tuning help required

Posted by Fraser Adams <fr...@blueyonder.co.uk>.
I'll mention that to the guys when I get back to the office, though it 
seems a bit counterintuitive to me: I'd have thought that having a lower 
number of worker threads wouldn't utilise the available cores. By that 
logic, running two (or even eight) worker threads on your 48 core server 
seems low; any idea what's going on to explain your results?


So have you reproduced the MRG paper results? That paper, which is over 
three years old now, has figures of 380,000 256 octet messages in plus 
out on a 2 x 4 core Xeon box. We've not come *close* to that figure and 
my developers are far from dummies. The paper describes the methodology 
quite well, but doesn't quite spell out as a tutorial exactly what the 
setup was.


I don't suppose you (or anyone else) have any help on the other part of 
my question about consumer flow control?

Cheers,
Frase

Andy Goldstein wrote:
> As an experiment, try lowering the # of worker threads for the broker.  For example, we saw an order of magnitude increase in performance when we dropped worker threads from 8 to 2 (on a 48-core server).  Our test involved creating a ring queue with a max queue count of 250,000 messages.  We pre-filled the queue with 259 byte messages, and then had a multi-threaded client start at least 3 threads, 1 connection/session/sender per thread, and had them try to send as many 259 byte messages/second as possible.  Decreasing the # of worker threads in the broker gave us better throughput.
>
> Andy
>
> On Sep 23, 2011, at 8:05 AM, Fraser Adams wrote:
>
>   
>> Hi Andy,
>> I'm afraid that I can't tell you for sure as I'm doing this a bit by "remote control" (I've tasked some of my developers to try and replicate the MRG whitepaper throughput results to give us a baseline top level performance figure).
>>
>> However when I last spoke to them they had tried sending a load of ~900 octet messages to a ring queue set to 2GB, but to rule out any memory issues (shouldn't be as the box has 24GB) they have also tried with a ring queue of the default size of 100M - they got the same problem, it just happened a lot sooner obviously.
>>
>> Fraser
>>
>>
>> Andy Goldstein wrote:
>>     
>>> Hi Fraser,
>>>
>>> How many messages can the ring queue hold before it starts dropping old messages to make room for new ones?
>>>
>>> Andy
>>>
>>> On Sep 23, 2011, at 5:21 AM, Fraser Adams wrote:
>>>
>>>  
>>>       
>>>> Hello all,
>>>> I was chatting to some colleagues yesterday who are trying to do some stress testing and have noticed some weird results.
>>>>
>>>> I'm afraid I've not personally reproduced this yet, but I wanted to post on a Friday whilst the list was more active.
>>>>
>>>> The set up is firing off messages of ~900 octets in size into a queue with a ring limit policy and I'm pretty sure they are using Qpid 0.8
>>>>
>>>> As I understand it they have a few producers and a consumers and the "steady state" message rate is OKish, but if they kill off a couple of consumers to force the queue to start filling what seems to happen (as described to me) is that when the (ring) queue fills up to its limit (and I guess starts overwriting) the consumer rate plummets massively.
>>>>
>>>> As I say I've not personally tried this yet, but as it happens another colleague was doing something independently and he reported something similar. He was using the C++ qpid::client API and from what I can gather did a bit of digging and found a command to disable consumer flow control, which seemed to solve his particular issue.
>>>>
>>>>
>>>> Do the scenarios above sound like flow control issues? I'm afraid I've not looked much at this and the only documentation I can find relates to the producer flow control feature introduced in 0.10 which isn't applicable here as a) the issues were seen in a 0.8 broker and b) as far as the doc goes producer flow control isn't applied on ring queues.
>>>>
>>>> The colleague who did the tinkering on qpid::client I believe figured it out from the low-level doxygen API documentation, but I've not seen anything in the higher level documents and I've certainly not seen anything in the qpid::messaging or JMS stuff (which is mostly where my own experience comes from). I'd definitely like to be able to disable it from Java and qpid::messaging too.
>>>>
>>>>
>>>> I'd appreciate a brain dump of distilled flow control knowledge that I can pass on if that's possible!!!
>>>>
>>>>
>>>> As an aside, another thing seemed slightly weird to me. My colleagues are running an a 16 core Linux box and the worker threads are set to 17 as expected however despite running with I think 8 producers and 32 consumers the CPU usage reported by top maxes out at 113% this seems massively low on a 16 core box and I'd have hoped to see a massively higher message rate than they are actually seeing and the CPU usage getting closer to 1600%. Is there something "special" that needs to be done to make best use out of a nice big multicore Xeon box. IIRC the MRG whitepaper mentions "Use taskset to start qpid-daemon on all cpus". This isn't something I'm familiar with but looks like it relates to CPU affinity, but to my mind that doesn't account for maxing out at only a fraction of the available CPU capacity (it's not network bound BTW).
>>>>
>>>>
>>>> Are there any tutorials on how to obtain the absolute maximum super turbo message throughput :-) We're not even coming *close* to the figures quoted in the MRG whitepaper despite running of more powerful hardware, so I'm assuming we're doing something wrong unless the MRG figures are massively exaggerated.
>>>>
>>>>
>>>> Many thanks
>>>> Frase
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> Apache Qpid - AMQP Messaging Implementation
>>>> Project:      http://qpid.apache.org
>>>> Use/Interact: mailto:users-subscribe@qpid.apache.org
>>>>
>>>>    
>>>>         
>>> ---------------------------------------------------------------------
>>> Apache Qpid - AMQP Messaging Implementation
>>> Project:      http://qpid.apache.org
>>> Use/Interact: mailto:users-subscribe@qpid.apache.org
>>>
>>>
>>>  
>>>       
>> ---------------------------------------------------------------------
>> Apache Qpid - AMQP Messaging Implementation
>> Project:      http://qpid.apache.org
>> Use/Interact: mailto:users-subscribe@qpid.apache.org
>>
>>     
>
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
>
>
>   




Re: Flow Control/Performance tuning help required

Posted by Andy Goldstein <ag...@redhat.com>.
As an experiment, try lowering the # of worker threads for the broker.  For example, we saw an order of magnitude increase in performance when we dropped worker threads from 8 to 2 (on a 48-core server).  Our test involved creating a ring queue with a max queue count of 250,000 messages.  We pre-filled the queue with 259 byte messages, and then had a multi-threaded client start at least 3 threads, 1 connection/session/sender per thread, and had them try to send as many 259 byte messages/second as possible.  Decreasing the # of worker threads in the broker gave us better throughput.
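
As a concrete starting point (a sketch only, adjust to taste), that just 
means starting the broker with an explicit thread count rather than the 
default of cores + 1, e.g. (the --auth no is only to keep a test broker 
simple):

    qpidd --worker-threads 2 --auth no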

Andy

On Sep 23, 2011, at 8:05 AM, Fraser Adams wrote:

> Hi Andy,
> I'm afraid that I can't tell you for sure as I'm doing this a bit by "remote control" (I've tasked some of my developers to try and replicate the MRG whitepaper throughput results to give us a baseline top level performance figure).
> 
> However when I last spoke to them they had tried sending a load of ~900 octet messages to a ring queue set to 2GB, but to rule out any memory issues (shouldn't be as the box has 24GB) they have also tried with a ring queue of the default size of 100M - they got the same problem, it just happened a lot sooner obviously.
> 
> Fraser
> 
> 
> Andy Goldstein wrote:
>> Hi Fraser,
>> 
>> How many messages can the ring queue hold before it starts dropping old messages to make room for new ones?
>> 
>> Andy
>> 
>> On Sep 23, 2011, at 5:21 AM, Fraser Adams wrote:
>> 
>>  
>>> Hello all,
>>> I was chatting to some colleagues yesterday who are trying to do some stress testing and have noticed some weird results.
>>> 
>>> I'm afraid I've not personally reproduced this yet, but I wanted to post on a Friday whilst the list was more active.
>>> 
>>> The set up is firing off messages of ~900 octets in size into a queue with a ring limit policy and I'm pretty sure they are using Qpid 0.8
>>> 
>>> As I understand it they have a few producers and a consumers and the "steady state" message rate is OKish, but if they kill off a couple of consumers to force the queue to start filling what seems to happen (as described to me) is that when the (ring) queue fills up to its limit (and I guess starts overwriting) the consumer rate plummets massively.
>>> 
>>> As I say I've not personally tried this yet, but as it happens another colleague was doing something independently and he reported something similar. He was using the C++ qpid::client API and from what I can gather did a bit of digging and found a command to disable consumer flow control, which seemed to solve his particular issue.
>>> 
>>> 
>>> Do the scenarios above sound like flow control issues? I'm afraid I've not looked much at this and the only documentation I can find relates to the producer flow control feature introduced in 0.10 which isn't applicable here as a) the issues were seen in a 0.8 broker and b) as far as the doc goes producer flow control isn't applied on ring queues.
>>> 
>>> The colleague who did the tinkering on qpid::client I believe figured it out from the low-level doxygen API documentation, but I've not seen anything in the higher level documents and I've certainly not seen anything in the qpid::messaging or JMS stuff (which is mostly where my own experience comes from). I'd definitely like to be able to disable it from Java and qpid::messaging too.
>>> 
>>> 
>>> I'd appreciate a brain dump of distilled flow control knowledge that I can pass on if that's possible!!!
>>> 
>>> 
>>> As an aside, another thing seemed slightly weird to me. My colleagues are running an a 16 core Linux box and the worker threads are set to 17 as expected however despite running with I think 8 producers and 32 consumers the CPU usage reported by top maxes out at 113% this seems massively low on a 16 core box and I'd have hoped to see a massively higher message rate than they are actually seeing and the CPU usage getting closer to 1600%. Is there something "special" that needs to be done to make best use out of a nice big multicore Xeon box. IIRC the MRG whitepaper mentions "Use taskset to start qpid-daemon on all cpus". This isn't something I'm familiar with but looks like it relates to CPU affinity, but to my mind that doesn't account for maxing out at only a fraction of the available CPU capacity (it's not network bound BTW).
>>> 
>>> 
>>> Are there any tutorials on how to obtain the absolute maximum super turbo message throughput :-) We're not even coming *close* to the figures quoted in the MRG whitepaper despite running of more powerful hardware, so I'm assuming we're doing something wrong unless the MRG figures are massively exaggerated.
>>> 
>>> 
>>> Many thanks
>>> Frase
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> Apache Qpid - AMQP Messaging Implementation
>>> Project:      http://qpid.apache.org
>>> Use/Interact: mailto:users-subscribe@qpid.apache.org
>>> 
>>>    
>> 
>> 
>> ---------------------------------------------------------------------
>> Apache Qpid - AMQP Messaging Implementation
>> Project:      http://qpid.apache.org
>> Use/Interact: mailto:users-subscribe@qpid.apache.org
>> 
>> 
>>  
> 
> 
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
> 




Re: Flow Control/Performance tuning help required

Posted by Fraser Adams <fr...@blueyonder.co.uk>.
Hi Andy,
I'm afraid that I can't tell you for sure as I'm doing this a bit by 
"remote control" (I've tasked some of my developers to try and replicate 
the MRG whitepaper throughput results to give us a baseline top level 
performance figure).

However when I last spoke to them they had tried sending a load of ~900 
octet messages to a ring queue set to 2GB, but to rule out any memory 
issues (shouldn't be as the box has 24GB) they have also tried with a 
ring queue of the default size of 100M - they got the same problem, it 
just happened a lot sooner obviously.

Fraser


Andy Goldstein wrote:
> Hi Fraser,
>
> How many messages can the ring queue hold before it starts dropping old messages to make room for new ones?
>
> Andy
>
> On Sep 23, 2011, at 5:21 AM, Fraser Adams wrote:
>
>   
>> Hello all,
>> I was chatting to some colleagues yesterday who are trying to do some stress testing and have noticed some weird results.
>>
>> I'm afraid I've not personally reproduced this yet, but I wanted to post on a Friday whilst the list was more active.
>>
>> The set up is firing off messages of ~900 octets in size into a queue with a ring limit policy and I'm pretty sure they are using Qpid 0.8
>>
>> As I understand it they have a few producers and a consumers and the "steady state" message rate is OKish, but if they kill off a couple of consumers to force the queue to start filling what seems to happen (as described to me) is that when the (ring) queue fills up to its limit (and I guess starts overwriting) the consumer rate plummets massively.
>>
>> As I say I've not personally tried this yet, but as it happens another colleague was doing something independently and he reported something similar. He was using the C++ qpid::client API and from what I can gather did a bit of digging and found a command to disable consumer flow control, which seemed to solve his particular issue.
>>
>>
>> Do the scenarios above sound like flow control issues? I'm afraid I've not looked much at this and the only documentation I can find relates to the producer flow control feature introduced in 0.10 which isn't applicable here as a) the issues were seen in a 0.8 broker and b) as far as the doc goes producer flow control isn't applied on ring queues.
>>
>> The colleague who did the tinkering on qpid::client I believe figured it out from the low-level doxygen API documentation, but I've not seen anything in the higher level documents and I've certainly not seen anything in the qpid::messaging or JMS stuff (which is mostly where my own experience comes from). I'd definitely like to be able to disable it from Java and qpid::messaging too.
>>
>>
>> I'd appreciate a brain dump of distilled flow control knowledge that I can pass on if that's possible!!!
>>
>>
>> As an aside, another thing seemed slightly weird to me. My colleagues are running an a 16 core Linux box and the worker threads are set to 17 as expected however despite running with I think 8 producers and 32 consumers the CPU usage reported by top maxes out at 113% this seems massively low on a 16 core box and I'd have hoped to see a massively higher message rate than they are actually seeing and the CPU usage getting closer to 1600%. Is there something "special" that needs to be done to make best use out of a nice big multicore Xeon box. IIRC the MRG whitepaper mentions "Use taskset to start qpid-daemon on all cpus". This isn't something I'm familiar with but looks like it relates to CPU affinity, but to my mind that doesn't account for maxing out at only a fraction of the available CPU capacity (it's not network bound BTW).
>>
>>
>> Are there any tutorials on how to obtain the absolute maximum super turbo message throughput :-) We're not even coming *close* to the figures quoted in the MRG whitepaper despite running of more powerful hardware, so I'm assuming we're doing something wrong unless the MRG figures are massively exaggerated.
>>
>>
>> Many thanks
>> Frase
>>
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> Apache Qpid - AMQP Messaging Implementation
>> Project:      http://qpid.apache.org
>> Use/Interact: mailto:users-subscribe@qpid.apache.org
>>
>>     
>
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
>
>
>   




Re: Flow Control/Performance tuning help required

Posted by Andy Goldstein <ag...@redhat.com>.
Hi Fraser,

How many messages can the ring queue hold before it starts dropping old messages to make room for new ones?

Andy

On Sep 23, 2011, at 5:21 AM, Fraser Adams wrote:

> Hello all,
> I was chatting to some colleagues yesterday who are trying to do some stress testing and have noticed some weird results.
> 
> I'm afraid I've not personally reproduced this yet, but I wanted to post on a Friday whilst the list was more active.
> 
> The set up is firing off messages of ~900 octets in size into a queue with a ring limit policy and I'm pretty sure they are using Qpid 0.8
> 
> As I understand it they have a few producers and a consumers and the "steady state" message rate is OKish, but if they kill off a couple of consumers to force the queue to start filling what seems to happen (as described to me) is that when the (ring) queue fills up to its limit (and I guess starts overwriting) the consumer rate plummets massively.
> 
> As I say I've not personally tried this yet, but as it happens another colleague was doing something independently and he reported something similar. He was using the C++ qpid::client API and from what I can gather did a bit of digging and found a command to disable consumer flow control, which seemed to solve his particular issue.
> 
> 
> Do the scenarios above sound like flow control issues? I'm afraid I've not looked much at this and the only documentation I can find relates to the producer flow control feature introduced in 0.10 which isn't applicable here as a) the issues were seen in a 0.8 broker and b) as far as the doc goes producer flow control isn't applied on ring queues.
> 
> The colleague who did the tinkering on qpid::client I believe figured it out from the low-level doxygen API documentation, but I've not seen anything in the higher level documents and I've certainly not seen anything in the qpid::messaging or JMS stuff (which is mostly where my own experience comes from). I'd definitely like to be able to disable it from Java and qpid::messaging too.
> 
> 
> I'd appreciate a brain dump of distilled flow control knowledge that I can pass on if that's possible!!!
> 
> 
> As an aside, another thing seemed slightly weird to me. My colleagues are running an a 16 core Linux box and the worker threads are set to 17 as expected however despite running with I think 8 producers and 32 consumers the CPU usage reported by top maxes out at 113% this seems massively low on a 16 core box and I'd have hoped to see a massively higher message rate than they are actually seeing and the CPU usage getting closer to 1600%. Is there something "special" that needs to be done to make best use out of a nice big multicore Xeon box. IIRC the MRG whitepaper mentions "Use taskset to start qpid-daemon on all cpus". This isn't something I'm familiar with but looks like it relates to CPU affinity, but to my mind that doesn't account for maxing out at only a fraction of the available CPU capacity (it's not network bound BTW).
> 
> 
> Are there any tutorials on how to obtain the absolute maximum super turbo message throughput :-) We're not even coming *close* to the figures quoted in the MRG whitepaper despite running of more powerful hardware, so I'm assuming we're doing something wrong unless the MRG figures are massively exaggerated.
> 
> 
> Many thanks
> Frase
> 
> 
> 
> 
> 
> 
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
> 




Flow Control/Performance tuning help required

Posted by Fraser Adams <fr...@blueyonder.co.uk>.
Hello all,
I was chatting to some colleagues yesterday who are trying to do some 
stress testing and have noticed some weird results.

I'm afraid I've not personally reproduced this yet, but I wanted to post 
on a Friday whilst the list was more active.

The set-up is firing off messages of ~900 octets in size into a queue 
with a ring limit policy, and I'm pretty sure they are using Qpid 0.8.
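
(For context, a ring queue of this sort can be declared with qpid-config 
along these lines; the queue name and size here are illustrative, not 
the actual values used:)

    qpid-config add queue test-queue --max-queue-size 104857600 --limit-policy ring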

As I understand it they have a few producers and consumers, and the 
"steady state" message rate is OKish, but if they kill off a couple of 
consumers to force the queue to start filling, what seems to happen (as 
described to me) is that when the (ring) queue fills up to its limit 
(and I guess starts overwriting) the consumer rate plummets massively.

As I say I've not personally tried this yet, but as it happens another 
colleague was doing something independently and he reported something 
similar. He was using the C++ qpid::client API and from what I can 
gather did a bit of digging and found a command to disable consumer flow 
control, which seemed to solve his particular issue.


Do the scenarios above sound like flow control issues? I'm afraid I've 
not looked much at this and the only documentation I can find relates to 
the producer flow control feature introduced in 0.10 which isn't 
applicable here as a) the issues were seen in a 0.8 broker and b) as far 
as the doc goes producer flow control isn't applied on ring queues.

The colleague who did the tinkering on qpid::client I believe figured it 
out from the low-level doxygen API documentation, but I've not seen 
anything in the higher level documents and I've certainly not seen 
anything in the qpid::messaging or JMS stuff (which is mostly where my 
own experience comes from). I'd definitely like to be able to disable it 
from Java and qpid::messaging too.


I'd appreciate a brain dump of distilled flow control knowledge that I 
can pass on if that's possible!!!


As an aside, another thing seemed slightly weird to me. My colleagues 
are running on a 16 core Linux box and the worker threads are set to 17 
as expected. However, despite running with I think 8 producers and 32 
consumers, the CPU usage reported by top maxes out at 113%. This seems 
massively low on a 16 core box, and I'd have hoped to see a massively 
higher message rate than they are actually seeing, with the CPU usage 
getting closer to 1600%. Is there something "special" that needs to be 
done to make best use of a nice big multicore Xeon box? IIRC the MRG 
whitepaper mentions "Use taskset to start qpid-daemon on all cpus". This 
isn't something I'm familiar with, but it looks like it relates to CPU 
affinity; to my mind, though, that doesn't account for maxing out at 
only a fraction of the available CPU capacity (it's not network bound BTW).


Are there any tutorials on how to obtain the absolute maximum super 
turbo message throughput :-) We're not even coming *close* to the 
figures quoted in the MRG whitepaper despite running on more powerful 
hardware, so I'm assuming we're doing something wrong, unless the MRG 
figures are massively exaggerated.


Many thanks
Frase








Re: Strategies for managing and monitoring large Qpid topologies in mission critical systems

Posted by Fraser Adams <fr...@blueyonder.co.uk>.
Hi David,
I think it's only the Java broker that exposes JMX attributes at the 
moment.

When I (eventually - hopefully next couple of weeks if I don't get 
distracted!!) get the Java QMF2 API done I'm hoping that'll be a good 
starting point for exposing QMF2 Management Objects as MBeans.

I believe that there is something called QMan mentioned in the Wiki that 
"is a tool that dynamically reads the QMF Schema information and creates 
JMX objects that consumed by any JMX console or application server to 
manage Qpid". I never got very far with that nor with the Java QMF1 
Console, which appeared fairly broken when I tried it - certainly with 
respect to the C++ broker (a lot of the brokenness was actually my old 
friend byte[] where String was expected :-)) so I gave up on that and 
started down the QMF2 path.

On the general topic of Management I was only really using OpenView as 
an example - I remain really interested in what people are using in 
practice to manage and monitor their broker networks. I'm looking into 
Cumin which looks quite nice. Does anyone know of nice eye-candy for 
visualising federated broker topologies - or am I going to have to write 
one :-)

Frase


David Karlsen wrote:
> OpenView has an JMX agent - that might be worthwhile?
> Seems like Qpid exposes quite a number of JMX attributes:
> https://cwiki.apache.org/qpid/qpid-jmx-management-console-user-guide.html
>
> 2011/9/21 Fraser Adams <fr...@blueyonder.co.uk>
>
>   
>> Hello all,
>> I'm seeking thoughts from those in the community who have been using Qpid
>> in mission critical systems.
>>
>> So from my observations Qpid seems pretty stable, but there's alway the
>> possibility of exciting little gotchas especially as the complexity grows
>> and one starts to use fairly complex federated topologies of large numbers
>> of brokers. As an example in the early days we got bitten a lot by "blown"
>> links when consumers went down and queues filled to capacity. We're working
>> past that with queue routes/circular queues/servers with largish memory etc.
>>
>> Despite all that I'm expecting some gotchas so I want to turn my attention
>> to managing/monitoring/system health check.
>>
>> So what sort of things are others in similar positions using?
>>
>> The core tools qpid-config, qpid-route, qpid-stat etc. are very useful but
>> they are quite "mandraulic" so are people manually using those and
>> reactively solving problems or is there much in the way of proactive
>> management (fixing things before the clients shout :-) Are people scripting
>> the core tools or writing their own stuff? I've been doing a lot with QMF2
>> lately and there's clearly a huge potential with that, but I don't want to
>> go about reinventing wheels.
>>
>> Has anyone else integrated Qpid with Enterprise System Management tools
>> such as HP OpenView/OperationsManager? If so are they writing bespoke QMF
>> Console apps or scripting things like qpid-stat, qpid-printevents etc.
>>
>> I'd be interested to know what the recommended approaches are and whether
>> there are any "sister projects" looking into this - I'd like to keep things
>> cohesive with best practice and I'd like to avoid going down divergent
>> paths.
>>
>> Hope to hear from you
>> Cheers
>> Frase
>>
>
>
>   




Re: Strategies for managing and monitoring large Qpid topologies in mission critical systems

Posted by David Karlsen <da...@gmail.com>.
OpenView has a JMX agent - that might be worthwhile?
Seems like Qpid exposes quite a number of JMX attributes:
https://cwiki.apache.org/qpid/qpid-jmx-management-console-user-guide.html

2011/9/21 Fraser Adams <fr...@blueyonder.co.uk>

> Hello all,
> I'm seeking thoughts from those in the community who have been using Qpid
> in mission critical systems.
>
> So from my observations Qpid seems pretty stable, but there's alway the
> possibility of exciting little gotchas especially as the complexity grows
> and one starts to use fairly complex federated topologies of large numbers
> of brokers. As an example in the early days we got bitten a lot by "blown"
> links when consumers went down and queues filled to capacity. We're working
> past that with queue routes/circular queues/servers with largish memory etc.
>
> Despite all that I'm expecting some gotchas so I want to turn my attention
> to managing/monitoring/system health check.
>
> So what sort of things are others in similar positions using?
>
> The core tools qpid-config, qpid-route, qpid-stat etc. are very useful but
> they are quite "mandraulic" so are people manually using those and
> reactively solving problems or is there much in the way of proactive
> management (fixing things before the clients shout :-) Are people scripting
> the core tools or writing their own stuff? I've been doing a lot with QMF2
> lately and there's clearly a huge potential with that, but I don't want to
> go about reinventing wheels.
>
> Has anyone else integrated Qpid with Enterprise System Management tools
> such as HP OpenView/OperationsManager? If so are they writing bespoke QMF
> Console apps or scripting things like qpid-stat, qpid-printevents etc.
>
> I'd be interested to know what the recommended approaches are and whether
> there are any "sister projects" looking into this - I'd like to keep things
> cohesive with best practice and I'd like to avoid going down divergent
> paths.
>
> Hope to hear from you
> Cheers
> Frase
>


-- 
--
David J. M. Karlsen - http://www.linkedin.com/in/davidkarlsen