Posted to users@activemq.apache.org by Jim Lloyd <jl...@silvertailsystems.com> on 2008/12/04 20:55:16 UTC

High throughput non-persistent pub/sub, and problems with ActiveMQ-CPP 2.2.2

Hello,

I have experience with very high volume pub/sub using Tibco Rendezvous
(multicast) for an internal monitoring & business analytics system that I
led the development of at eBay. That system routinely had over 1Gbps of data
in flight on the datacenter's GigE network, with dozens of blade servers
publishing, and even more blade servers subscribing.

I'm now at a different company, and we're building products that will have a
similar architecture, though likely more modest data volumes. We're using
the ActiveMQ 5.2.0 release and ActiveMQ-CPP 2.2.2 release. I'm still coming
up to speed on the ActiveMQ architecture, configuration, tools, etc. Over
the last couple weeks I've modified the TopicPublisher and TopicListener
examples to determine what level of throughput can be obtained.

My modified TopicPublisher spins up multiple connections, each connection
publishing to multiple topics. The messages published are BytesMessages that
simply have an array of 1000 random bytes. I use a
ScheduledExecutorService.scheduleAtFixedRate() to run tasks that are
triggered every 10 milliseconds. The tasks send a burst of messages. The
number of messages in the burst is computed to achieve a desired aggregate
bandwidth of data published, specified in Megabits per second.
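To make the pacing concrete, here is a minimal, self-contained sketch of the burst-size computation and scheduling described above (class and method names are my own; the real publisher sends JMS BytesMessages inside the scheduled task rather than just incrementing a counter):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class BurstPacer {
    static final int MESSAGE_BYTES = 1000; // 1000-byte payload per message
    static final long PERIOD_MS = 10;      // one burst every 10 milliseconds

    // Messages per burst needed to hit a target payload rate in megabits/sec.
    static long burstSize(double targetMbps) {
        double bytesPerSecond = targetMbps * 1_000_000 / 8;
        double burstsPerSecond = 1000.0 / PERIOD_MS;
        return Math.round(bytesPerSecond / burstsPerSecond / MESSAGE_BYTES);
    }

    public static void main(String[] args) throws Exception {
        // 400 Mbps of 1000-byte payloads = 50,000 msg/s = 500 messages
        // per 10 ms burst (the ~500 Mbps observed on the wire is this
        // payload rate plus TCP and OpenWire framing overhead).
        long burst = burstSize(400);
        System.out.println("burst size = " + burst); // prints "burst size = 500"

        AtomicLong sent = new AtomicLong();
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
        exec.scheduleAtFixedRate(() -> {
            for (long i = 0; i < burst; i++) {
                sent.incrementAndGet(); // real code: producer.send(bytesMessage)
            }
        }, 0, PERIOD_MS, TimeUnit.MILLISECONDS);

        Thread.sleep(105); // let roughly ten bursts fire
        exec.shutdown();
        exec.awaitTermination(1, TimeUnit.SECONDS);
        System.out.println("messages sent = " + sent.get());
    }
}
```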

I was very pleased to find that, with the servers I have available for
testing (8-core 1.6 GHz Xeons with 8 GB of RAM running CentOS 5.2), I was
able to sustain about 500Mbps of physical data (i.e. including TCP header
and OpenWire overhead) from one publisher, through one broker, to one
listener, and run this test for hours without any problems. (For those used
to thinking in terms of messages per second, this is 50K messages/second
with 1K byte messages.) Even better, I can add a second listener, connecting
to the broker on a 2nd ethernet interface, such that the broker was
delivering a total of ~1Gbps of data to the two listeners. This is excellent
performance and gave me a great deal of confidence that we could use
ActiveMQ for our products.

However, I am now trying to write a listener using ActiveMQ-CPP 2.2.2, and
finding that it can't even come close to achieving the throughput that the
Java listener achieves. I started with the SimpleAsyncConsumer sample and
modified it to spin up multiple connections, with each connection
subscribing to a different topic (equivalent to my modified java
TopicListener). The only thing this application does is receive the messages
as fast as possible, and for each message use BytesMessage::getBodyLength()
to keep a running total of bytes received (again, equivalent to the java
listener).

So far, the C++ listener can handle less than a quarter of the volume of
data that the Java listener can. If I keep the data rate low enough, the
C++ listener seems to run fine. But when I push the data rate up to
120 Mbps, all three components (publisher, broker, listener) freeze up in
less than half a minute. The broker admin console shows greater than 90%
of the memory in use. Killing the listener and the publisher leaves the
broker in the same state, and so far the only remedy I know of is to kill
and restart the broker.

I don't yet know if this is purely a "slow consumer" problem, or if the
consumer becomes "slow" because it deadlocks (I have a pstack output that
I'm going to study today and would be happy to make available). I suspect
the latter, since I haven't yet seen any indications of just "slow"
performance before the lockup happens (but I am not yet looking at advisory
messages, which I realize is a major oversight).

FYI, I am currently using the default configuration for the broker, but I do
the following at runtime to configure the pub/sub:

In the Java publisher:

   1. Sessions are created with AUTO_ACKNOWLEDGE
   2. Delivery mode is NON_PERSISTENT
   3. Time to live is 10 seconds

In the Java TopicListener:

   1. Sessions are created with AUTO_ACKNOWLEDGE
   2. Broker URI does not specify any parameters (i.e. do not specify
   jms.prefetchPolicy.all)
   3. Topic URIs do not specify any parameters (i.e. do not specify
   consumer.maximumPendingMessageLimit)

In the C++ Consumer:

   1. Sessions are created with AUTO_ACKNOWLEDGE
   2. Broker URI includes "?jms.prefetchPolicy.all=2000"
   3. Topic URIs include "?consumer.maximumPendingMessageLimit=4000"
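Concretely, the C++ consumer settings above end up in the URIs like this (host, port, and topic name are placeholders for my actual values):

```
# Connection URI: bounded prefetch for all destinations
tcp://localhost:61616?jms.prefetchPolicy.all=2000

# Destination name: cap on pending (not-yet-delivered) messages
topic://TEST.TOPIC.0?consumer.maximumPendingMessageLimit=4000
```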

Note that while both the TopicListener and the SimpleAsyncConsumer used
asynchronous dispatch, I have modified both to do synchronous receives in
their own threads. For the C++ consumer, this results in 3 threads per
connection, and I have been testing with 8 connections. One experiment I
want to do today is revert to asynchronous dispatch, assuming this will
bring me back to 2 threads per connection.

I still have more investigation to do, and it is possible that this
investigation will yield enough specifics to file a bug report. It's also
possible that I'll find I've made some newbie mistake. However, in the
research I've done so far I've seen indications that ActiveMQ-CPP 2.2.1
had known problems similar to these, and at least one known deadlock
related to CmsTemplate still exists in the 2.2.2 release.

I am writing because I would appreciate help from AMQ developers or any
experienced users in the AMQ community who would be interested in checking
my work to rule out newbie mistakes. I would be happy to make the source
code for my modified examples available to anyone that is interested.

Some questions I would like to ask here: What is the right way to configure
publishers, brokers, and listeners for high volumes of messages when some
data loss is considered entirely acceptable? Suppose a system is allowed a
two-nines (99.0%) SLA for message delivery, measured monthly, if that is
what it takes to achieve high stability. Can the broker be configured such
that it will never deadlock, even if a consumer deadlocks?
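For reference, the kind of broker-side policy I have in mind is something like the following activemq.xml fragment (I have not tested this; the limit value is a placeholder), which as I understand it lets the broker discard messages for slow topic consumers instead of backing up:

```xml
<!-- Inside the <broker> element of activemq.xml -->
<destinationPolicy>
  <policyMap>
    <policyEntry topic=">">
      <!-- Keep at most N pending messages per slow topic consumer;
           excess messages are dropped rather than blocking the broker. -->
      <pendingMessageLimitStrategy>
        <constantPendingMessageLimitStrategy limit="1000"/>
      </pendingMessageLimitStrategy>
    </policyEntry>
  </policyMap>
</destinationPolicy>
```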

Thanks,
Jim Lloyd
Principal Architect
Silver Tail Systems Inc.

Re: High throughput non-persistent pub/sub, and problems with ActiveMQ-CPP 2.2.2

Posted by Jim Lloyd <jl...@silvertailsystems.com>.
Thanks Tim. I'm in the process of subscribing to advisories. I'd like to
have done at least that before I post the code, but I hope to do that before
the end of the day today.

-Jim

On Thu, Dec 4, 2008 at 1:13 PM, Timothy Bish <ta...@gmail.com> wrote:

> Feel free to post or send any code that you would like reviewed in
> regards to the modified CPP examples; I will take a look and let you know
> if I see any obvious gotchas. The deadlock that is currently known
> seems to only crop up when using CmsTemplate, and only at shutdown, so if
> it is deadlocking on high volume it's probably something new that we
> haven't seen yet.
>
> Obviously if you can come up with some samples that can lock up the
> client those would be invaluable in finding the root cause.
>
> Regards
> Tim
>
> On Thu, 2008-12-04 at 11:55 -0800, Jim Lloyd wrote:
> > [original message quoted in full; snipped]

Re: High throughput non-persistent pub/sub, and problems with ActiveMQ-CPP 2.2.2

Posted by Timothy Bish <ta...@gmail.com>.
Feel free to post or send any code that you would like reviewed in
regards to the modified CPP examples; I will take a look and let you know
if I see any obvious gotchas. The deadlock that is currently known
seems to only crop up when using CmsTemplate, and only at shutdown, so if
it is deadlocking on high volume it's probably something new that we
haven't seen yet.

Obviously if you can come up with some samples that can lock up the
client those would be invaluable in finding the root cause.  

Regards
Tim

On Thu, 2008-12-04 at 11:55 -0800, Jim Lloyd wrote:
> [original message quoted in full; snipped]