Posted to users@activemq.apache.org by Sean Bastille <se...@gmail.com> on 2008/02/06 19:08:56 UTC

High message rate research

Hello,

I'm investigating using ActiveMQ in an application with a very high
transaction rate.  This is actually a follow-up to a thread that Marc
started a few months ago,
http://www.nabble.com/Questions-on-Network-of-Brokers-and-high-message-rates-to14145093s2354.html
and all of that discussion fully applies here.  The quick summary is:
- Large deployment, expect at least 100 hosts.
- ~200k messages/sec to multiple destinations
- Main concerns are scalability and high-availability, without persistence.
- We do not need guaranteed delivery, 99.9999% is good enough.

From the research I've done, the only solution that seems viable is
configuring a network of embedded brokers.  Hub/spoke doesn't scale the way
we need it to, and a regular network of brokers seems to have too much risk
of hot spots.  That last statement is debatable, I'm sure, since technically
there is no difference between what I am planning and a network of brokers;
however, factors such as configuration management make embedding the broker
easier to conceptualize and manage than running many standalone brokers.

Assuming a network of embedded brokers, the first part of the topology is
expected to be 50 producer processes, each with an embedded broker, with
each broker connected to each of 20 consumer processes, and each consumer
subscribing to the same distributed queue.  That's 1,000 network
connections, but distributed across 70 hosts, with each broker really having
no more than 50 connections it needs to worry about.
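
To make that concrete, here is a rough sketch of one consumer process with
an embedded broker (an illustration only; the broker name, port, and queue
name are placeholders, not our actual configuration):

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageListener;
import javax.jms.Session;

import org.apache.activemq.ActiveMQConnectionFactory;
import org.apache.activemq.broker.BrokerService;

public class ConsumerProcess {
    public static void main(String[] args) throws Exception {
        // Embedded, non-persistent broker with a listen port that the
        // producer-side brokers can connect to.
        BrokerService broker = new BrokerService();
        broker.setBrokerName("consumer-1");
        broker.setPersistent(false);
        broker.addConnector("nio://0.0.0.0:61616");
        broker.start();

        // The local consumer attaches over the vm:// transport and
        // subscribes to the shared distributed queue.
        ConnectionFactory factory =
                new ActiveMQConnectionFactory("vm://consumer-1?create=false");
        Connection connection = factory.createConnection();
        connection.start();

        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer =
                session.createConsumer(session.createQueue("DISTRIBUTED.QUEUE"));
        consumer.setMessageListener(new MessageListener() {
            public void onMessage(Message message) {
                // application processing goes here
            }
        });
    }
}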

While testing whether this would actually work with ActiveMQ, I found a
couple of problems, each with a workaround.  These were tested with both the
5.0.0 and 5.1-SNAPSHOT Java APIs.

1) Synchronizing on a BrokerService while calling
ConnectionFactory.createConnection().start() will cause a deadlock.  I
wouldn't say this is a bug, just notable unexpected behavior.
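
To make the pattern concrete, this is roughly the shape of the code that
hung for me (a sketch, not the exact test code; the vm:// URL and broker
name are just examples):

// assumes the same imports as the consumer sketch above
BrokerService broker = new BrokerService();
broker.setBrokerName("embedded");
broker.setPersistent(false);
broker.start();

ConnectionFactory factory = new ActiveMQConnectionFactory("vm://embedded");

synchronized (broker) {
    // Holding the BrokerService monitor while starting the connection is
    // what deadlocked in my tests.
    Connection connection = factory.createConnection();
    connection.start();
}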

2) Calling Connection.start() after Broker.start() has already been called
leaves a connection with a default broker name.  This probably takes a bit
longer to explain.

So in my testing of the above configuration, I am expecting a dynamic set of
connections to the 20 consumers.  The consumer complex may need to
grow, or get migrated to a different set of hosts, so I'm going to need to
add and remove broker links at runtime.  The main design is that each broker
has a listen port, used or not, and connects to each broker that has a
locally attached consumer that it is interested in.  To support this, I add
links after the broker has been started and then call connection.start().
When multiple producers connect to a single consumer, the first connection
works fine, but the second connection throws:

javax.jms.InvalidClientIDException: Broker: localhost-61616 - Client:
NC_localhost_outbound already connected from /127.0.0.1:1472

Unfortunately the stacktrace isn't helpful because this exception is thrown
by the consumer, while the bug is actually in the producer.
NC_localhost_outbound should actually have been named
NC_localhost-61618_outbound, but when
DemandForwardingBridgeSupport.startRemoteBridge() calls
configuration.getBrokerName(), it returns the default broker name.

Broker.start() calls connection.setBrokerName before calling
connection.start(), but I am adding connections after Broker.start() has
already been called.  I can get around this by calling
connection.setBrokerName directly.
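
Roughly, the workaround looks like this (a sketch only; the "connection" in
my description above is the NetworkConnector returned by
addNetworkConnector, and the host names and ports are placeholders):

// assumes org.apache.activemq.broker.BrokerService and
// org.apache.activemq.network.NetworkConnector are imported
BrokerService broker = new BrokerService();
broker.setBrokerName("localhost-61618");
broker.setPersistent(false);
broker.addConnector("nio://0.0.0.0:61618");
broker.start();

// later, at runtime, when a new consumer broker needs to be linked in
NetworkConnector connection =
        broker.addNetworkConnector("static:(nio://consumer-host:61616)");
// Broker.start() has already run, so it won't set the broker name for us;
// without this the bridge identifies itself as NC_localhost_outbound and
// the second producer gets the InvalidClientIDException above.
connection.setBrokerName(broker.getBrokerName());
connection.start();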

-- End of 2


So far I've only tested 1->many and many->1, and both seem to work now.  My
next test will be many->many, and I'll let you know how that goes, but this
is really just one part of a larger problem.

From the example above, we can refer to the 50 producers as Complex A and
the 20 consumers as Complex B.  There is also a Complex C of about 20
processes, and a Complex D of about 20 processes.  C is a consumer of B, and
D is a consumer of both A and C.

As for message rates:
A -> B: ~200k/sec
A -> D: ~200k/sec
B -> C: ~200k/sec
C -> D: 30M in one burst every hour - can run slower if necessary.


My real question is, given what I've described about this setup so far, will
it work?  Should I expect to run into any circular or inefficient routing
problems? Any other problems?

The first step in this will actually be a bridge between our current system
and this JMS-based solution, leveraging the C++ API.  I've read through the
thread from Hellweek (
http://www.nabble.com/ActiveMQ-thoughts-to14262131s2354.html).  Were the
problems he found confined to C++ working with C#, or might there also be
problems using C++ with Java?

Thanks in advance,

Sean Bastille

Re: High message rate research

Posted by Sean Bastille <se...@gmail.com>.
Hi Joe,

I didn't pay too much attention to message size.  In production our messages
are 34 bytes plus some headers.  For this testing the average ended up being
26 bytes.

While testing the standalone broker, yes, it was a completely default
configuration.

For the embedded brokers, I did not use any config files, so it would be
whatever the code defaulted to.  I set the following:
- Used nio instead of tcp (there was a performance gain)
- deliveryMode NON_PERSISTENT
- AUTO_ACKNOWLEDGE
- consumer.prefetchSize=1

I had forgotten about the prefetch size, so I just ran a 1:1 embedded test
with prefetchSize set to 1000.  No change in throughput.
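
For reference, the client side of each process looked roughly like the
sketch below (not the actual test code; the vm:// broker name and queue
name are placeholders, and nio was used on the broker-to-broker connectors
rather than here):

// assumes javax.jms.* and org.apache.activemq.ActiveMQConnectionFactory
ConnectionFactory factory =
        new ActiveMQConnectionFactory("vm://embedded?create=false");
Connection connection = factory.createConnection();
connection.start();
Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

// per-destination prefetch set via a destination option
Queue fromQueue = session.createQueue("TEST.QUEUE?consumer.prefetchSize=1");
MessageConsumer consumer = session.createConsumer(fromQueue);

// producer side: non-persistent delivery on the same auto-ack session
Queue toQueue = session.createQueue("TEST.QUEUE");
MessageProducer producer = session.createProducer(toQueue);
producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);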

Thanks,

Sean

On Feb 11, 2008 1:15 PM, ttmdev <jo...@ttmsolutions.com> wrote:

>
> Hi Sean,
>
> Curious to know the average message size used and whether you used the
> default broker configuration?
>
> Thanks,
> Joe
> www.ttmsolutions.com
>

Re: High message rate research

Posted by ttmdev <jo...@ttmsolutions.com>.
Hi Sean,

Curious to know the average message size used and whether you used the
default broker configuration?

Thanks,
Joe
www.ttmsolutions.com




Re: High message rate research

Posted by Sean Bastille <se...@gmail.com>.
I guess my first message was too long, but that's alright; I ended up
getting the answers I needed.  I figured I should let you guys in on the
results of my testing.

The summary of my initial write-up is that my main requirement is message
throughput per host, so that is what my testing has been focused on.  The
bottom line is that I was able to send 4500 messages/sec between two hosts
using 3 producer processes and 2 consumer processes, each using embedded
brokers.  That is the highest sustained throughput I was able to achieve,
and while it was successful, the 2 consumers were consuming ~80% of the CPU
on the host (two 2.4 GHz Xeons with HT enabled), leaving little available
for my processing of the messages.  Interestingly enough, the 3 producers
were consuming ~60% of the CPU on the other host.

As a sanity check, I compared a standalone broker to embedded brokers in a
1:1 configuration.  The standalone maxed out around 1800/sec, and the
embedded brokers sustained 2500/sec, so this looks like it might be a more
capable configuration.

Unfortunately, none of this works for our requirements.  To handle our
current load on existing hardware we need to support at least 8000/sec, and
to allow for future growth, I'd really want 15-20k.

-Sean