Posted to users@activemq.apache.org by uromahn <ul...@ulrichromahn.net> on 2014/10/16 16:42:39 UTC

Potential Bug in Master-Slave with Replicated LevelDB Store

I believe I may have found a bug here. However, this could also be a
misconfiguration on my side.

Before I go into a detailed description of my observation, here is my setup:

I have set up the following system in a virtual environment:
I have 3 ZooKeeper nodes.
I have 6 ActiveMQ brokers running 5.10.0.
All nodes (ZK, AMQ) are running on CentOS 6.5 64-bit with the latest OpenJDK.
Three brokers form an active/passive cluster using the replicated LevelDB store.
I have installed native LevelDB 1.7.0, accessed via the JNI driver.
The two clusters form a network of brokers.
The "networkConnectors" are defined in the activemq.xml files of only one
cluster, as duplex connections.

Here is my test case and the situation:
Let's name the six brokers amq1 - amq6. So, the first active/passive cluster
is amq1, amq2, and amq3, with amq1 active and the other two passive. The
second cluster consists of amq4, amq5, and amq6, with amq4 as the active one.
I have one producer connecting to amq1, publishing messages to a VirtualTopic
"VirtualTopic.Test", and a consumer connecting to amq4, reading those messages
from the corresponding queue "Consumer.A.VirtualTopic.Test".
In my test, I am sending 100,000 text messages with a body consisting of
1024 random characters to the VirtualTopic at maximum speed.
However, after about 25,000 to 27,000 messages, the consumer on amq4 times
out after about 10 seconds of not receiving any more messages, although the
producer has already sent all 100,000 messages to amq1.
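For reference, the producer side of my test client does roughly the following
(a simplified sketch rather than my exact code; the JMS wiring, indicated in
the comments, uses the standard ActiveMQ client against my host names):

```java
import java.util.Random;

// Simplified sketch of the test producer. The actual JMS wiring is the
// standard ActiveMQ client, roughly:
//   ConnectionFactory f =
//       new ActiveMQConnectionFactory("failover:(tcp://uromahn-amq1:61616)");
//   MessageProducer p =
//       session.createProducer(session.createTopic("VirtualTopic.Test"));
//   for (int i = 0; i < 100_000; i++)
//       p.send(session.createTextMessage(makePayload(1024, rnd)));
public class PayloadSketch {
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    // Build a message body of `size` random characters.
    public static String makePayload(int size, Random rnd) {
        StringBuilder sb = new StringBuilder(size);
        for (int i = 0; i < size; i++) {
            sb.append(ALPHABET.charAt(rnd.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(makePayload(1024, new Random()).length());
    }
}
```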
When looking at the log file of amq4, I see the following messages:
....
2014-10-16 12:53:12,556 | INFO  | Started responder end of duplex bridge
link2@ID:uromahn-amq1-9110-48269-1413463991773-0:1 |
org.apache.activemq.broker.TransportConnection | ActiveMQ Transport:
tcp:///xx.xx.xx.xx:58959@61616
2014-10-16 12:53:12,559 | INFO  | Started responder end of duplex bridge
link1@ID:uromahn-amq1-9110-48269-1413463991773-0:1 |
org.apache.activemq.broker.TransportConnection | ActiveMQ Transport:
tcp:///xx.xx.xx.xx:58958@61616
2014-10-16 12:53:12,591 | INFO  | Network connection between
vm://brokergrp2#2 and tcp:///xx.xx.xx.xx:58958@61616 (brokergrp1) has been
established. | org.apache.activemq.network.DemandForwardingBridgeSupport |
triggerStartAsyncNetworkBridgeCreation:
remoteBroker=tcp:///xx.xx.xx.xx:58958@61616, localBroker= vm://brokergrp2#2
2014-10-16 12:53:12,591 | INFO  | Network connection between
vm://brokergrp2#0 and tcp:///10.64.253.198:58959@61616 (brokergrp1) has been
established. | org.apache.activemq.network.DemandForwardingBridgeSupport |
triggerStartAsyncNetworkBridgeCreation:
remoteBroker=tcp:///xx.xx.xx.xx:58959@61616, localBroker= vm://brokergrp2#0
2014-10-16 13:00:10,470 | INFO  | Client session timed out, have not heard
from server in 4071ms for sessionid 0x14918fd40f00007, closing socket
connection and attempting reconnect | org.apache.zookeeper.ClientCnxn |
main-SendThread(uromahn-zk1-9208:2181)
2014-10-16 13:00:10,575 | INFO  | Demoted to slave |
org.apache.activemq.leveldb.replicated.MasterElector | ZooKeeper state
change dispatcher thread
2014-10-16 13:00:10,582 | INFO  | Apache ActiveMQ 5.10.0 (brokergrp2,
ID:uromahn-amq4-9175-46383-1413463934439-0:1) is shutting down |
org.apache.activemq.broker.BrokerService | ActiveMQ
BrokerService[brokergrp2] Task-8
2014-10-16 13:00:10,594 | WARN  | Transport Connection to:
tcp://zz.zz.zz.zz:34737 failed: java.io.IOException: Unexpected error
occured: org.apache.activemq.broker.BrokerStoppedException: Broker
BrokerService[brokergrp2] is being stopped |
org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
Transport: tcp:///zz.zz.zz.zz:34737@61616
....

After that, the log file contains a large number of dumped messages for
which the broker could not send an acknowledgement back to amq1.

Looking at the log file of the newly promoted master amq5, I see the
following warning:
....
2014-10-16 13:00:18,422 | INFO  | Network connection between
vm://brokergrp2#0 and tcp:///yy.yy.yy.yy:58256@61616 (brokergrp1) has been
established. | org.apache.activemq.network.DemandForwardingBridgeSupport |
triggerStartAsyncNetworkBridgeCreation:
remoteBroker=tcp:///yy.yy.yy.yy:58256@61616, localBroker= vm://brokergrp2#0
2014-10-16 13:32:20,285 | WARN  | Unexpected session error:
java.io.IOException: Connection reset by peer |
org.apache.activemq.leveldb.replicated.MasterLevelDBStore |
hawtdispatch-DEFAULT-2
2014-10-16 13:32:20,286 | INFO  | Slave has disconnected:
e21da7b3-65a4-4756-a20b-bd77f8eb7607 |
org.apache.activemq.leveldb.replicated.MasterLevelDBStore |
hawtdispatch-DEFAULT-2
....

After that, no messages are transmitted from amq1 to the second
cluster in the network (neither amq4 nor amq5). This can be seen on
the corresponding tab of the web console of amq1. Also, when I
leave all brokers running, I get a message on amq4 that the broker is
waiting for the replicated store to synchronize one record (or something to
that effect - unfortunately I deleted the log already. :( ).

So, here is what appears to happen:
somehow ZooKeeper thinks that amq4 is "dead" and then demotes it to slave in
the middle of the message transfer. This could be because amq4 is too busy
to respond to heartbeat requests. However, something goes wrong in the
LevelDB replication, and the new master amq5 can't pick up. This seems to
lead to a "dead-lock" situation causing the client to time out.
NOTE: all clients are using the failover transport to connect to the
cluster. The connector between the two clusters is also defined using the
failover transport.
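For reference, the store configuration on each broker of a cluster looks
roughly like this (a sketch from memory rather than a verbatim copy of my
activemq.xml; zkAddress, zkPath, and hostname of course differ per cluster
and node, and there is also a ZooKeeper session-timeout setting on the store
whose exact attribute name should be checked against the 5.10 docs):

```xml
<persistenceAdapter>
    <!-- Sketch of the replicated LevelDB store; values are from my setup
         but not copied verbatim from the running configuration. -->
    <replicatedLevelDB
        directory="activemq-data"
        replicas="3"
        bind="tcp://0.0.0.0:0"
        zkAddress="uromahn-zk1:2181,uromahn-zk2:2181,uromahn-zk3:2181"
        zkPath="/activemq/leveldb-stores/brokergrp1"
        hostname="uromahn-amq1"/>
</persistenceAdapter>
```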
Another indication of something "bad" is the fact that the client reports an
exception that the connection to amq4 was reset and that it is trying to
reconnect. However, that reconnect does not seem to happen within 10
seconds, because my message listener has an internal timeout of 10 seconds.
I have double- and triple-checked the configuration and everything looks ok.
Also, this doesn't happen when both producer and consumer connect to the
same cluster, e.g. amq1 or amq4. So, the individual clusters seem to work.
The issue only seems to appear when transmitting messages between clusters
in a network of brokers.

Any idea whether this is a bug or just an SUE (stupid user error)? :)

-Uli



--
View this message in context: http://activemq.2283324.n4.nabble.com/Potential-Bug-in-Master-Slave-with-Replicated-LevelDB-Store-tp4686450.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: Potential Bug in Master-Slave with Replicated LevelDB Store

Posted by Tim Bain <tb...@alumni.duke.edu>.
I spent time last week trying to tune the parallel GC to prevent any
objects from reaching OldGen once the broker was up and running in a steady
state, to try to avoid expensive full GCs.  My goal was zero full GCs for a
broker with 3-6 months of uptime, to prevent clients and other brokers from
failing over from one broker to another.

I increased the size of NewGen relative to OldGen, I increased the size of
Survivor relative to Eden, and I tweaked a few other settings, and I was
never able to avoid a slow stream of objects making it into OldGen that
were deemed dead by the time a full GC happened (usually because I
triggered it manually).  I was able to reduce the rate of object promotion
by about half, and full GCs would probably have been less painful when
OldGen is only 5-10% of the total heap, so the changes should have made
full GCs less frequent and less painful, but I wasn't able to eliminate
them entirely.

So I've given up on the parallel GC and I'm now tweaking G1 to make it
behave as we'd like, and so far the results are far more promising than
with the parallel GC.  So I second Ulrich's recommendation to use G1 rather
than parallel GC, even though the overhead of G1 is several times that of
the parallel GC, if you're more interested in avoiding occasional lengthy
pauses due to full GCs than in getting the highest possible throughput from
your broker.
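For anyone wanting to try the same switch: it is just a matter of adding the
G1 flags to the broker's JVM arguments, e.g. in bin/env (the pause-time goal
below is illustrative, not the value we settled on; tune it against your own
latency budget):

```shell
# Append to bin/env (or wherever ACTIVEMQ_OPTS is assembled) to run the
# broker on G1 with a target pause-time goal. Values are examples only.
ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS -XX:+UseG1GC -XX:MaxGCPauseMillis=50"
export ACTIVEMQ_OPTS
```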

On Tue, Oct 21, 2014 at 10:13 AM, Tim Bain <tb...@alumni.duke.edu> wrote:

> G1GC is great for reducing the duration of any single stop-the-world GC
> (and hence minimizing latency of any individual operation as well as
> avoiding timeouts), but the total time spent performing GCs (and hence the
> total amount of time the brokers are paused) is several times that of the
> parallel GC algorithm, based on some articles I read a couple weeks back.
> So although G1GC should work for a wide range (possibly all) of ActiveMQ
> memory usage patterns and may be the right option for you based on how your
> broker is used, you may get better overall throughput from sticking with
> ParallelGC but adjusting the ratio of YoungGen to OldGen to favor YoungGen
> (increasing the odds that a message gets GC'ed before it gets to OldGen)
> and the ratio of Eden to Survivor within YoungGen to favor Survivor (to
> increase the odds that a message can stick around in YoungGen long enough
> to die before it gets promoted to OldGen).  But you have to be confident
> that your usage patterns won't allow OldGen to fill during the life of your
> broker's uptime (whether that's hours or years), otherwise you'll end up
> doing a long full GC and you'd probably have been better off going with
> G1GC.
>
> For our broker, we expire undelivered messages quickly (under a minute),
> so in theory expanding both YoungGen and Survivor might prevent anything
> from getting into OldGen and thus prevent long full GCs.  I'm actually
> going to be doing this tuning this week, so I'll report out what I find,
> though obviously YMMV since everyone's message usage patterns are different.
>
> On Tue, Oct 21, 2014 at 5:25 AM, uromahn <ul...@ulrichromahn.net> wrote:
>
>> Another update:
>>
>> I ran the broker with the native Java LevelDB and found that I am still
>> seeing the Warnings in the log file as reported before.
>>
>> However, to my surprise the broker seem to perform better and even
>> slightly
>> faster! I always thought the native LevelDB should be faster but I guess
>> the
>> access via JNI may be less optimal than using an embedded Java (or Scala)
>> engine.
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://activemq.2283324.n4.nabble.com/Potential-Bug-in-Master-Slave-with-Replicated-LevelDB-Store-tp4686450p4686583.html
>> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>>
>
>

Re: Potential Bug in Master-Slave with Replicated LevelDB Store

Posted by Tim Bain <tb...@alumni.duke.edu>.
G1GC is great for reducing the duration of any single stop-the-world GC
(and hence minimizing latency of any individual operation as well as
avoiding timeouts), but the total time spent performing GCs (and hence the
total amount of time the brokers are paused) is several times that of the
parallel GC algorithm, based on some articles I read a couple weeks back.
So although G1GC should work for a wide range (possibly all) of ActiveMQ
memory usage patterns and may be the right option for you based on how your
broker is used, you may get better overall throughput from sticking with
ParallelGC but adjusting the ratio of YoungGen to OldGen to favor YoungGen
(increasing the odds that a message gets GC'ed before it gets to OldGen)
and the ratio of Eden to Survivor within YoungGen to favor Survivor (to
increase the odds that a message can stick around in YoungGen long enough
to die before it gets promoted to OldGen).  But you have to be confident
that your usage patterns won't allow OldGen to fill during the life of your
broker's uptime (whether that's hours or years), otherwise you'll end up
doing a long full GC and you'd probably have been better off going with
G1GC.

For our broker, we expire undelivered messages quickly (under a minute), so
in theory expanding both YoungGen and Survivor might prevent anything from
getting into OldGen and thus prevent long full GCs.  I'm actually going to
be doing this tuning this week, so I'll report out what I find, though
obviously YMMV since everyone's message usage patterns are different.
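Concretely, the knobs I have been adjusting are the generation-size ratios,
along these lines (the values are examples, not a recommendation; they have
to be sized against your own heap and message rates):

```shell
# Example ParallelGC sizing that favors YoungGen over OldGen and enlarges
# the survivor spaces relative to Eden. NewRatio=1 makes YoungGen as large
# as OldGen; a smaller SurvivorRatio means larger survivor spaces.
# Values are examples only.
ACTIVEMQ_OPTS="$ACTIVEMQ_OPTS -XX:+UseParallelGC -XX:NewRatio=1 -XX:SurvivorRatio=4"
export ACTIVEMQ_OPTS
```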

On Tue, Oct 21, 2014 at 5:25 AM, uromahn <ul...@ulrichromahn.net> wrote:

> Another update:
>
> I ran the broker with the native Java LevelDB and found that I am still
> seeing the Warnings in the log file as reported before.
>
> However, to my surprise the broker seem to perform better and even slightly
> faster! I always thought the native LevelDB should be faster but I guess
> the
> access via JNI may be less optimal than using an embedded Java (or Scala)
> engine.
>
>
>
>
> --
> View this message in context:
> http://activemq.2283324.n4.nabble.com/Potential-Bug-in-Master-Slave-with-Replicated-LevelDB-Store-tp4686450p4686583.html
> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>

Re: Potential Bug in Master-Slave with Replicated LevelDB Store

Posted by uromahn <ul...@ulrichromahn.net>.
Another update:

I ran the broker with the Java implementation of LevelDB and found that I am
still seeing the warnings in the log file as reported before.

However, to my surprise, the broker seems to perform better and even slightly
faster! I always thought the native LevelDB should be faster, but I guess
access via JNI may be less optimal than using an embedded Java (or Scala)
engine.




--
View this message in context: http://activemq.2283324.n4.nabble.com/Potential-Bug-in-Master-Slave-with-Replicated-LevelDB-Store-tp4686450p4686583.html

Re: Potential Bug in Master-Slave with Replicated LevelDB Store

Posted by uromahn <ul...@ulrichromahn.net>.
Quick update:

I have enabled G1GC for the JVM running the broker and since then have had no
problem again. The master broker stays master even under very heavy load.

So, my suggestion and recommendation when using replicated LevelDB would be
to use the G1 garbage collector, which significantly reduces "stop-the-world"
GCs that cause a time-out on the connection to ZooKeeper, which will
ultimately "demote" a broker to slave without real reason.

However, while running my tests, I noticed the following "Warnings" in the
log files of both slaves:
2014-10-21 09:44:37,694 | WARN  | Invalid log position: 1569963680 |
org.apache.activemq.leveldb.LevelDBClient | Thread-2

There are probably hundreds of those (didn't actually count them) with
obviously changing log positions.

I will retry my tests by switching to the Java implementation of LevelDB.




--
View this message in context: http://activemq.2283324.n4.nabble.com/Potential-Bug-in-Master-Slave-with-Replicated-LevelDB-Store-tp4686450p4686580.html

Re: Potential Bug in Master-Slave with Replicated LevelDB Store

Posted by uromahn <ul...@ulrichromahn.net>.
Based on your suggestion, I looked at the GC behavior of the JVM, and you
were 100% spot on. At the time amq1 got "demoted" to slave, which forced a
failover to amq2, there was a "stop-the-world" GC going on.

Also, I was able to make the failover work correctly with the second cluster
in the network.
In my first cluster, consisting of amq1-3, I had identical
<networkConnectors> sections. Each defined connector had a different name,
as suggested (I have defined 5 connectors to improve throughput by using 5
concurrent connections). However, after the failover the "other side" (amq4)
was "complaining" that the connection already exists from amq1, and hence it
rejected the connection from amq2.
It looks like, in case of such a failover, the connections from amq1 to amq4
don't get cleaned up.

The work-around and solution was to give *every* connection from each
cluster node (amq1, amq2, amq3) a unique name.
So the <networkConnectors> section on amq1 looks like this:
        <networkConnectors>
            <networkConnector name="link1a" duplex="true"
                conduitSubscriptions="false" decreaseNetworkConsumerPriority="false"
                uri="masterslave:(tcp://uromahn-amq4:61616,tcp://uromahn-amq5:61616,tcp://uromahn-amq6:61616)"/>
            <networkConnector name="link2a" duplex="true"
                conduitSubscriptions="false" decreaseNetworkConsumerPriority="false"
                uri="masterslave:(tcp://uromahn-amq4:61616,tcp://uromahn-amq5:61616,tcp://uromahn-amq6:61616)"/>
            <networkConnector name="link3a" duplex="true"
                conduitSubscriptions="false" decreaseNetworkConsumerPriority="false"
                uri="masterslave:(tcp://uromahn-amq4:61616,tcp://uromahn-amq5:61616,tcp://uromahn-amq6:61616)"/>
            <networkConnector name="link4a" duplex="true"
                conduitSubscriptions="false" decreaseNetworkConsumerPriority="false"
                uri="masterslave:(tcp://uromahn-amq4:61616,tcp://uromahn-amq5:61616,tcp://uromahn-amq6:61616)"/>
            <networkConnector name="link5a" duplex="true"
                conduitSubscriptions="false" decreaseNetworkConsumerPriority="false"
                uri="masterslave:(tcp://uromahn-amq4:61616,tcp://uromahn-amq5:61616,tcp://uromahn-amq6:61616)"/>
        </networkConnectors>

and the same section on amq2 like this:
        <networkConnectors>
            <networkConnector name="link1b" duplex="true"
                conduitSubscriptions="false" decreaseNetworkConsumerPriority="false"
                uri="masterslave:(tcp://uromahn-amq4:61616,tcp://uromahn-amq5:61616,tcp://uromahn-amq6:61616)"/>
            <networkConnector name="link2b" duplex="true"
                conduitSubscriptions="false" decreaseNetworkConsumerPriority="false"
                uri="masterslave:(tcp://uromahn-amq4:61616,tcp://uromahn-amq5:61616,tcp://uromahn-amq6:61616)"/>
            <networkConnector name="link3b" duplex="true"
                conduitSubscriptions="false" decreaseNetworkConsumerPriority="false"
                uri="masterslave:(tcp://uromahn-amq4:61616,tcp://uromahn-amq5:61616,tcp://uromahn-amq6:61616)"/>
            <networkConnector name="link4b" duplex="true"
                conduitSubscriptions="false" decreaseNetworkConsumerPriority="false"
                uri="masterslave:(tcp://uromahn-amq4:61616,tcp://uromahn-amq5:61616,tcp://uromahn-amq6:61616)"/>
            <networkConnector name="link5b" duplex="true"
                conduitSubscriptions="false" decreaseNetworkConsumerPriority="false"
                uri="masterslave:(tcp://uromahn-amq4:61616,tcp://uromahn-amq5:61616,tcp://uromahn-amq6:61616)"/>
        </networkConnectors>

(notice the different names, e.g. "link1b" vs "link1a")

And similar on amq3 (which I spare you here).

I ran several tests now and it looks like the failover happens correctly,
with no messages getting lost. I had, however, a few cases where messages
got delivered twice. For example, I sent 100,000 messages from my producer
and the consumer actually received 100,043 messages. Although not ideal,
since I will always have to do duplicate checking, it is better than losing
messages.
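The duplicate checking itself is cheap: on the consumer side I keep a bounded
window of recently seen JMS message IDs, roughly like this (a sketch of the
idea, not my actual listener code; in the real listener, seenBefore() is
called with message.getJMSMessageID() before the payload is processed):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Bounded window of recently seen message IDs for duplicate detection.
public class DedupWindow {
    private final Map<String, Boolean> seen;

    public DedupWindow(final int capacity) {
        // LinkedHashMap in insertion order, evicting the eldest entry
        // once the window grows beyond `capacity` IDs.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> e) {
                return size() > capacity;
            }
        };
    }

    // Returns true if this ID was already seen inside the window.
    public boolean seenBefore(String messageId) {
        return seen.put(messageId, Boolean.TRUE) != null;
    }
}
```

Once the window is larger than the maximum number of in-flight duplicates, a
redelivered message is detected and can simply be dropped.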

One additional note: when the failover happens, the "other" active cluster
node in the network (e.g. amq4) quite often dumps all the messages it
received but could not acknowledge to amq1 into the log. This is not really
good behavior, since nobody would really scan through hundreds of lines in
the log file to identify those messages. It would be better to set up
another DLQ and "dump" the messages there rather than into the log file.

I will run some more tests with the GC changed to G1, hopefully avoiding the
full GC and the demotion of the broker to slave that forces a failover.



--
View this message in context: http://activemq.2283324.n4.nabble.com/Potential-Bug-in-Master-Slave-with-Replicated-LevelDB-Store-tp4686450p4686576.html

Re: Potential Bug in Master-Slave with Replicated LevelDB Store

Posted by Tim Bain <tb...@alumni.duke.edu>.
Take a look at whether the JVM is doing a full garbage collect at the time
when the failover occurs.  Our team has observed clients to failover to an
alternate broker at a time that corresponded to a full GC, and it might be
that the same thing is happening here (but the failover isn't happening
gracefully).  If that's what's going on, you should be able to work around
the problem by tuning your JVM heap and/or your GC strategy, though it
still sounds like there's a bug related to the failover that should be
tracked down and fixed as well.

On Mon, Oct 20, 2014 at 7:36 AM, uromahn <ul...@ulrichromahn.net> wrote:

> Ok, looks like the issue is back again.
>
> The network issues have been fixed.
> It is *not* a slow network - pings between VMs are less than 1ms.
>
> I have not investigated the different throughput but wanted to focus on the
> reliability of the replicated message store.
>
> I made some configuration changes to the network connectors: I defined five
> connectors per node (amq1-3).
>
> Here is what I observed:
> * When I launch one producer connecting to amq1 and one consumer connecting
> to amq4 and send 100,000 messages, everything works fine
> * When I launch five producer connecting to amq1 and five consumer
> connecting to amq4 and send 100,000 messages, still fine
> * When I launch 10 producer connecting to amq1 and 10 consumer connecting
> to
> amq4 and send 100,000 messages, I can see the following:
>   1. number of pending messages in the queue on amq1 is slowly but steadily
> increasing, consumer on amq4 is still reading messages
>   2. after about 70,000 to 80,000 messages amq1 suddenly stops working and
> amq2 gets promoted to master. amq4 is still reading messages
>   3. From that time on, the log of amq4 is filling up with the following
> exceptions: 2014-10-20 13:11:43,227 | ERROR | Exception:
> org.apache.activemq.transport.InactivityIOException: Cannot send, channel
> has already failed: null on duplex forward of: ActiveMQTextMessage ...
> <dump
> of message comes here>
>
> Here is an excerpt of the log from amq1 at the time it got "demoted" to
> slave:
> 2014-10-20 12:56:44,007 | INFO  | Slave has now caught up:
> 2607dbe5-e42a-44bf-8f90-6edf8caa8d87 |
> org.apache.activemq.leveldb.replicated.MasterLevelDBStore |
> hawtdispatch-DEFAULT-1
> 2014-10-20 13:11:42,535 | INFO  | Client session timed out, have not heard
> from server in 2763ms for sessionid 0x2492d8210c30003, closing socket
> connection and attempting reconnect | org.apache.zookeeper.ClientCnxn |
> main-SendThread(uromahn-zk2-9775:2181)
> 2014-10-20 13:11:42,639 | INFO  | Demoted to slave |
> org.apache.activemq.leveldb.replicated.MasterElector | ZooKeeper state
> change dispatcher thread
>
> (NOTE: 12:56 was the time the broker cluster was started. Between that time
> and 13:11, I was running the various tests)
>
> After that I can see a ton of exceptions and error messages saying that the
> replicated store has stopped and similar. After some time, it looks the
> broker amq1 has re-stabilized itself and reporting to have been started as
> slave.
>
> I don't know what exactly is going on, but it appears that something is
> wrong with the replicated LevelDB which needs more investigation.
>
>
>
> --
> View this message in context:
> http://activemq.2283324.n4.nabble.com/Potential-Bug-in-Master-Slave-with-Replicated-LevelDB-Store-tp4686450p4686548.html
> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>

Re: Potential Bug in Master-Slave with Replicated LevelDB Store

Posted by uromahn <ul...@ulrichromahn.net>.
Ok, looks like the issue is back again.

The network issues have been fixed.
It is *not* a slow network - pings between VMs are less than 1ms.

I have not investigated the different throughput but wanted to focus on the
reliability of the replicated message store.

I made some configuration changes to the network connectors: I defined five
connectors per node (amq1-3).

Here is what I observed:
* When I launch one producer connecting to amq1 and one consumer connecting
to amq4 and send 100,000 messages, everything works fine
* When I launch five producers connecting to amq1 and five consumers
connecting to amq4 and send 100,000 messages, still fine
* When I launch 10 producers connecting to amq1 and 10 consumers connecting
to amq4 and send 100,000 messages, I can see the following:
  1. the number of pending messages in the queue on amq1 is slowly but
steadily increasing; the consumer on amq4 is still reading messages
  2. after about 70,000 to 80,000 messages, amq1 suddenly stops working and
amq2 gets promoted to master; amq4 is still reading messages
  3. from that time on, the log of amq4 fills up with the following
exceptions: 2014-10-20 13:11:43,227 | ERROR | Exception:
org.apache.activemq.transport.InactivityIOException: Cannot send, channel
has already failed: null on duplex forward of: ActiveMQTextMessage ... <dump
of message comes here>

Here is an excerpt of the log from amq1 at the time it got "demoted" to
slave:
2014-10-20 12:56:44,007 | INFO  | Slave has now caught up:
2607dbe5-e42a-44bf-8f90-6edf8caa8d87 |
org.apache.activemq.leveldb.replicated.MasterLevelDBStore |
hawtdispatch-DEFAULT-1
2014-10-20 13:11:42,535 | INFO  | Client session timed out, have not heard
from server in 2763ms for sessionid 0x2492d8210c30003, closing socket
connection and attempting reconnect | org.apache.zookeeper.ClientCnxn |
main-SendThread(uromahn-zk2-9775:2181)
2014-10-20 13:11:42,639 | INFO  | Demoted to slave |
org.apache.activemq.leveldb.replicated.MasterElector | ZooKeeper state
change dispatcher thread

(NOTE: 12:56 was the time the broker cluster was started. Between that time
and 13:11, I was running the various tests)

After that, I can see a ton of exceptions and error messages saying that the
replicated store has stopped, and similar. After some time, it looks like
broker amq1 has re-stabilized itself and reports having been started as
slave.

I don't know what exactly is going on, but it appears that something is
wrong with the replicated LevelDB which needs more investigation.



--
View this message in context: http://activemq.2283324.n4.nabble.com/Potential-Bug-in-Master-Slave-with-Replicated-LevelDB-Store-tp4686450p4686548.html

Re: Potential Bug in Master-Slave with Replicated LevelDB Store

Posted by Tim Bain <tb...@alumni.duke.edu>.
masterslave: is an alias for failover: with maxReconnects=0.  (There might
be another URI option included in the alias, I don't remember; I think the
details are in the JIRA where Gary added the failover transport, if you're
curious.)  So there's no need to try using failover explicitly, since the
configuration you're using already used the URI option I was concerned
about.

That's strange that you're seeing different throughput (based on different
numbers of pending messages) depending on which direction messages flow
between the clusters.  It might be due to the network issues you referenced;
if not, then hopefully you can figure out which link is the slow one by
finding the last point where messages are piling up.  Is there a
non-trivial amount of latency (more than 15 ms, let's say) on any of the
links between brokers or the links between clients and brokers?  I've had
to do quite a bit of tuning to get ActiveMQ to run efficiently across a
high-latency WAN, so if you have a bad network link in your setup you may
need to make some adjustments to improve throughput.
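To give a flavor of what that tuning looked like for us: it was mostly
transport URI options and prefetch on the network connector, along these
lines (the connector name, hosts, and every value here are illustrative, not
our production settings):

```xml
<networkConnectors>
    <!-- Example only: larger socket buffers and a bigger prefetch help keep
         a high-latency pipe full; sizes must be tuned for the actual link. -->
    <networkConnector name="wan-link" duplex="true" prefetchSize="2000"
        uri="masterslave:(tcp://remote1:61616?socketBufferSize=131072,tcp://remote2:61616?socketBufferSize=131072)"/>
</networkConnectors>
```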

Also, just to confirm: were you comparing pending queue sizes based on
which role (producer-side or consumer-side) the cluster played in your
test?  (So comparing amq1-3 in your first setup with amq4-6 in your
second setup, and vice versa.)  Make sure your comparisons were apples to
apples between the tests; otherwise the conclusion of lower throughput
might not be valid.

On Fri, Oct 17, 2014 at 4:20 AM, uromahn <ul...@ulrichromahn.net> wrote:

> Quick update...
>
> I re-ran my tests as suggested.
>
> First my producer connected to amq4 and the consumer to amq1. That setup
> worked quite well without any error or timeout.
> Then I re-configured it again with messages being sent to amq1 and consumed
> from amq4. To my surprise it worked this time (I re-ran the test three
> times
> yesterday and all three failed the same way!).
> However, I noticed that when transmitting messages from amq1 to amq4, it
> appears that the bridge is slower than the other way around since I saw on
> average 10k pending messages in all queues - the other way around, there
> were on average less than 1,500 msg pending.
>
> On another note: I mentioned that I setup this environment in our private
> cloud infrastructure. This morning I saw a note from our infrastructure
> guys
> that we are having some network issues in that data center hosting this
> environment. It is certainly possible that my issues yesterday may have
> been
> a side-effect of the network problems.
>
> So, I will keep testing but for now, my suspicion is that it is likely to
> be
> a network problem and not an issue within ActiveMQ.
>
> I will follow-up in case the issue shows up again.
>
> Until then, sorry for potentially raising a false alarm.
>
>
>
> --
> View this message in context:
> http://activemq.2283324.n4.nabble.com/Potential-Bug-in-Master-Slave-with-Replicated-LevelDB-Store-tp4686450p4686492.html
> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>

Re: Potential Bug in Master-Slave with Replicated LevelDB Store

Posted by uromahn <ul...@ulrichromahn.net>.
Quick update...

I re-ran my tests as suggested.

First my producer connected to amq4 and the consumer to amq1. That setup
worked quite well, without any error or timeout.
Then I re-configured it again, with messages being sent to amq1 and consumed
from amq4. To my surprise, it worked this time (I re-ran the test three
times yesterday and all three runs failed the same way!).
However, I noticed that when transmitting messages from amq1 to amq4, the
bridge appears to be slower than the other way around, since I saw on
average 10k pending messages in all queues - the other way around, there
were on average fewer than 1,500 messages pending.

On another note: I mentioned that I set up this environment in our private
cloud infrastructure. This morning I saw a note from our infrastructure guys
that we are having some network issues in the data center hosting this
environment. It is certainly possible that my issues yesterday may have been
a side-effect of the network problems.

So, I will keep testing, but for now my suspicion is that it is likely to be
a network problem and not an issue within ActiveMQ.

I will follow-up in case the issue shows up again.

Until then, sorry for potentially raising a false alarm.



--
View this message in context: http://activemq.2283324.n4.nabble.com/Potential-Bug-in-Master-Slave-with-Replicated-LevelDB-Store-tp4686450p4686492.html

Re: Potential Bug in Master-Slave with Replicated LevelDB Store

Posted by uromahn <ul...@ulrichromahn.net>.
tbain98 wrote
> In your broker-to-broker networkConnectors, are you using maxReconnects=0
> as an argument to the failover URI?  It wouldn't explain why amq4 got
> demoted, but it could explain why messages aren't transferring to amq5
> instead.

Here is the definition of my networkConnectors inside activemq.xml:

        <networkConnectors>
            <networkConnector name="link1"
                              duplex="true"
                              conduitSubscriptions="false"
                              decreaseNetworkConsumerPriority="false"
                              uri="masterslave:(tcp://uromahn-amq4:61616,tcp://uromahn-amq5:61616,tcp://uromahn-amq6:61616)"/>
            <networkConnector name="link2"
                              duplex="true"
                              conduitSubscriptions="false"
                              decreaseNetworkConsumerPriority="false"
                              uri="masterslave:(tcp://uromahn-amq4:61616,tcp://uromahn-amq5:61616,tcp://uromahn-amq6:61616)"/>
        </networkConnectors>

As you can see, I have used the "masterslave" transport instead of
"failover" and have not specified any additional configuration parameters.
I will try, however, to change this to "failover" for another test.
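For that test, I expect the "masterslave" shorthand would be spelled out as a nested static+failover URI, which is also where an option like the maxReconnects one Tim asked about would go. This is only a sketch of what I plan to try, not my current config; the option names and values here are my assumption (the failover transport reference calls the option maxReconnectAttempts, and the meaning of 0 vs. -1 changed around 5.6), so they should be checked against the docs for 5.10.0:

```xml
<!-- Sketch only: explicit static+failover URI in place of the masterslave: shorthand.
     Option names/values are assumptions to be verified against the failover
     transport reference for 5.10.0. Note the &amp; entity inside the attribute. -->
<networkConnector name="link1"
                  duplex="true"
                  conduitSubscriptions="false"
                  decreaseNetworkConsumerPriority="false"
                  uri="static:(failover:(tcp://uromahn-amq4:61616,tcp://uromahn-amq5:61616,tcp://uromahn-amq6:61616)?randomize=false&amp;maxReconnectAttempts=0)"/>
```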

tbain98 wrote
> You say you've got duplex connections between the clusters; which cluster
> is the one that establishes them via a networkConnector?  And do you see
> the same behavior if you put producers on cluster2 and consumers on
> cluster1?

I have not tried that, but will do this first, even before I change
"masterslave" to "failover".

tbain98 wrote
> Also, looking at your logs it's not clear what happens between 13:00:48
> (when amq5 becomes the master) and 13:32:20 (30 minutes later, when the
> LevelDB exception occurs).  Are messages transferring successfully to
> amq5,
> or is it sitting idle?

Oops, that was a copy & paste error. The message at 13:32:20 was actually
caused by me shutting down all my brokers. Sorry for the confusion.
And to answer your question: there were absolutely no messages transferred
from amq1 to amq4, amq5, or amq6. This was visible on the "Network" page in
the web console of amq1, which shows the number of messages enqueued and
dequeued.

I will post back to this thread once I have run the other suggested tests.

(removed my original post for brevity)






Re: Potential Bug in Master-Slave with Replicated LevelDB Store

Posted by Tim Bain <tb...@alumni.duke.edu>.
In your broker-to-broker networkConnectors, are you using maxReconnects=0
as an argument to the failover URI?  It wouldn't explain why amq4 got
demoted, but it could explain why messages aren't transferring to amq5
instead.

You say you've got duplex connections between the clusters; which cluster
is the one that establishes them via a networkConnector?  And do you see
the same behavior if you put producers on cluster2 and consumers on
cluster1?

Also, looking at your logs it's not clear what happens between 13:00:48
(when amq5 becomes the master) and 13:32:20 (30 minutes later, when the
LevelDB exception occurs).  Are messages transferring successfully to amq5,
or is it sitting idle?

On Thu, Oct 16, 2014 at 8:42 AM, uromahn <ul...@ulrichromahn.net> wrote:

> I believe I may have found a bug here. However, this could also be a
> mis-configuration on my side.
>
> Before I go into a detailed description of my observation, here my setup:
>
> I have set up the following system in a virtual environment:
> I have 3 ZooKeeper nodes.
> I have 6 ActiveMQ brokers using 5.10.0.
> All nodes (ZK, AMQ) are running on CentOS 6.5 64bit with the latest
> OpenJDK.
> Three brokers form an active/passive cluster using a replicated LevelDB
> store.
> I have installed native LevelDB 1.7.0, accessing it via the JNI driver.
> The two clusters form a network of brokers.
> The "networkConnectors" are defined in the activemq.xml files of only one
> cluster, as duplex connections.
>
> Here is my test case and the situation:
> Let's name the six brokers amq1 - amq6. The first active/passive cluster
> is amq1, amq2, and amq3, with amq1 active and the other two passive. The
> second cluster consists of amq4, amq5, and amq6, with amq4 as the active.
> I have one producer connecting to amq1 publishing messages to a virtual
> topic "VirtualTopic.Test" and a consumer connecting to amq4 reading those
> messages from the corresponding queue "Consumer.A.VirtualTopic.Test".
> In my test, I am sending 100,000 text messages, each with a body of 1024
> random characters, to the virtual topic at maximum speed.
> However, after about 25,000 to 27,000 messages, the consumer on amq4 times
> out after about 10 seconds without receiving any more messages, although
> the producer has already sent all 100,000 messages to amq1.
> When looking at the log file of amq4, I am seeing the following messages:
> ....
> 2014-10-16 12:53:12,556 | INFO  | Started responder end of duplex bridge
> link2@ID:uromahn-amq1-9110-48269-1413463991773-0:1 |
> org.apache.activemq.broker.TransportConnection | ActiveMQ Transport:
> tcp:///xx.xx.xx.xx:58959@61616
> 2014-10-16 12:53:12,559 | INFO  | Started responder end of duplex bridge
> link1@ID:uromahn-amq1-9110-48269-1413463991773-0:1 |
> org.apache.activemq.broker.TransportConnection | ActiveMQ Transport:
> tcp:///xx.xx.xx.xx:58958@61616
> 2014-10-16 12:53:12,591 | INFO  | Network connection between
> vm://brokergrp2#2 and tcp:///xx.xx.xx.xx:58958@61616 (brokergrp1) has been
> established. | org.apache.activemq.network.DemandForwardingBridgeSupport |
> triggerStartAsyncNetworkBridgeCreation:
> remoteBroker=tcp:///xx.xx.xx.xx:58958@61616, localBroker=
> vm://brokergrp2#2
> 2014-10-16 12:53:12,591 | INFO  | Network connection between
> vm://brokergrp2#0 and tcp:///10.64.253.198:58959@61616 (brokergrp1) has
> been
> established. | org.apache.activemq.network.DemandForwardingBridgeSupport |
> triggerStartAsyncNetworkBridgeCreation:
> remoteBroker=tcp:///xx.xx.xx.xx:58959@61616, localBroker=
> vm://brokergrp2#0
> 2014-10-16 13:00:10,470 | INFO  | Client session timed out, have not heard
> from server in 4071ms for sessionid 0x14918fd40f00007, closing socket
> connection and attempting reconnect | org.apache.zookeeper.ClientCnxn |
> main-SendThread(uromahn-zk1-9208:2181)
> 2014-10-16 13:00:10,575 | INFO  | Demoted to slave |
> org.apache.activemq.leveldb.replicated.MasterElector | ZooKeeper state
> change dispatcher thread
> 2014-10-16 13:00:10,582 | INFO  | Apache ActiveMQ 5.10.0 (brokergrp2,
> ID:uromahn-amq4-9175-46383-1413463934439-0:1) is shutting down |
> org.apache.activemq.broker.BrokerService | ActiveMQ
> BrokerService[brokergrp2] Task-8
> 2014-10-16 13:00:10,594 | WARN  | Transport Connection to:
> tcp://zz.zz.zz.zz:34737 failed: java.io.IOException: Unexpected error
> occured: org.apache.activemq.broker.BrokerStoppedException: Broker
> BrokerService[brokergrp2] is being stopped |
> org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ
> Transport: tcp:///zz.zz.zz.zz:34737@61616
> ....
>
> After that, the log file contains a large number of dumped messages for
> which the broker could not send the acknowledgement back to amq1.
>
> Looking at the log file of the newly promoted master amq5 I see the
> following warning:
> ....
> 2014-10-16 13:00:18,422 | INFO  | Network connection between
> vm://brokergrp2#0 and tcp:///yy.yy.yy.yy:58256@61616 (brokergrp1) has been
> established. | org.apache.activemq.network.DemandForwardingBridgeSupport |
> triggerStartAsyncNetworkBridgeCreation:
> remoteBroker=tcp:///yy.yy.yy.yy:58256@61616, localBroker=
> vm://brokergrp2#0
> 2014-10-16 13:32:20,285 | WARN  | Unexpected session error:
> java.io.IOException: Connection reset by peer |
> org.apache.activemq.leveldb.replicated.MasterLevelDBStore |
> hawtdispatch-DEFAULT-2
> 2014-10-16 13:32:20,286 | INFO  | Slave has disconnected:
> e21da7b3-65a4-4756-a20b-bd77f8eb7607 |
> org.apache.activemq.leveldb.replicated.MasterLevelDBStore |
> hawtdispatch-DEFAULT-2
> ....
>
> After that, there are no messages being transmitted from amq1 to the second
> cluster in the network (neither amq4 nor amq5). This can be seen on the
> corresponding tab of the web console of amq1. Also, when I leave all
> brokers running, I get a message on amq4 that the broker is waiting for
> the replicated store to synchronize one record (or something to that
> effect - unfortunately I deleted the log already :( ).
>
> So, here is what appears to happen:
> somehow ZooKeeper thinks that amq4 is "dead" and demotes it to slave in
> the middle of the message transfer. This could be because amq4 is too busy
> to respond to heartbeat requests. However, something goes wrong in the
> LevelDB replication and the new master amq5 can't pick up. This seems to
> lead to a deadlock situation, causing the client to time out.
> NOTE: all clients use the failover transport to connect to the cluster.
> The connector between the two clusters is also defined using the failover
> transport.
> Another indication of something "bad" is that the client reports an
> exception that the connection to amq4 was reset and that it is trying to
> reconnect. However, that reconnect does not seem to complete within the 10
> seconds my message listener allows as its internal timeout.
> I have double- and triple-checked the configuration and everything looks
> ok. Also, this doesn't happen when both producer and consumer connect to
> the same cluster, e.g. amq1 or amq4. So, the individual clusters seem to
> work. The issue only seems to appear when transmitting messages between
> clusters in a network of brokers.
>
> Any idea if this is a bug or just a SUE (stupid user error)? :)
>
> -Uli
>
>
>
>