Posted to users@activemq.apache.org by Julian Scheid <ju...@googlemail.com> on 2007/09/05 14:15:33 UTC
Network-of-brokers re-synchronization after network disconnect?
Hi,
I have a durable topic distributed over two broker nodes and it's
working just fine, however messages get lost when I artificially
disconnect and later reconnect one of the brokers. To elaborate:
I've set up two broker nodes on two different hosts, broker B1 on host
H1 and broker B2 on host H2. (In the production environment, these two
hosts will be in two separate LANs that are connected through a WAN for
which there's no guaranteed 100% availability, hence my
disconnect/reconnect tests.)
Each broker is configured with a static network connection to forward to
the broker on the other host, so B1 is configured to forward to B2 and
B2 is configured to forward to B1. See below for the corresponding
snippets from the configuration files.
On each host, I'm running a subscriber connected to the broker running
on the same host, so I have subscriber S1 on H1 connected to B1, and S2
on H2 connected to B2. Both subscribers are subscribed to the same topic
T. (I've tried both durable and non-durable subscriptions.)
I'm also running a publisher on each host connected to the broker on
localhost, so publisher P1 is running on H1 connected to B1, and P2 on
H2 connected to B2, both publishing to the same topic T the subscribers
are listening to. Both publishers are configured to send persistent
messages. (I've tried both infinite expiry and finite expiry values,
say 100 seconds.)
To summarize, my setup looks like this:
    S1         S2     (subscribers)
    |          |
    B1 <-----> B2     (brokers)
    |          |
    P1         P2     (publishers)
(Host H1)  (Host H2)
Now, in the normal case everything works as expected. If P1 sends a test
message to topic T, both S1 and S2 get the message. Same if P2 sends a
message. So the forwarding of messages between the two brokers
apparently works fine.
My tests with durable subscriptions work fine too - if I temporarily
unsubscribe, say S2, and then resubscribe it later, it gets any messages
sent while it was unsubscribed - no matter whether those messages were
sent from P1 or P2.
However, if I artificially disconnect host H2 from the network (by
pulling the network cable) and then send a message from P1 to B1, that
message will not be received by S2 after I reconnect H2 to the network.
(It will obviously be received by S1 running on the same host as the
publisher. It also WILL be received by S2 if I remove H2 from the
network for only a short amount of time, maybe 5-10 seconds - but any
longer, the message will get lost.)
I've tried re-subscribing S2 after reconnecting H2, but that didn't seem
to help even in the case of a durable subscription, and it probably
wouldn't be an acceptable solution anyway because then the subscribers
would need to pay extra attention to network connectivity.
I've cranked up the log level to DEBUG and looked for hints in the
broker logs, such as a note about a dropped message, but couldn't find
anything suspicious.
I've tried all of the above with both ActiveMQ 4.1.1 and the 5.0
snapshot as of yesterday, September 4th.
I've also tried sending messages directly from the web console just to
make sure that there's nothing wrong with my publishers, double-checking
that messages are sent with persistent delivery.
Am I wrong to expect that B1 and B2 should re-synchronize after the
connection between them has been rebuilt, or is maybe my forwarding
configuration wrong? How could I go about debugging what's happening to
the message that's sent while H2 is down, whether it ever gets
replicated from B1 to B2 and if not, why not?
Please let me know if you need the full configuration files or log files.
Thanks in advance for any advice,
Julian
Configuration for broker B1 running on host H1:
<networkConnectors>
  <networkConnector uri="static:(tcp://H2:61616)"/>
</networkConnectors>
Configuration for B2 running on H2:
<networkConnectors>
  <networkConnector uri="static:(tcp://H1:61616)"/>
</networkConnectors>
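For context, here is a rough sketch of where the B1 snippet would sit inside a broker element. Only the networkConnector uri is taken from the configuration above. The brokerName value, the transportConnector line, and the connector name "B1-to-B2" are assumptions for illustration, and namespace declarations and all other settings are left out.

```xml
<!-- Sketch only: brokerName, the transportConnector line and the
     connector name are illustrative assumptions, not copied from the
     actual configuration files. Namespace declarations are omitted. -->
<broker brokerName="B1">
  <transportConnectors>
    <!-- Assumed listener that B2's networkConnector would connect to -->
    <transportConnector name="openwire" uri="tcp://0.0.0.0:61616"/>
  </transportConnectors>
  <networkConnectors>
    <!-- Forwarding connection from B1 to B2, same uri as in the snippet above -->
    <networkConnector name="B1-to-B2" uri="static:(tcp://H2:61616)"/>
  </networkConnectors>
</broker>
```

The B2 side would presumably mirror this with the host names swapped. The two brokers in a network should have distinct brokerName values.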
Re: Network-of-brokers re-synchronization after network disconnect?
Posted by Julian Scheid <ju...@googlemail.com>.
Julian Scheid wrote:
> I have a durable topic distributed over two broker nodes and it's
> working just fine, however messages get lost when I artificially
> disconnect and later reconnect one of the brokers.
This could be related to http://issues.apache.org/activemq/browse/AMQ-1076