Posted to users@activemq.apache.org by Joe Niski <jo...@nwea.org> on 2011/06/09 18:42:14 UTC

Durable subscriptions not surviving network disconnect

I'm encountering a problem in our production environment that I can
reproduce in an integration-testing setup. Durable topic subscriptions
do not fully reconnect after an interruption in network connectivity,
even though the ActiveMQ brokers re-establish their connection and
messages flow across queues.

The high-level architecture is a store & forward setup similar to the 
backoffice + retail store example in "ActiveMQ in Action":

- the Central application (on Geronimo 2.1.7) and Central ActiveMQ 
(5.4.2) broker run on the same machine.

- multiple remote machines host a similar pairing of Remote ActiveMQ and 
Remote application.

- The apps are connecting to the standalone AMQ brokers via activemq-ra, 
ignoring the AMQ instance embedded in Geronimo.

- the Central app publishes to topics on the Central broker. The topics 
are dynamically included in the Remote brokers' networkConnector 
configuration, which looks like this:

<networkConnectors>
  <networkConnector name="${ApplianceID}"
                    userName="${networkConnectorUserName}"
                    password="${networkConnectorPassword}"
                    uri="static://(ssl://${Central.ServerHostname}:${central_sslPortNumber})?initialReconnectDelay=5000&amp;maxReconnectDelay=10000&amp;useExponentialBackOff=false"
                    duplex="true"
                    dynamicOnly="true">
    <dynamicallyIncludedDestinations>
      <queue physicalName="org.nwea.queues.central.>"/>
      <topic physicalName="org.nwea.topics.>"/>
    </dynamicallyIncludedDestinations>
  </networkConnector>
</networkConnectors>

- MDBs in the Remote application use durable subscriptions to connect to 
the topics on the Remote broker. We see the durable subs show up on the 
Central broker (via the web console).
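
For readers unfamiliar with the ">" in the two physicalName patterns in the
networkConnector configuration: in ActiveMQ destination names, "." separates
elements, "*" matches a single element and ">" matches the rest of the name.
The following is a minimal Java sketch of that rule, not the broker's own
matcher. The destination names used in main() are hypothetical, and treating
">" as "at least one more element" is an assumption of the sketch.

```java
/** Simplified model of ActiveMQ destination wildcards; not the broker's own matcher. */
final class DestinationWildcard {

    /**
     * Returns true if the dot-separated destination name matches the pattern.
     * "*" matches exactly one element; ">" (only as the last element) matches
     * one or more remaining elements. The "one or more" rule is an assumption
     * of this sketch.
     */
    static boolean matches(String pattern, String name) {
        String[] p = pattern.split("\\.");
        String[] n = name.split("\\.");
        for (int i = 0; i < p.length; i++) {
            if (p[i].equals(">")) {
                // ">" must be last in the pattern and needs at least one element left
                return i == p.length - 1 && n.length > i;
            }
            if (i >= n.length) {
                return false; // name ran out before the pattern did
            }
            if (!p[i].equals("*") && !p[i].equals(n[i])) {
                return false; // literal element differs
            }
        }
        return p.length == n.length; // no trailing ">": lengths must agree
    }

    public static void main(String[] args) {
        // Hypothetical destination names, chosen only to exercise the patterns.
        System.out.println(matches("org.nwea.topics.>", "org.nwea.topics.scores.math"));
        System.out.println(matches("org.nwea.topics.>", "org.nwea.queues.central.in"));
    }
}
```

Under this model, every topic below org.nwea.topics is covered by the single
topic entry, so new topics do not need a configuration change on the Remotes.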

Whenever there's a temporary loss of network connectivity (this happens
from time to time with the provider hosting our Remotes), the Remote
brokers can re-connect to the Central broker, but the durable
subscriptions from Remote do not re-connect. They show up in the Remote
broker's web console, but not in Central's. Messages on the Central
broker's topics are not forwarded to the Remote broker's topics.

I've duplicated this behavior in our VMware environment, the only place
I can enable debug-level logging:

- I start a batch-publishing job on Central, watch the messages picked
up and processed by Remote, then disable the network interface on Remote
(I've done this for up to a minute so far). Central keeps publishing,
and Remote finishes processing messages that were forwarded to its topics.

- I re-enable Remote's network interface, and see in the ActiveMQ logs
that Remote authenticates to Central and that the DemandForwardingBridge
is re-established. I see messages flowing on Advisory topics. I can send
a message (via the Remote's AMQ console) to a dynamically included
queue, and it's forwarded to Central. In Remote's AMQ console, I see the
durable subscriptions from the Remote application's MDBs, but in
Central's AMQ console, the durable subs appear as "offline".

The only way we've discovered to bring the durable subscriptions back 
on-line all the way to Central is to restart the Remote Geronimo 
instance. Once restarted, Remote picks up where it left off, and all the 
topic messages are retrieved and processed.

In the debug logs, we've noticed that when Remote AMQ re-connects after
the outage, queue and topic connections seem to use different ports than
before the outage, and we wonder whether this is part of the failure of
durable subscriptions to reconnect.

I've already tried a few minor variations in the networkConnector
configuration, the most recent being "useExponentialBackOff=false". In
addition, I've enabled TCP keepalive in the transportConnectors:

<transportConnectors>
  <transportConnector name="openwire"
                      uri="tcp://0.0.0.0:${remote_openwirePortNumber}?keepAlive=true"/>
  <transportConnector name="ssl"
                      uri="ssl://0.0.0.0:${remote_sslPortNumber}?keepAlive=true"/>
</transportConnectors>

We've already looked at various operating-system issues with the network 
stacks on our servers, and nothing seems to be amiss - no 
resource-starvation of any kind. And the point really is that we need 
the durable subs to survive a brief disconnect. AMQ itself seems to 
reconnect just fine. At the moment, getting rid of activemq-ra and the 
Geronimo resource adapters and moving to Spring's JMS support (as one 
consultant suggested) isn't an option for our production issues, 
regardless of how attractive it is in the bigger scheme of things.

This is a real problem for us and our customers. Any guidance is 
appreciated.
-- 

*Joe Niski*
Senior Developer - Information Services  |  NWEA™

PHONE 503.548.5207 | FAX 503.639.7873

NWEA.ORG <http://www.nwea.org/> | Partnering to help all kids learn™


Re: Durable subscriptions not surviving network disconnect

Posted by Andreas Calvo <fl...@gmail.com>.
Thank you for answering.

This is the scenario:

(embedded) broker A <---> standalone broker B <---> (embedded) broker C

It is a hub-and-spoke network of brokers, so broker B receives and dispatches
all messages.
We start by establishing the connections for all the brokers, then send
messages from A to C and from C to A.

If we disconnect broker A for more than 30s (or the configured max inactivity
timeout), then when we reconnect it, broker A does not get all the messages
that were sent from broker C (which should be held on broker B) during the
disconnection.
However, broker C gets all messages that have been sent from broker A.

--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3683244.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: Durable subscriptions not surviving network disconnect

Posted by Torsten Mielke <to...@fusesource.com>.
Not sure I understood your scenario. Can you please describe it again using broker names A, B and C?

>  when we disconnect one of
> the embedded brokers from the network, it does not get the pending messages

A broker that is disconnected from the network won't be able to get any messages, right?

I guess I did not fully understand the scenario.


Torsten Mielke
torsten@fusesource.com
tmielke@blogspot.com




Re: Durable subscriptions not surviving network disconnect

Posted by Andreas Calvo <fl...@gmail.com>.
We've been playing a little bit more, and we faced another issue.

We have a network of brokers: two of them, embedded in a Java application,
connect to a central standalone ActiveMQ server. When we disconnect one of
the embedded brokers from the network, it does not get the pending messages
from the other embedded broker, although the other broker (which hasn't been
disconnected) gets all the messages from it.

Any hint?

thanks

(posting here so we keep the JIRA issue focused)

--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3680978.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: Durable subscriptions not surviving network disconnect

Posted by Andreas Calvo <fl...@gmail.com>.
Attached to JIRA issue: https://issues.apache.org/jira/browse/AMQ-3353

However, we've found that closing the session and reopening it solves the
problem.
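
The close-and-reopen workaround can be combined with a reconnect
notification: when the transport comes back, drop the old session and create
the durable subscriber again. The sketch below models only that control flow
using the Java standard library, so it runs without ActiveMQ on the
classpath. ResumeListener is a hypothetical stand-in for the interrupt/resume
callbacks of org.apache.activemq.transport.TransportListener (the real
interface also has onCommand and onException), and Subscription is a
hypothetical stand-in for a JMS session plus its durable subscriber. Whether
this approach is usable for MDBs managed by activemq-ra is not something this
sketch addresses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class ResubscribeOnResume {

    /** Hypothetical stand-in for the interrupt/resume callbacks of ActiveMQ's TransportListener. */
    interface ResumeListener {
        void transportInterrupted();
        void transportResumed();
    }

    /** Hypothetical stand-in for a JMS session plus its durable subscriber. */
    interface Subscription {
        void close();
    }

    /** Closes the current subscription and opens a fresh one each time the transport resumes. */
    static final class Resubscriber implements ResumeListener {
        private final Supplier<Subscription> opener;
        private Subscription current;

        Resubscriber(Supplier<Subscription> opener) {
            this.opener = opener;
            this.current = opener.get(); // initial subscribe
        }

        @Override
        public synchronized void transportInterrupted() {
            // Nothing to do yet; the old session is replaced once the link is back.
        }

        @Override
        public synchronized void transportResumed() {
            current.close();        // drop the stale session
            current = opener.get(); // subscribe again with the same client ID and name
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        // The supplier records each open; the returned Subscription records its close.
        Resubscriber r = new Resubscriber(() -> {
            events.add("open");
            return () -> events.add("close");
        });
        r.transportInterrupted();
        r.transportResumed();
        System.out.println(events);
    }
}
```

In a real client, the supplier would create the session and call
createDurableSubscriber with the original subscription name, and the listener
would be registered through ActiveMQConnection#addTransportListener.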

--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3648272.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: Durable subscriptions not surviving network disconnect

Posted by Dejan Bosanac <de...@nighttale.net>.
Can you raise a Jira and attach this, so it doesn't get lost in emails?

Regards
-- 
Dejan Bosanac - http://twitter.com/dejanb
-----------------
The experts in open source integration and messaging - http://fusesource.com
ActiveMQ in Action - http://www.manning.com/snyder/
Blog - http://www.nighttale.net


On Tue, Jul 5, 2011 at 7:33 PM, Andreas Calvo <fl...@gmail.com> wrote:

> This JUnit test reproduces the error: http://pastebin.com/RXCVHzGt

Re: Durable subscriptions not surviving network disconnect

Posted by Andreas Calvo <fl...@gmail.com>.
This JUnit test reproduces the error: http://pastebin.com/RXCVHzGt

--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3646615.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: Durable subscriptions not surviving network disconnect

Posted by Andreas Calvo <fl...@gmail.com>.
I've started a JUnit Test case.

Since it's the first time, it's not clean, may be buggy and the behavior is
not what I really expected, but maybe it's a start.

org.apache.activemq.usecases.DurableSubscriberWithNetworkDisconnectTest:
http://pastebin.com/kr9uu0uE

--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3646435.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: Durable subscriptions not surviving network disconnect

Posted by Andreas Calvo <fl...@gmail.com>.
Thanks Gary.

I'm using ConnectionInfo and RemoveInfo to detect (re)connections.
However, I'm trying to tell the client to (re)subscribe to the topic again,
as if I had restarted the whole client, but it's not working.

I've been playing with TransportListener and some advisory messages (I don't
know which is best for an embedded broker), with the pseudo-code below.

I've been looking for more Java examples (to get the number of connections
that a broker has, to find the vm transport connection, and so on).

-- producer --
startBroker()
createConnection()
createTopic()
createProducer()
publish()

-- consumer --
startBroker()
createConnection()
createTopic()
createDurableSubscriber()

onMessage()
if is ConnectionInfo then restartSubscription()

PS: I could post all the java code if needed

Thanks

--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3637976.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: Durable subscriptions not surviving network disconnect

Posted by Gary Tully <ga...@gmail.com>.
There is a transport listener interface that can give you indications
of reconnects.
see: org.apache.activemq.ActiveMQConnection#addTransportListener

For a unit test, have a look at:
org.apache.activemq.usecases.BrokerQueueNetworkWithDisconnectTest

this uses a socket proxy to simulate a network failure between
networked brokers.

A simple network test with durable subs is:
org.apache.activemq.network.SimpleNetworkTest#testDurableStoreAndForward
you may need a combination of the two.


On 30 June 2011 15:09, Andreas Calvo <fl...@gmail.com> wrote:
> Sorry for the late response.
>
> There is already a Jira issue
> (https://issues.apache.org/jira/browse/AMQ-3353).
> While I do know how to reproduce it using multicast brokers and producer and
> consumer from the example directory, I do not know how to make a junit case.
> If it's not difficult, I could try to do it.
>
> In the meantime, how can I capture when a client receives a disconnect from
> the broker and restarts the connection?
> I'm trying with advisory messages or exceptions in an embedded broker,
> without success.
> Any example of code that I could look at?
>
> thanks



-- 
http://fusesource.com
http://blog.garytully.com

Re: Durable subscriptions not surviving network disconnect

Posted by Andreas Calvo <fl...@gmail.com>.
Sorry for the late response.

There is already a Jira issue
(https://issues.apache.org/jira/browse/AMQ-3353).
While I do know how to reproduce it using multicast brokers and producer and
consumer from the example directory, I do not know how to make a junit case.
If it's not difficult, I could try to do it.

In the meantime, how can I capture when a client receives a disconnect from
the broker and restarts the connection?
I'm trying with advisory messages or exceptions in an embedded broker,
without success.
Any example of code that I could look at?

thanks

--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3635766.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: Durable subscriptions not surviving network disconnect

Posted by Gary Tully <ga...@gmail.com>.
Can you raise a Jira issue with your description, or try to recreate it in a
JUnit test case?

On 23 June 2011 11:50, Andreas Calvo <fl...@gmail.com> wrote:
> We've been stuck with the same problem (in a different scenario).
> We've tried to replicate the same behavior using the examples on activemq
> (ant producer, ant consumer, and static-network-broker), and, as Joe said,
> the activemq seems to reconnect just fine, but the clients get stuck.



-- 
http://fusesource.com
http://blog.garytully.com

Re: Durable subscriptions not surviving network disconnect

Posted by Andreas Calvo <fl...@gmail.com>.
We've been stuck with the same problem (in a different scenario).
We've tried to replicate the same behavior using the examples on activemq
(ant producer, ant consumer, and static-network-broker), and, as Joe said,
the activemq seems to reconnect just fine, but the clients get stuck.

--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3619470.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.