Posted to users@activemq.apache.org by Joe Niski <jo...@nwea.org> on 2011/06/09 18:42:14 UTC
Durable subscriptions not surviving network disconnect
I'm encountering a problem in our production environment that I can
reproduce in an integration-testing setup. Durable topic subscriptions
do not fully reconnect after an interruption in network connectivity,
even though the ActiveMQ brokers re-establish their connection and
messages flow across queues.
The high-level architecture is a store & forward setup similar to the
backoffice + retail store example in "ActiveMQ in Action":
- the Central application (on Geronimo 2.1.7) and Central ActiveMQ
(5.4.2) broker run on the same machine.
- multiple remote machines host a similar pairing of Remote ActiveMQ and
Remote application.
- The apps are connecting to the standalone AMQ brokers via activemq-ra,
ignoring the AMQ instance embedded in Geronimo.
- the Central app publishes to topics on the Central broker. The topics
are dynamically included in the Remote brokers' networkConnector
configuration, which looks like this:
<networkConnectors>
  <networkConnector name="${ApplianceID}"
      userName="${networkConnectorUserName}"
      password="${networkConnectorPassword}"
      uri="static://(ssl://${Central.ServerHostname}:${central_sslPortNumber})?initialReconnectDelay=5000&amp;maxReconnectDelay=10000&amp;useExponentialBackOff=false"
      duplex="true"
      dynamicOnly="true">
    <dynamicallyIncludedDestinations>
      <queue physicalName="org.nwea.queues.central.>"/>
      <topic physicalName="org.nwea.topics.>"/>
    </dynamicallyIncludedDestinations>
  </networkConnector>
</networkConnectors>
- MDBs in the Remote application use durable subscriptions to connect to
the topics on the Remote broker. We see the durable subs show up on the
Central broker (via the web console).
Whenever there's a temporary loss of network connectivity (this happens
from time to time with the provider hosting our Remotes), the Remote
brokers can re-connect to the Central broker, but the durable
subscriptions from Remote do not re-connect. They show up in the Remote
broker's web console, but not in Central's. Messages on the Central
broker's topics are not forwarded to the Remote broker's topics.
I've duplicated this behavior in our VMware environment, the only place
I can enable debug-level logging:
- I start a batch-publishing job on Central, watch the messages picked
up and processed by Remote, then disable the network interface on Remote
(I've done this for up to a minute so far). Central keeps publishing,
and Remote finishes processing messages that were forwarded to its topics.
- I re-enable Remote's network interface, and see in the ActiveMQ logs
that Remote authenticates to Central and that the DemandForwardingBridge
is re-established. I see messages flowing on Advisory topics. I can send
a message (via the Remote's AMQ console) to a dynamically included
queue, and it's forwarded to Central. In Remote's AMQ console, I see the
durable subscriptions from the Remote application's MDBs - but in
Central's AMQ console, the durable subs appear as "offline".
The only way we've discovered to bring the durable subscriptions back
on-line all the way to Central is to restart the Remote Geronimo
instance. Once restarted, Remote picks up where it left off, and all the
topic messages are retrieved and processed.
In the debug logs, we've noticed that when Remote AMQ re-connects after
the outage, queue and topic connections seem to use different ports than
before the outage, and wonder if this is part of the failure of durable
subscriptions to reconnect.
I've already tried a few minor variations in the networkConnector
configuration, the most recent being "useExponentialBackOff=false". In
addition, I've enabled TCP keepalive in the transportConnectors:
<transportConnectors>
  <transportConnector name="openwire"
      uri="tcp://0.0.0.0:${remote_openwirePortNumber}?keepAlive=true"/>
  <transportConnector name="ssl"
      uri="ssl://0.0.0.0:${remote_sslPortNumber}?keepAlive=true"/>
</transportConnectors>
We've already looked at various operating-system issues with the network
stacks on our servers, and nothing seems to be amiss - no
resource-starvation of any kind. And the point really is that we need
the durable subs to survive a brief disconnect. AMQ itself seems to
reconnect just fine. At the moment, getting rid of activemq-ra and the
Geronimo resource adapters and moving to Spring's JMS support (as one
consultant suggested) isn't an option for our production issues,
regardless of how attractive it is in the bigger scheme of things.
This is a real problem for us and our customers. Any guidance is
appreciated.
--
*Joe Niski*
Senior Developer - Information Services | NWEA™
PHONE 503.548.5207 | FAX 503.639.7873
NWEA.ORG <http://www.nwea.org/> | Partnering to help all kids learn™
Re: Durable subscriptions not surviving network disconnect
Posted by Andreas Calvo <fl...@gmail.com>.
Thank you for answering.
This is the scenario:
(embedded) broker A <---> standalone broker B <---> (embedded) broker C
It is a hub-and-spoke network of brokers, so broker B receives and dispatches
all messages.
We establish the connections between all the brokers and start sending
messages from A to C and from C to A.
If we disconnect broker A for more than 30s (or whatever max inactivity timeout
is configured), when we reconnect it, broker A does not get all the messages that
were sent from broker C (and that should be held on broker B) during the
disconnection.
However, broker C gets all messages that have been sent from broker A.
--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3683244.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Durable subscriptions not surviving network disconnect
Posted by Torsten Mielke <to...@fusesource.com>.
I'm not sure I understood your scenario. Can you please describe it again using broker names A, B and C?
> when we disconnect one of
> the embedded brokers from the network, it does not get the pending messages
A broker that is disconnected from the network won't be able to get any messages, right?
I guess I did not fully understand the scenario.
Torsten Mielke
torsten@fusesource.com
tmielke@blogspot.com
Re: Durable subscriptions not surviving network disconnect
Posted by Andreas Calvo <fl...@gmail.com>.
We've been experimenting a little more, and we've hit another issue.
We have a network of brokers: two of them, embedded in Java applications,
connect to a central standalone ActiveMQ server. When we disconnect one of
the embedded brokers from the network, it does not get the pending messages
from the other broker, although the other broker (which hasn't been
disconnected) gets all the messages from the first one.
Any hints?
Thanks
(Posting here to keep the JIRA issue focused.)
--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3680978.html
Re: Durable subscriptions not surviving network disconnect
Posted by Andreas Calvo <fl...@gmail.com>.
Attached to JIRA issue: https://issues.apache.org/jira/browse/AMQ-3353
However, we've found that closing the session and reopening it solves the
problem.
--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3648272.html
Re: Durable subscriptions not surviving network disconnect
Posted by Dejan Bosanac <de...@nighttale.net>.
Can you raise a Jira and attach this, so it doesn't get lost in emails?
Regards
--
Dejan Bosanac - http://twitter.com/dejanb
-----------------
The experts in open source integration and messaging - http://fusesource.com
ActiveMQ in Action - http://www.manning.com/snyder/
Blog - http://www.nighttale.net
On Tue, Jul 5, 2011 at 7:33 PM, Andreas Calvo <fl...@gmail.com> wrote:
> This JUnit test reproduces the error: http://pastebin.com/RXCVHzGt
>
> --
> View this message in context:
> http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3646615.html
>
Re: Durable subscriptions not surviving network disconnect
Posted by Andreas Calvo <fl...@gmail.com>.
This JUnit test reproduces the error: http://pastebin.com/RXCVHzGt
--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3646615.html
Re: Durable subscriptions not surviving network disconnect
Posted by Andreas Calvo <fl...@gmail.com>.
I've started a JUnit Test case.
Since it's the first time, it's not clean, may be buggy and the behavior is
not what I really expected, but maybe it's a start.
org.apache.activemq.usecases.DurableSubscriberWithNetworkDisconnectTest:
http://pastebin.com/kr9uu0uE
--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3646435.html
Re: Durable subscriptions not surviving network disconnect
Posted by Andreas Calvo <fl...@gmail.com>.
Thanks Gary.
I'm using ConnectionInfo and RemoveInfo to detect (re)connections.
However, I'm trying to tell the client to (re)subscribe to the topic,
as if I had restarted the whole client, but it's not working.
I've been experimenting with TransportListener and some advisory messages (I don't
know which is best for an embedded broker), with pseudo-code like the following.
I've also been looking for more Java examples (to get the number of connections
that a broker has, to identify the vm transport connection, and so on).
-- producer --
startBroker()
createConnection()
createTopic()
createProducer()
publish()
-- consumer --
startBroker()
createConnection()
createTopic()
createDurableSubscriber()
onMessage()
if is ConnectionInfo then restartSubscription()
PS: I could post all the java code if needed
Thanks
--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3637976.html
Re: Durable subscriptions not surviving network disconnect
Posted by Gary Tully <ga...@gmail.com>.
There is a transport listener interface that can give you indications
of reconnects.
see: org.apache.activemq.ActiveMQConnection#addTransportListener
For a unit test, have a look at:
org.apache.activemq.usecases.BrokerQueueNetworkWithDisconnectTest
this uses a socket proxy to simulate a network failure between
networked brokers.
A simple network test with durable subs is:
org.apache.activemq.network.SimpleNetworkTest#testDurableStoreAndForward
you may need a combination of the two.
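As a rough illustration of the listener approach, here is a minimal, stdlib-only Java sketch of the reaction logic: remember that the transport was interrupted, and run a resubscribe action once when it resumes. The method names transportInterupted() and transportResumed() mirror the callbacks of org.apache.activemq.transport.TransportListener (the misspelling is ActiveMQ's own). The class ResubscribeOnResume and its resubscribe Runnable are hypothetical names, not part of ActiveMQ. In a real client the class would implement TransportListener (adding onCommand and onException), be registered through ActiveMQConnection#addTransportListener, and the Runnable would presumably close the session and re-create the durable subscriber. Note that these two callbacks appear to be driven by the failover: transport, so this assumes the client connects through it, and whether resubscribing this way brings the subscription back online is not verified here.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Sketch only: runs a "resubscribe" action once each time the transport
 * comes back after an interruption. This class is hypothetical; only the
 * callback names are borrowed from ActiveMQ's TransportListener.
 */
final class ResubscribeOnResume {
    // True between an interruption and the next resume.
    private final AtomicBoolean interrupted = new AtomicBoolean(false);
    // Assumption: in a real client this would close the JMS session and
    // re-create the durable subscriber.
    private final Runnable resubscribe;

    ResubscribeOnResume(Runnable resubscribe) {
        this.resubscribe = resubscribe;
    }

    /** Mirrors TransportListener#transportInterupted: connection lost. */
    public void transportInterupted() {
        interrupted.set(true);
    }

    /** Mirrors TransportListener#transportResumed: connection is back. */
    public void transportResumed() {
        // Resubscribe only if an interruption was seen, and only once per outage.
        if (interrupted.compareAndSet(true, false)) {
            resubscribe.run();
        }
    }

    public static void main(String[] args) {
        ResubscribeOnResume listener = new ResubscribeOnResume(
                () -> System.out.println("re-creating durable subscriber"));
        listener.transportResumed();    // no outage seen yet: nothing happens
        listener.transportInterupted(); // simulated network drop
        listener.transportResumed();    // runs the resubscribe action once
    }
}
```

Keeping the state in an AtomicBoolean means a duplicate resume callback, or a resume with no preceding interruption, does not trigger a second resubscribe.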
On 30 June 2011 15:09, Andreas Calvo <fl...@gmail.com> wrote:
> Sorry for the late response.
>
> There is already a Jira issue
> (https://issues.apache.org/jira/browse/AMQ-3353).
> While I do know how to reproduce it using multicast brokers and producer and
> consumer from the example directory, I do not know how to make a junit case.
> If it's not difficult, I could try to do it.
>
> In the meantime, how can I detect when a client is disconnected from
> the broker and restarts the connection?
> I've been trying with advisory messages or exceptions in an embedded broker,
> without success.
> Any example code that I could look at?
>
> thanks
>
> --
> View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3635766.html
>
--
http://fusesource.com
http://blog.garytully.com
Re: Durable subscriptions not surviving network disconnect
Posted by Andreas Calvo <fl...@gmail.com>.
Sorry for the late response.
There is already a Jira issue
(https://issues.apache.org/jira/browse/AMQ-3353).
While I do know how to reproduce it using multicast brokers and producer and
consumer from the example directory, I do not know how to make a junit case.
If it's not difficult, I could try to do it.
In the meantime, how can I detect when a client is disconnected from
the broker and restarts the connection?
I've been trying with advisory messages or exceptions in an embedded broker,
without success.
Any example code that I could look at?
thanks
--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3635766.html
Re: Durable subscriptions not surviving network disconnect
Posted by Gary Tully <ga...@gmail.com>.
Can you raise a Jira issue with your description, or try to recreate it in a
JUnit test case?
On 23 June 2011 11:50, Andreas Calvo <fl...@gmail.com> wrote:
> We've been stuck with the same problem (in a different scenario).
> We've tried to replicate the same behavior using the examples on activemq
> (ant producer, ant consumer, and static-network-broker), and, as Joe said,
> the activemq seems to reconnect just fine, but the clients get stuck.
>
> --
> View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3619470.html
>
--
http://fusesource.com
http://blog.garytully.com
Re: Durable subscriptions not surviving network disconnect
Posted by Andreas Calvo <fl...@gmail.com>.
We've been stuck with the same problem (in a different scenario).
We've tried to replicate the same behavior using the examples on activemq
(ant producer, ant consumer, and static-network-broker), and, as Joe said,
the activemq seems to reconnect just fine, but the clients get stuck.
--
View this message in context: http://activemq.2283324.n4.nabble.com/Durable-subscriptions-not-surviving-network-disconnect-tp3586149p3619470.html