Posted to users@activemq.apache.org by Simon Vicary <na...@btclick.com> on 2007/07/18 11:14:23 UTC

ActiveMQ and Durable Topic subscriptions after subscriber is uncleanly terminated

Hi all,

I have been having some issues, currently show-stoppers, with the
use of durable subscriptions on ActiveMQ topics for our large-scale
integration project.

I've been writing the subscriber in C#, but the issue also remains for the
Java implementation. The standard approach for establishing a durable
subscriber is to perform all the standard steps for setting up a subscriber,
along with setting a ClientID (to provide the unique ID for the subscriber
application), and calling CreateDurableConsumer on the session.

On the first attempt, the subscription is established, and the messages are
correctly received.  If the subscriber is then shut down in a controlled
manner, and the connection is correctly stopped and closed, then
subsequent restarts of the subscriber perform as expected. Great, all
works fine.

However, under real failure scenarios (machine goes pop, or goes offline for
some reason resulting in a subscriber restart) the client doesn't get the
chance to close its connection to the broker cleanly - essentially
an "uncontrolled shutdown".  This is where the problem arises.  If the
subscriber now attempts to establish a durable subscription with the same
ClientID and name as before, the broker returns a 'Client XXX already
connected' error, and prevents the connection from being made - even though
the previous client/subscriber is not actually connected, or even running.
This doesn't seem to be time bound either - even waiting for a period of
time (minutes) and retrying produces the same result, so it's not a
socket in TIME_WAIT state which is causing it.
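
To make the symptom concrete, here is a toy model in Java of a broker-side
registry keyed by ClientID. It is only a sketch under assumptions: the class
and method names are hypothetical and this is not ActiveMQ's code. It just
shows how an entry that is removed only on an orderly close would keep
rejecting a restarted subscriber.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model only: NOT ActiveMQ code. All names here are hypothetical. */
final class ClientIdRegistry {
    // clientId -> remote address of the connection currently holding it
    private final Map<String, String> active = new ConcurrentHashMap<>();

    /** Registers a connection; rejects a clientId that is still registered. */
    void connect(String clientId, String remoteAddress) {
        String existing = active.putIfAbsent(clientId, remoteAddress);
        if (existing != null) {
            throw new IllegalStateException(
                "Client " + clientId + " already connected from " + existing);
        }
    }

    /** Called on an orderly close; an unclean exit never reaches this. */
    void disconnect(String clientId) {
        active.remove(clientId);
    }

    boolean isRegistered(String clientId) {
        return active.containsKey(clientId);
    }

    public static void main(String[] args) {
        ClientIdRegistry broker = new ClientIdRegistry();
        broker.connect("subscriber-1", "10.0.0.5:4242");
        // The subscriber dies here without calling disconnect().
        try {
            broker.connect("subscriber-1", "10.0.0.5:4243");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the second connect() keeps failing until something removes the
stale entry, which matches what I see until the connection is cleared by hand.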

After further investigation, I've discovered the following:

Using JConsole to look into the state of the broker, it seems that,
following an uncontrolled client disconnect (as described above), the
previously created Connection instance is still classed (by the broker) as
being both live and connected, although it clearly isn't connected (or
even live), because the client is no longer there.  This state is held by the
broker, and never seems to be cleared (until a broker restart, which is
unacceptable in an enterprise-scale environment just to recover from a
single subscriber failure).

The broker should detect the socket disconnect from the failed subscriber,
and clean up the connection status in the broker.

Another observation is that if the connection is manually cleared using
JConsole (using the relevant operation on the connection instance), the
subscriber can indeed reconnect using the durable subscription.
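
The manual JConsole step could in principle be scripted over JMX. The sketch
below uses only the standard javax.management API. The ObjectName pattern and
the operation name are passed in as parameters, because the exact names
exposed by the 4.1.1 broker are assumptions that should be checked in JConsole
first. Since no broker is available inside the sketch, demo() registers a
stand-in MBean in the local platform MBean server (a Runnable whose "run"
operation plays the part of the connection's stop operation).

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.StandardMBean;

/** Sketch only: helper names are hypothetical, the JMX calls are standard. */
final class StaleConnectionStopper {

    /** Invokes a no-argument operation on every MBean matching the pattern; returns the match count. */
    static int invokeOnMatching(MBeanServerConnection server, String pattern, String operation) {
        try {
            Set<ObjectName> names = server.queryNames(new ObjectName(pattern), null);
            for (ObjectName name : names) {
                server.invoke(name, operation, new Object[0], new String[0]);
            }
            return names.size();
        } catch (JMException | IOException e) {
            throw new IllegalStateException("JMX call failed for " + pattern, e);
        }
    }

    /** Stand-in for a broker: registers a dummy MBean locally, invokes it, reports whether it ran. */
    static boolean demo() {
        MBeanServer local = ManagementFactory.getPlatformMBeanServer();
        AtomicBoolean stopped = new AtomicBoolean(false);
        try {
            // Made-up domain and keys, used only for this demonstration.
            ObjectName name = new ObjectName("example.demo:Type=Connection,Connection=stale-1");
            Runnable target = () -> stopped.set(true);
            local.registerMBean(new StandardMBean(target, Runnable.class), name);
            try {
                invokeOnMatching(local, "example.demo:Type=Connection,*", "run");
            } finally {
                local.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
        return stopped.get();
    }

    public static void main(String[] args) {
        System.out.println("stand-in stopped: " + demo());
    }
}
```

Against a real broker, the MBeanServerConnection would come from
JMXConnectorFactory.connect(...) with the broker's JMX service URL, and the
pattern and operation would be whatever JConsole shows for the stale
connection. This only automates the manual workaround; it does not address
why the broker keeps the connection in the first place.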

A further observation is that this only happens if NO messages are published
to the topic during the subscriber downtime.  If, however, a message is
published to the topic during the subscriber downtime, the broker
detects that the subscriber is no longer live, and clears up the connection.
This results in the subscriber being able to reconnect successfully.
However, in production environments, we cannot guarantee that a message will
be sent on a topic during the subscriber downtime.  Most topics
will have high utilisation, but some have low throughput, so this cannot be
relied upon, and the failure of a single durable subscription will result in
the failure of the complete subscribing application.
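
One plausible explanation (an assumption on my part, not confirmed from the
broker source) is ordinary TCP behaviour: when a peer vanishes without closing
the socket, an idle connection is only noticed as dead when the other side
next writes to it, or when keep-alive probes are enabled. The sketch below
shows the standard Java call for enabling SO_KEEPALIVE on an accepted socket,
using a loopback connection so that it is self-contained. The class and method
names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Sketch only: shows the socket option, not how the broker configures its transports. */
final class KeepAliveSketch {

    /** Accepts one loopback connection and enables SO_KEEPALIVE on the accepted side. */
    static boolean acceptWithKeepAlive() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, listener.getLocalPort());
             Socket accepted = listener.accept()) {
            // Ask the OS to probe the peer when the connection is idle.
            accepted.setKeepAlive(true);
            return accepted.getKeepAlive();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("keep-alive enabled: " + acceptWithKeepAlive());
    }
}
```

Note that the probe interval is controlled by the operating system and is
often very long by default, so on its own this may not detect a dead
subscriber quickly enough.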

It seems that all the ActiveMQ unit tests that exercise the durability of the
connection (or at least the ones I've looked at) perform an orderly shutdown
of the connection during the test.  This results in the broker correctly
cleaning up the connection status, and the remaining tests being successful.

Under other JMS implementations (namely Tibco EMS, but I've done something
similar in the past with JBossMQ), this doesn't happen.  Many JMS resources
state that if a durable subscription is attempted and one is already
established, then the existing subscription is overwritten, and the new one is
established.  This doesn't seem to be the case with ActiveMQ - instead it
throws an exception.

My main questions to the ActiveMQ forum are:
1) Is there a workaround that allows subsequent durable subscriptions
to work following an "uncontrolled" subscriber shutdown?
2) Does ActiveMQ have a configuration parameter that allows subsequent durable
subscriptions to overwrite existing ones (even if the existing ones are
actually dead connections)?
3) Is there anything within ActiveMQ which can periodically test the
connections in the broker to see if they are still live and, if not, clean
them up to overcome this problem?
4) Has anybody else experienced this issue, in a production-quality
environment or otherwise? I've seen many posts to do with 'Client XXX
already connected' but nothing which resolves the issue other than 'fixed in
the 4.1…. Release'.  We are using 4.1.1, so we should have the fix - this
sounds like another issue which has slipped through the net.
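
To illustrate the kind of periodic liveness check that question 3 is asking
about, here is a minimal sketch in Java. It is an illustration of the general
technique (track the last time each client was heard from, and expire clients
that have been silent too long), not a description of anything ActiveMQ
provides. The class name, method names, and the timeout value are all
hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch only: hypothetical inactivity check, not an ActiveMQ component. */
final class InactivitySweeper {
    private final long timeoutMillis;
    // clientId -> timestamp (ms) of the last data or heartbeat received
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    InactivitySweeper(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records traffic (data or heartbeat) from a client. */
    void touch(String clientId, long nowMillis) {
        lastSeen.put(clientId, nowMillis);
    }

    /** Removes and returns every client that has been silent for longer than the timeout. */
    List<String> sweep(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastSeen.entrySet()) {
            boolean silentTooLong = nowMillis - e.getValue() > timeoutMillis;
            // remove(key, value) only succeeds if no newer touch() arrived meanwhile.
            if (silentTooLong && lastSeen.remove(e.getKey(), e.getValue())) {
                expired.add(e.getKey());
            }
        }
        return expired;
    }
}
```

A broker using something like this would call sweep() from a timer and release
the ClientID of each expired connection, so that a restarted subscriber could
reconnect even when no message was published in the meantime.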

Any feedback on this would be much appreciated.

Kind regards

Simon Vicary
Integration and Technical Delivery Lead.
-- 
View this message in context: http://www.nabble.com/ActiveMQ-and-Durable-Topic-subscriptions-after-subscriber-is-uncleanly-terminated-tf4102045s2354.html#a11665143
Sent from the ActiveMQ - User mailing list archive at Nabble.com.


Re: ActiveMQ and Durable Topic subscriptions after subscriber is uncleanly terminated

Posted by Simon Vicary <na...@btclick.com>.
Hi, 

The configuration I'm using is straight out of the box for the ActiveMQ
4.1.1 distro, running on Windows XP (dev) and 2003 (test) (soon to be on
Solaris).

Is there a particular section of the activemq.xml file (we've not changed the
default) you were wondering about, or is there some other information I
can provide?

If it's a configuration fix, then that'll be great - some pointers as to
which configuration parameters might overcome this issue would
also be good if you have them.

Kind regards

Simon Vicary



rajdavies wrote:
> 
> Hopefully your problem can be helped with some extra configuration -  
> can you tell us what you are using at the moment ?
> 
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/ActiveMQ-and-Durable-Topic-subscriptions-after-subscriber-is-uncleanly-terminated-tf4102045s2354.html#a11708562
Sent from the ActiveMQ - User mailing list archive at Nabble.com.


Re: ActiveMQ and Durable Topic subscriptions after subscriber is uncleanly terminated

Posted by Rob Davies <ra...@gmail.com>.
Hopefully your problem can be helped with some extra configuration -
can you tell us what you are using at the moment?
