Posted to users@activemq.apache.org by zdvickery <za...@sensus.com> on 2010/01/05 23:23:29 UTC

ExceptionListener behavior in Connection class and NMS 1.2.0

I am currently performing some asynchronous messaging functionality using
ActiveMQ/NMS.  For various reasons, my process is connected to an ActiveMQ
server which restarts periodically.  I have observed the following behaviors
with different versions of NMS:

1.0: The listener would fire most of the time when the connection was lost
(sometimes it wouldn't, which initiated the search for a fix)
1.1: The process would deadlock within NMS code when the connection was lost
1.2-RC2: No event is fired when the connection is lost

With 1.2-RC2, I can close the TCP connection used by my publishers and
subscribers, yet the ExceptionListener callback is never invoked.  I am
curious if this is expected behavior and whether there is anything that can
be done to reliably detect connection failures?
-- 
View this message in context: http://old.nabble.com/ExceptionListener-behavior-in-Connection-class-and-NMS-1.2.0-tp27026733p27026733.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: ExceptionListener behavior in Connection class and NMS 1.2.0

Posted by Timothy Bish <ta...@gmail.com>.

On Thu, 2010-01-07 at 11:19 -0800, zdvickery wrote:
> 
> Timothy Bish wrote:
> > 
> > If you think you've found an issue then by all means open a new Jira
> > issue.  We'd need a good description of the problem along with steps to
> > reproduce it and a sample app that demonstrates the problem if
> > possible.  
> > 
> > When you say the TCP connection is getting closed/reset, who exactly is
> > closing / resetting the connection?
> > 
> > Regards
> > Tim.
> > 
> 
> The issue in my mind is to understand "what is the correct behavior of these
> events?" and then to ensure the library meets the specification.  Right now,
> what these events do seems to be predicated on the use of failover transport,
> whether it is associated with a publisher or subscriber, and the NMS
> version.  It's more of a specification effort than anything; I'm not sure if
> that's what you want logged in Jira.

I would expect that if the Connection between the NMS client and the
Broker is broken and the break is actually detected (a FIN packet is
sent, for instance), then an ExceptionListener attached to the
Connection instance would see an onException callback.  Without
looking into the code it sounds like a bug, but I'd need to see your
client code to know for sure that things are set up correctly.  If you
create an issue and attach a sample app that you think should be
receiving the exception notification when the broker is killed, for
instance, I can look into it.  In cases where there's no FIN packet,
all bets are off: the connection could hang around forever unless you
do a send or have the inactivity monitor enabled.
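
The FIN versus no-FIN distinction can be seen at the socket level. This is a
minimal sketch in Python rather than NMS code: the helper name peer_closed is
hypothetical, and a local socket pair stands in for the client/broker TCP
connection. An orderly close shows up as an empty read, while a silent peer
looks exactly like an idle one.

```python
import socket


def peer_closed(sock: socket.socket, timeout: float = 0.1) -> bool:
    """Return True if the peer closed its end in an orderly way (FIN for TCP).

    Returns False when nothing arrives within `timeout`; in that case a dead
    peer cannot be told apart from a merely idle one.
    """
    sock.settimeout(timeout)
    try:
        # MSG_PEEK looks at pending data without consuming it.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except socket.timeout:
        return False


# A connected local socket pair stands in for the client and the broker.
client, broker = socket.socketpair()
print(peer_closed(client))  # broker is idle but alive
broker.close()              # orderly close: the reader now sees end-of-stream
print(peer_closed(client))
client.close()
```

When the peer vanishes without any close notification, the first case never
changes, which is why a send or an inactivity check is needed to notice it.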

> 
> As far as the connection is concerned, there are two scenarios which seem to
> produce identical behavior in the NMS library.  I haven't sniffed them to
> determine if they are being closed with a FIN or RST:
> - ActiveMQ is restarted
> - SSH tunnel used for transport is closed

Regards
Tim.

Re: ExceptionListener behavior in Connection class and NMS 1.2.0

Posted by zdvickery <za...@sensus.com>.

Timothy Bish wrote:
> 
> If you think you've found an issue then by all means open a new Jira
> issue.  We'd need a good description of the problem along with steps to
> reproduce it and a sample app that demonstrates the problem if
> possible.  
> 
> When you say the TCP connection is getting closed/reset, who exactly is
> closing / resetting the connection?
> 
> Regards
> Tim.
> 

The issue in my mind is to understand "what is the correct behavior of these
events?" and then to ensure the library meets the specification.  Right now,
what these events do seems to be predicated on the use of failover transport,
whether it is associated with a publisher or subscriber, and the NMS
version.  It's more of a specification effort than anything; I'm not sure if
that's what you want logged in Jira.

As far as the connection is concerned, there are two scenarios which seem to
produce identical behavior in the NMS library.  I haven't sniffed them to
determine if they are being closed with a FIN or RST:
- ActiveMQ is restarted
- SSH tunnel used for transport is closed

Re: ExceptionListener behavior in Connection class and NMS 1.2.0

Posted by Timothy Bish <ta...@gmail.com>.

On Thu, 2010-01-07 at 10:52 -0800, zdvickery wrote:
> 
> Timothy Bish wrote:
> > 
> > I'd need some more information on your setup to give you a definite
> > answer.  Are you using the Failover transport?  If so then the exception
> > listener is not notified if the connection is dropped, the Failover
> > transport just sits in the background attempting to reconnect.
> > 
> > In NMS.ActiveMQ 1.2.0-RC2 there is an Inactivity Monitor in the
> > transport stack that can be enabled, it's new and still undergoing some
> > testing so I have disabled it by default, you can enable it by adding
> > transport.useInactivityMonitor to your connection URI's options, like
> > this:
> > 
> > 	"tcp://127.0.0.1:61616?transport.useInactivityMonitor=true"
> > 
> > For the non-Failover transport case when the TCP connection is broken
> > and finally detected it should cause an exception to fire from the
> > Exception listener.  Note however that without the inactivity monitor
> > enabled it could take some time for that to happen, I think the keep
> > alive on TCP connections in Windows is something like an hour or so.
> > 
> > Regards
> > Tim.
> > 
> 
> All of my initial testing was performed without the failover transport
> (which I actually didn't know about until just now).  With that said, I
> observe the following behaviors with 1.2-RC2:
> 
> No failover transport, no inactivity monitor: Exception listener and
> connection interrupted listener never fire (even days later)
> 
> No failover transport, inactivity monitor enabled: A connection interrupted
> event fires on the publisher but no event fires on the subscriber.  However
> all connections are restored.  This seems to behave very similarly (if not
> identically) to the failover transport scenario below.
> 
> Failover transport, no inactivity monitor: A connection interrupted event
> fires on the publisher only and all connections are restored per the backoff
> in the failover transport spec
> 
> In summary, things work as expected with the failover transport and that is
> probably how 99% of developers handle this scenario.  That is a good
> solution to my particular problem and I am very happy to discover it. 
> However if failover transport is not used (either deliberately or out of
> ignorance), the inactivity monitor seems necessary to obtain reasonable
> reconnect behavior.  However it seems that in at least the asynchronous case
> the event behavior makes it difficult (impossible?) for the application to
> actually figure out if there is a connection problem, particularly on the
> subscriber side.  In addition, the events behave very differently between
> NMS versions.
> 
> I'm not sure if it is worth the effort to pursue this further, but if so, it
> would be good to start with the events and what they should be doing.  For
> instance, I'm still unclear whether a TCP connection getting closed/reset
> should fire a "connection interrupted" event or a "connection exception"
> event.  I can create a bug report for this if desired.

If you think you've found an issue then by all means open a new Jira
issue.  We'd need a good description of the problem along with steps to
reproduce it and a sample app that demonstrates the problem if
possible.  

When you say the TCP connection is getting closed/reset, who exactly is
closing / resetting the connection?

Regards
Tim.


-- 
Tim Bish
http://fusesource.com
http://timbish.blogspot.com/

Re: ExceptionListener behavior in Connection class and NMS 1.2.0

Posted by zdvickery <za...@sensus.com>.

Timothy Bish wrote:
> 
> I'd need some more information on your setup to give you a definite
> answer.  Are you using the Failover transport?  If so then the exception
> listener is not notified if the connection is dropped, the Failover
> transport just sits in the background attempting to reconnect.
> 
> In NMS.ActiveMQ 1.2.0-RC2 there is an Inactivity Monitor in the
> transport stack that can be enabled, it's new and still undergoing some
> testing so I have disabled it by default, you can enable it by adding
> transport.useInactivityMonitor to your connection URI's options, like
> this:
> 
> 	"tcp://127.0.0.1:61616?transport.useInactivityMonitor=true"
> 
> For the non-Failover transport case when the TCP connection is broken
> and finally detected it should cause an exception to fire from the
> Exception listener.  Note however that without the inactivity monitor
> enabled it could take some time for that to happen, I think the keep
> alive on TCP connections in Windows is something like an hour or so.
> 
> Regards
> Tim.
> 

All of my initial testing was performed without the failover transport
(which I actually didn't know about until just now).  With that said, I
observe the following behaviors with 1.2-RC2:

No failover transport, no inactivity monitor: Exception listener and
connection interrupted listener never fire (even days later)

No failover transport, inactivity monitor enabled: A connection interrupted
event fires on the publisher but no event fires on the subscriber.  However
all connections are restored.  This seems to behave very similarly (if not
identically) to the failover transport scenario below.

Failover transport, no inactivity monitor: A connection interrupted event
fires on the publisher only and all connections are restored per the backoff
in the failover transport spec

In summary, things work as expected with the failover transport and that is
probably how 99% of developers handle this scenario.  That is a good
solution to my particular problem and I am very happy to discover it. 
However if failover transport is not used (either deliberately or out of
ignorance), the inactivity monitor seems necessary to obtain reasonable
reconnect behavior.  However it seems that in at least the asynchronous case
the event behavior makes it difficult (impossible?) for the application to
actually figure out if there is a connection problem, particularly on the
subscriber side.  In addition, the events behave very differently between
NMS versions.

I'm not sure if it is worth the effort to pursue this further, but if so, it
would be good to start with the events and what they should be doing.  For
instance, I'm still unclear whether a TCP connection getting closed/reset
should fire a "connection interrupted" event or a "connection exception"
event.  I can create a bug report for this if desired.

Re: ExceptionListener behavior in Connection class and NMS 1.2.0

Posted by Timothy Bish <ta...@gmail.com>.

On Tue, 2010-01-05 at 14:23 -0800, zdvickery wrote:
> I am currently performing some asynchronous messaging functionality using
> ActiveMQ/NMS.  For various reasons, my process is connected to an ActiveMQ
> server which restarts periodically.  I have observed the following behaviors
> with different versions of NMS:
> 
> 1.0: The listener would fire most of the time when the connection was lost
> (sometimes it wouldn't, which initiated the search for a fix)
> 1.1: The process would deadlock within NMS code when the connection was lost
> 1.2-RC2: No event is fired when the connection is lost
> 
> With 1.2-RC2, I can close the TCP connection used by my publishers and
> subscribers, yet the ExceptionListener callback is never invoked.  I am
> curious if this is expected behavior and whether there is anything that can
> be done to reliably detect connection failures?

I'd need some more information on your setup to give you a definite
answer.  Are you using the Failover transport?  If so then the exception
listener is not notified if the connection is dropped, the Failover
transport just sits in the background attempting to reconnect.

In NMS.ActiveMQ 1.2.0-RC2 there is an Inactivity Monitor in the
transport stack that can be enabled, it's new and still undergoing some
testing so I have disabled it by default, you can enable it by adding
transport.useInactivityMonitor to your connection URI's options, like
this:

	"tcp://127.0.0.1:61616?transport.useInactivityMonitor=true"
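
If the broker URI is assembled in code, the option can be appended without
disturbing options that are already present. This is a minimal sketch in
Python, used purely for illustration; the helper name with_option is
hypothetical and is not part of NMS.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_option(uri: str, key: str, value: str) -> str:
    """Return `uri` with the query option `key` set to `value`."""
    parts = urlsplit(uri)
    # Keep any options already on the URI, then add or override ours.
    options = dict(parse_qsl(parts.query, keep_blank_values=True))
    options[key] = value
    return urlunsplit(parts._replace(query=urlencode(options)))


broker_uri = with_option(
    "tcp://127.0.0.1:61616", "transport.useInactivityMonitor", "true"
)
print(broker_uri)
```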

For the non-Failover transport case when the TCP connection is broken
and finally detected it should cause an exception to fire from the
Exception listener.  Note however that without the inactivity monitor
enabled it could take some time for that to happen, I think the keep
alive on TCP connections in Windows is something like an hour or so.
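
The general idea behind an inactivity check can be sketched as follows. This
is not the NMS implementation, only an illustration in Python under stated
assumptions: the class name, the 30-second threshold, and the injectable
clock are all hypothetical. The monitor remembers when data was last read and
reports a failure once the idle time exceeds the threshold, instead of
waiting for the operating system to notice the dead connection.

```python
import time
from typing import Callable


class InactivityMonitor:
    """Reports a failure when no data has been read for too long."""

    def __init__(
        self,
        max_idle_seconds: float,
        on_failure: Callable[[Exception], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_idle = max_idle_seconds
        self._on_failure = on_failure
        self._clock = clock
        self._last_read = clock()
        self._failed = False

    def record_read(self) -> None:
        """Call whenever anything arrives from the peer."""
        self._last_read = self._clock()

    def check(self) -> bool:
        """Call periodically; returns True once the connection is considered dead."""
        idle = self._clock() - self._last_read
        if not self._failed and idle > self._max_idle:
            self._failed = True  # notify the listener only once
            self._on_failure(TimeoutError(f"no traffic for {idle:.0f}s"))
        return self._failed


# Demonstration with a fake clock so no real waiting is needed.
now = [0.0]
errors = []
monitor = InactivityMonitor(30.0, errors.append, clock=lambda: now[0])

now[0] = 10.0
monitor.record_read()   # traffic seen at t=10
now[0] = 35.0
print(monitor.check())  # 25s idle, still within the threshold
now[0] = 45.0
print(monitor.check())  # 35s idle, failure reported to the callback
print(errors)
```

A real transport would also need the peer to send something at regular
intervals, otherwise a healthy but quiet connection would be reported as
failed.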

Regards
Tim.