Posted to dev@activemq.apache.org by "Kevin Yaussy (JIRA)" <ji...@apache.org> on 2006/06/15 14:43:51 UTC
[jira] Commented: (AMQ-443) ReliableTransport / KeepAlive algorithm
does not work properly.
[ https://issues.apache.org/activemq/browse/AMQ-443?page=comments#action_36397 ]
Kevin Yaussy commented on AMQ-443:
----------------------------------
Yes - and so far the 4.0 approach is working very well in this respect.
> ReliableTransport / KeepAlive algorithm does not work properly.
> ---------------------------------------------------------------
>
> Key: AMQ-443
> URL: https://issues.apache.org/activemq/browse/AMQ-443
> Project: ActiveMQ
> Type: Bug
> Components: Transport, Broker
> Versions: 3.2, 3.2.1
> Environment: Solaris 8 / 10. JDK 1.5
> Reporter: Kevin Yaussy
> Fix For: 4.0
> Attachments: KeepAliveDaemon.java, ReliableTransportChannel.java
>
>
> The current implementation of KeepAliveDaemon.java will sometimes force disconnections on well-behaved connections. The problem may arise if a connection goes away and the KeepAlive send to that channel blocks while attempting to reconnect. If this reconnection takes a while, then other channels that were responding fine may get their connections broken. This happens due to the following code in KeepAliveDaemon.java:
> if ((channel.getLastReceiptTimestamp() + channel.getKeepAliveTimeout() * 2) < System.currentTimeMillis()) {
> or
> } else if ((channel.getLastReceiptTimestamp() + channel.getKeepAliveTimeout()) < System.currentTimeMillis()) {
> The fact that the receipt timestamp is checked against System.currentTimeMillis() causes the code to break otherwise good connections. If a KeepAlive send (in examineChannel) for a broken channel takes longer than some good channel's KeepAliveTimeout, then the good connection gets broken.
> This can, in turn, cause some pretty bad behavior in the Broker. While testing and diagnosing this problem, I could get some brokers in a network of brokers stuck. The sequence of events during recovery, which gets interrupted when the connections are closed, would sometimes lead to the broker hanging while waiting for a receipt, such as during an addConsumer (which eventually calls syncSendWithReceipt).
> I have redone the logic in KeepAliveDaemon.java (which required a small change to ReliableTransportChannel as well). This now seems to work.
> I'm a bit concerned about the blocking calls, though. This may be a different issue / bug. I thought it looked like there was a mechanism to cancel outstanding receipt waiters, but every once in a while that mechanism would not get called. This results in the broker basically getting stuck, and it never really recovers.
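
The timing problem described in the issue can be reproduced with a small simulation. The sketch below is not the code from the attached KeepAliveDaemon.java or ReliableTransportChannel.java, and it is not necessarily how the issue was fixed. SketchChannel, KeepAliveSketch, examine and sendBlocksForMillis are hypothetical names, the clock is simulated so the run is deterministic, and the action taken in each branch (close the channel, send a keep-alive) is an assumption. Only the two timestamp comparisons mirror the quoted lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class KeepAliveSketch {

    /**
     * Examines every channel once and returns the names it would close.
     * useLiveClock = true mimics the quoted checks, where the clock is
     * re-read for each channel; false compares against one snapshot
     * taken at the start of the pass.
     */
    static List<String> examine(List<SketchChannel> channels,
                                long passStart, boolean useLiveClock) {
        long clock = passStart; // simulated System.currentTimeMillis()
        List<String> closed = new ArrayList<>();
        for (SketchChannel c : channels) {
            long now = useLiveClock ? clock : passStart;
            if (c.lastReceiptTimestamp + c.keepAliveTimeout * 2 < now) {
                // Assumed action: the channel is treated as dead.
                closed.add(c.name);
            } else if (c.lastReceiptTimestamp + c.keepAliveTimeout < now) {
                // Assumed action: send a keep-alive, which may block
                // (for example while the channel tries to reconnect).
                clock += c.sendBlocksForMillis;
            }
        }
        return closed;
    }

    public static void main(String[] args) {
        List<SketchChannel> channels = Arrays.asList(
            // Silent for 1.5 s; its keep-alive send blocks for 5 s.
            new SketchChannel("broken", 8_500, 1_000, 5_000),
            // Heard from 200 ms before the pass started.
            new SketchChannel("good", 9_800, 1_000, 0));
        System.out.println(examine(channels, 10_000, true));
        System.out.println(examine(channels, 10_000, false));
    }
}

/** Hypothetical stand-in for a transport channel; not the ActiveMQ 3.x class. */
final class SketchChannel {
    final String name;
    final long lastReceiptTimestamp; // millis, when data was last received
    final long keepAliveTimeout;     // millis
    final long sendBlocksForMillis;  // simulated duration of a keep-alive send

    SketchChannel(String name, long lastReceiptTimestamp,
                  long keepAliveTimeout, long sendBlocksForMillis) {
        this.name = name;
        this.lastReceiptTimestamp = lastReceiptTimestamp;
        this.keepAliveTimeout = keepAliveTimeout;
        this.sendBlocksForMillis = sendBlocksForMillis;
    }
}
```

With the live clock, the "good" channel is judged after the 5 second stall caused by the "broken" one and ends up in the closed list. With the snapshot it is left alone for that pass. A snapshot alone is unlikely to be a complete remedy, since an idle channel that received no keep-alive during the stall could still look stale on the next pass; keeping the blocking send off the thread that examines the other channels is probably needed as well.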
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
https://issues.apache.org/activemq/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira