Posted to dev@qpid.apache.org by "Martin Ritchie (JIRA)" <qp...@incubator.apache.org> on 2009/06/24 17:43:07 UTC
[jira] Updated: (QPID-1949) Client does not ensure connection is
closed before attempting failover
[ https://issues.apache.org/jira/browse/QPID-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Martin Ritchie updated QPID-1949:
---------------------------------
Component/s: Java Broker
> Client does not ensure connection is closed before attempting failover
> ----------------------------------------------------------------------
>
> Key: QPID-1949
> URL: https://issues.apache.org/jira/browse/QPID-1949
> Project: Qpid
> Issue Type: Bug
> Components: Java Broker
> Affects Versions: M4, 0.5
> Reporter: Martin Ritchie
>
> * Summary:
> * A user has reported message loss from their application. On bouncing of
> * the broker the 'lost' messages are delivered to the broker.
> *
> * Note:
> * The client was using Spring so that may influence the situation.
> *
> * Issue:
> * The log files show 7 instances of the following which result in 7
> * missing messages.
> *
> * The client log files show:
> *
> * The broker log file shows:
> *
> *
> * The 7 missing messages have delivery tags 5-11, which indicates that they
> * are sequentially the next messages from the broker.
> *
> * The only way for the 'without a handler' log to occur is if the consumer
> * has been removed from the lookup table of the dispatcher.
> * And the only way for the 'null message' log to occur on the broker is
> * if the message does not exist in the unacked map.
> *
> * The consumer is only removed from the list during session
> * closure and failover.
> *
> * If the session was closed then the broker would requeue the unacked
> * messages so the potential exists to have an empty map but the broker
> * will not send a message out after the unacked map has been cleared.
> *
> * When failover occurs the _consumer map is cleared and the consumers are
> * resubscribed. This is done without first stopping any existing
> * dispatcher so there exists the potential to receive a message after
> * the _consumer map has been cleared which is how the 'without a handler'
> * log statement occurs.
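
To make the window described above concrete, here is a minimal sketch of a dispatcher whose lookup table is cleared while a message is still in flight. `DispatcherSketch`, its methods, and the handler name are hypothetical stand-ins invented for illustration; they are not Qpid's real Dispatcher or its API, and the real client's threading is not modelled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for the client-side dispatcher (not Qpid's class).
class DispatcherSketch {
    // Plays the role of the _consumer map: consumer tag -> handler name.
    private final Map<Integer, String> consumers = new ConcurrentHashMap<>();

    void subscribe(int consumerTag, String handler) {
        consumers.put(consumerTag, handler);
    }

    // Failover as described in the report: the map is cleared,
    // but nothing stops messages from still being dispatched.
    void clearForFailover() {
        consumers.clear();
    }

    // Returns a description of what happened to the delivered message.
    String dispatch(int consumerTag, long deliveryTag) {
        String handler = consumers.get(consumerTag);
        if (handler == null) {
            return "message " + deliveryTag + " received without a handler";
        }
        return "message " + deliveryTag + " handled by " + handler;
    }

    public static void main(String[] args) {
        DispatcherSketch dispatcher = new DispatcherSketch();
        dispatcher.subscribe(1, "listenerA");
        System.out.println(dispatcher.dispatch(1, 4));

        dispatcher.clearForFailover();
        // A message still in flight on the old connection arrives after the clear.
        System.out.println(dispatcher.dispatch(1, 5));
    }
}
```

In this model the second dispatch finds no entry for the consumer tag, which is the analogue of the 'without a handler' log line.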
> *
> * Scenario:
> *
> * Looking over logs the sequence that best fits the events is as follows:
> * - Something causes Mina to be delayed, causing the WriteTimeoutException.
> * - This exception is received by AMQProtocolHandler#exceptionCaught
> * - As the WriteTimeoutException is an IOException this will cause
> * sessionClosed to be called to start failover.
> * + This is potentially the issue here. All IOExceptions are treated
> * as connection failure events.
> * - Failover Runs
> * + Failover assumes that the previous connection has been closed.
> * + Failover binds the existing objects (AMQConnection/Session) to the
> * new connection objects.
> * - Everything is reported as being successfully failed over.
> * However, what is neglected is that the original connection has not
> * been closed.
> * + So what occurs is that the broker sends a message to the consumer on
> * the original connection, as it was not notified of the client
> * failing over.
> * As the client failover reuses the original AMQSession and Dispatcher,
> * the new messages the broker sends to the old consumer arrive at the
> * client and are processed by the same AMQSession and Dispatcher.
> * However, as the failover process cleared the _consumer map and
> * resubscribed the consumers, the Dispatcher does not recognise the
> * delivery tag and so logs the 'without a handler' message.
> * - The Dispatcher then attempts to reject the message, however,
> * + The AMQSession/Dispatcher pair has been swapped to using a new Mina
> * ProtocolSession as part of the failover process so the reject is
> * sent down the second connection. The broker receives the Reject
> * request but as the Message was sent on a different connection the
> * unacknowledged map is empty and a 'message is null' log message is
> * produced.
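
The broker-side half of the scenario can be sketched in the same spirit. The model below assumes the broker keeps one unacknowledged-message map per connection; that is a simplification for illustration, and `BrokerSketch`, the connection ids "old" and "new", and the message body "m5" are all hypothetical rather than taken from the broker code or the logs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of per-connection unacked bookkeeping (not the Qpid broker's classes).
class BrokerSketch {
    // connection id -> (delivery tag -> message body)
    private final Map<String, Map<Long, String>> unackedByConnection = new HashMap<>();

    // Record a message as delivered but not yet acknowledged on a connection.
    void deliver(String connectionId, long deliveryTag, String body) {
        unackedByConnection
            .computeIfAbsent(connectionId, id -> new HashMap<>())
            .put(deliveryTag, body);
    }

    // A reject is looked up only in the map of the connection it arrived on.
    String reject(String connectionId, long deliveryTag) {
        Map<Long, String> unacked = unackedByConnection.get(connectionId);
        String body = (unacked == null) ? null : unacked.remove(deliveryTag);
        if (body == null) {
            return "reject for tag " + deliveryTag + ": message is null";
        }
        return "reject for tag " + deliveryTag + ": requeued " + body;
    }

    int unackedCount(String connectionId) {
        Map<Long, String> unacked = unackedByConnection.get(connectionId);
        return unacked == null ? 0 : unacked.size();
    }

    public static void main(String[] args) {
        BrokerSketch broker = new BrokerSketch();
        // The broker sends tag 5 on the original connection, which was never closed.
        broker.deliver("old", 5L, "m5");
        // The client's reject travels down the new connection instead.
        System.out.println(broker.reject("new", 5L));
        System.out.println("still unacked on old connection: " + broker.unackedCount("old"));
    }
}
```

Under these assumptions the rejected message is never found, and it stays in the old connection's map. That would be consistent with the messages only reappearing once the broker is bounced, though the sketch does not prove that this is what happened.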
> *
> * Test Strategy:
> *
> * It should be easy to demonstrate if we can send an IOException to
> * AMQProtocolHandler#exceptionCaught and then try sending a message.
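
One possible shape for such a test, sketched against stubs rather than the real client: feed an IOException into a stand-in for the handler and check how many connections are left open afterwards. `StubHandler`, `StubConnection`, and the `closeBeforeFailover` flag are hypothetical names introduced here; the real AMQProtocolHandler#exceptionCaught signature and failover machinery are not reproduced.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; none of these are Qpid's real classes.
class FailoverTestSketch {
    static class StubConnection {
        final String name;
        boolean open = true;
        StubConnection(String name) { this.name = name; }
    }

    static class StubHandler {
        final List<StubConnection> connections = new ArrayList<>();
        final boolean closeBeforeFailover;
        StubConnection current;
        int generation = 0;

        StubHandler(boolean closeBeforeFailover) {
            this.closeBeforeFailover = closeBeforeFailover;
            this.current = connect();
        }

        private StubConnection connect() {
            StubConnection c = new StubConnection("conn-" + (++generation));
            connections.add(c);
            return c;
        }

        // Mirrors the reported behaviour: any IOException starts failover.
        void exceptionCaught(Exception e) {
            if (e instanceof IOException) {
                if (closeBeforeFailover) {
                    current.open = false; // ensure the old connection is closed first
                }
                current = connect();
            }
        }

        // Connections on which the broker could still deliver messages.
        long openConnections() {
            return connections.stream().filter(c -> c.open).count();
        }
    }

    public static void main(String[] args) {
        StubHandler unguarded = new StubHandler(false);
        unguarded.exceptionCaught(new IOException("simulated write timeout"));
        System.out.println("open connections, no close: " + unguarded.openConnections());

        StubHandler guarded = new StubHandler(true);
        guarded.exceptionCaught(new IOException("simulated write timeout"));
        System.out.println("open connections, close first: " + guarded.openConnections());
    }
}
```

A real test would replace the stubs with the client classes and follow the injected exception with an actual message send, checking that nothing arrives on the original connection.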