Posted to dev@qpid.apache.org by "Martin Ritchie (JIRA)" <qp...@incubator.apache.org> on 2009/06/24 17:43:07 UTC

[jira] Updated: (QPID-1949) Client does not ensure connection is closed before attempting failover

     [ https://issues.apache.org/jira/browse/QPID-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Ritchie updated QPID-1949:
---------------------------------

    Component/s: Java Broker

> Client does not ensure connection is closed before attempting failover
> ----------------------------------------------------------------------
>
>                 Key: QPID-1949
>                 URL: https://issues.apache.org/jira/browse/QPID-1949
>             Project: Qpid
>          Issue Type: Bug
>          Components: Java Broker
>    Affects Versions: M4, 0.5
>            Reporter: Martin Ritchie
>
> * Summary:
>  * A user has reported message loss from their application. On bouncing of
>  * the broker the 'lost' messages are delivered to the broker.
>  *
>  * Note:
>  * The client was using Spring so that may influence the situation.
>  *
>  * Issue:
>  * The log files show 7 instances of the following which result in 7
>  * missing messages.
>  *
>  * The client log files show:
>  *
>  * The broker log file shows:
>  *
>  *
>  * The 7 missing messages have delivery tags 5-11, which indicates that they
>  * are sequentially the next messages from the broker.
>  *
>  * The only way for the 'without a handler' log to occur is if the consumer
>  * has been removed from the look up table of the dispatcher.
>  * And the only way for the 'null message' log to occur on the broker is
>  * if the message does not exist in the unacked-map.
>  *
>  * The consumer is only removed from the list during session
>  * closure and failover.
>  *
>  * If the session was closed then the broker would requeue the unacked
>  * messages so the potential exists to have an empty map but the broker
>  * will not send a message out after the unacked map has been cleared.
>  *
>  * When failover occurs the _consumer map is cleared and the consumers are
>  * resubscribed. This is done without first stopping any existing
>  * dispatcher so there exists the potential to receive a message after
>  * the _consumer map has been cleared which is how the 'without a handler'
>  * log statement occurs.
>  *
>  * Scenario:
>  *
>  * Looking over logs the sequence that best fits the events is as follows:
>  * - Something causes Mina to be delayed, causing the WriteTimeoutException.
>  * - This exception is received by AMQProtocolHandler#exceptionCaught
>  * - As the WriteTimeoutException is an IOException this will cause
>  * sessionClosed to be called to start failover.
>  * + This is potentially the issue here. All IOExceptions are treated
>  * as connection failure events.
>  * - Failover Runs
>  * + Failover assumes that the previous connection has been closed.
>  * + Failover binds the existing objects (AMQConnection/Session) to the
>  * new connection objects.
>  * - Everything is reported as being successfully failed over.
>  * However, what is neglected is that the original connection has not
>  * been closed.
>  * + So what occurs is that the broker sends a message to the consumer on
>  * the original connection, as it was not notified of the client
>  * failing over.
>  * As the client failover reuses the original AMQSession and Dispatcher,
>  * the new messages the broker sends to the old consumer arrive at the
>  * client and are processed by the same AMQSession and Dispatcher.
>  * However, as the failover process cleared the _consumer map and
>  * resubscribed the consumers, the Dispatcher does not recognise the
>  * delivery tag and so logs the 'without a handler' message.
>  * - The Dispatcher then attempts to reject the message, however,
>  * + The AMQSession/Dispatcher pair have been swapped to using a new Mina
>  * ProtocolSession as part of the failover process so the reject is
>  * sent down the second connection. The broker receives the Reject
>  * request but as the Message was sent on a different connection the
>  * unacknowledged map is empty and a 'message is null' log message is
>  * produced.
>  *
>  * Test Strategy:
>  *
>  * It should be easy to demonstrate if we can send an IOException to
>  * AMQProtocolHandler#exceptionCaught and then try sending a message.
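
The scenario and test strategy above can be modelled in isolation. What follows is a minimal sketch under stated assumptions, not the Qpid code: FailoverSketch, BrokerConnection, ClientSession and the log strings are all hypothetical stand-ins, and the handler lookup is keyed by a consumer tag purely as a modelling choice. It shows a session that fails over without closing the old connection, a late message arriving on that old connection, and the reject being sent down the new one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the scenario. All names are hypothetical; none of this is
// the Qpid client or broker API, and the log strings are stand-ins.
class FailoverSketch {

    /** Broker-side view of one connection: delivery tag -> unacked message. */
    static final class BrokerConnection {
        final Map<Long, String> unacked = new HashMap<>();
        boolean open = true;
    }

    /** Client session that survives failover and is rebound to a new connection. */
    static final class ClientSession {
        // Assumption: handlers are looked up by consumer tag.
        final Map<Integer, String> consumers = new HashMap<>();
        final List<String> log;
        BrokerConnection current;

        ClientSession(BrokerConnection initial, List<String> log) {
            this.current = initial;
            this.log = log;
            consumers.put(1, "listener");
        }

        void failover(BrokerConnection replacement, boolean closeOldFirst) {
            if (closeOldFirst) {
                current.open = false; // the step the report says is missing
            }
            consumers.clear();
            consumers.put(2, "listener"); // resubscribed on the new connection
            current = replacement;
        }

        void onMessage(int consumerTag, long deliveryTag) {
            if (consumers.containsKey(consumerTag)) {
                log.add("client: delivered " + deliveryTag);
                return;
            }
            log.add("client: message " + deliveryTag + " without a handler");
            // The reject travels down whichever connection is current now.
            if (current.unacked.remove(deliveryTag) == null) {
                log.add("broker: message " + deliveryTag + " is null");
            }
        }
    }

    /** Broker pushes a message for the old subscription down the given connection. */
    static void deliver(BrokerConnection conn, ClientSession session,
                        int consumerTag, long deliveryTag, List<String> log) {
        if (!conn.open) {
            log.add("broker: message " + deliveryTag + " stays queued");
            return;
        }
        conn.unacked.put(deliveryTag, "payload");
        session.onMessage(consumerTag, deliveryTag);
    }

    /** Runs one failover followed by a late delivery on the old connection. */
    static List<String> simulate(boolean closeOldFirst) {
        List<String> log = new ArrayList<>();
        BrokerConnection oldConn = new BrokerConnection();
        BrokerConnection newConn = new BrokerConnection();
        ClientSession session = new ClientSession(oldConn, log);

        session.failover(newConn, closeOldFirst);
        // The broker was never told about the failover, so it still uses
        // the old connection and the old consumer tag.
        deliver(oldConn, session, 1, 5L, log);
        return log;
    }

    public static void main(String[] args) {
        simulate(false).forEach(System.out::println);
        simulate(true).forEach(System.out::println);
    }
}
```

In this model, simulate(false) produces the 'without a handler' line followed by the 'is null' line, while the message itself remains in the old connection's unacked map. simulate(true) produces only the 'stays queued' line, because the old connection is closed before the session is rebound.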

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org