Posted to dev@qpid.apache.org by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org> on 2012/05/01 15:13:49 UTC

[jira] [Commented] (QPID-3963) A federated broker may not reconnect to a remote cluster on link failure.

    [ https://issues.apache.org/jira/browse/QPID-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265793#comment-13265793 ] 

jiraposter@reviews.apache.org commented on QPID-3963:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4846/#review7420
-----------------------------------------------------------

Ship it!


- Alan


On 2012-04-26 19:19:31, Kenneth Giusti wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4846/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-04-26 19:19:31)
bq.  
bq.  
bq.  Review request for qpid, Alan Conway and Gordon Sim.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Still a WIP, but I wanted early feedback as I'm not too experienced with the subscription management code involved (completely stolen from Alan).
bq.  
bq.  This patch allows the Link to subscribe to the remote broker's amq.failover exchange - if it exists.  This allows the Link to be updated dynamically should the remote broker be part of a cluster, and the cluster membership changes.
bq.  
bq.  Light testing against a cluster confirms that this patch fixes qpid-3963.  Testing against a non-cluster remote causes the remote to log the following error, but otherwise behaves ok:
bq.  
bq.  2012-04-23 16:45:27 error Execution exception: not-found: Exchange not found: amq.failover (../../../qpid/cpp/src/qpid/broker/ExchangeRegistry.cpp:101)
bq.  
bq.  
bq.  This addresses bug qpid-3963.
bq.      https://issues.apache.org/jira/browse/qpid-3963
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    /trunk/qpid/cpp/xml/cluster.xml 1329301 
bq.    /trunk/qpid/cpp/src/qpid/cluster/Connection.h 1329301 
bq.    /trunk/qpid/cpp/src/qpid/cluster/Connection.cpp 1329301 
bq.    /trunk/qpid/cpp/src/qpid/cluster/UpdateClient.cpp 1329301 
bq.    /trunk/qpid/cpp/src/qpid/broker/LinkRegistry.cpp 1329301 
bq.    /trunk/qpid/cpp/src/qpid/broker/Link.cpp 1329301 
bq.    /trunk/qpid/cpp/src/qpid/broker/ExchangeRegistry.cpp 1329301 
bq.    /trunk/qpid/cpp/src/qpid/broker/Link.h 1329301 
bq.  
bq.  Diff: https://reviews.apache.org/r/4846/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  minimal.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Kenneth
bq.  
bq.


                
> A federated broker may not reconnect to a remote cluster on link failure.
> -------------------------------------------------------------------------
>
>                 Key: QPID-3963
>                 URL: https://issues.apache.org/jira/browse/QPID-3963
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker, C++ Clustering
>    Affects Versions: 0.14
>            Reporter: Ken Giusti
>            Assignee: Ken Giusti
>
> When a broker is federated with a cluster, the cluster informs the broker of the failover addresses that are valid for the cluster.  Should a cluster member fail, the broker will reconnect to another member of that cluster.
> However, the federated broker only queries the cluster for these failover addresses when it first connects to the cluster.  Should the cluster topology change, the federated broker's list of available failover addresses will become out-of-date.  This can prevent the broker from correctly re-connecting on failure of a cluster member.
> Example:
> Given a cluster with members C1 and C2, and a separate broker B, federate B to connect to C1.  On connecting to C1, B learns the address of C2 as an alternate failover address.  Now shut down C1.  B will reconnect to C2, and learn that C2 is the only member of the cluster (i.e. no failover addresses).  After B connects, restart C1 and let it join the cluster.  Then shut down C2.  Since B does not know that C1 has become available again, B will not attempt to reconnect to it.  Instead, it tries to reconnect to C2 indefinitely.
> The expected behavior would be to have B reconnect to C1.
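
The C1/C2/B scenario above can be replayed with a small stand-alone model. This is only an illustrative sketch, not the broker code: Cluster, FederatedLink, onMembershipChange and survivesScenario are hypothetical names invented for this example, and the real Link class and amq.failover subscription work differently. The model only contrasts a link that snapshots failover addresses at connect time with one that also refreshes them when membership changes.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical model of a cluster: just the set of members currently up.
struct Cluster {
    std::set<std::string> up;
    bool isUp(const std::string& member) const { return up.count(member) != 0; }
};

// Hypothetical model of broker B's link (NOT the real qpid::broker::Link).
class FederatedLink {
public:
    // trackMembership == false: failover list is refreshed only on connect.
    // trackMembership == true:  list is also refreshed on membership changes.
    explicit FederatedLink(bool trackMembership)
        : trackMembership_(trackMembership) {}

    // Connect to a live member and record the other members as failover addresses.
    void connect(const Cluster& cluster, const std::string& member) {
        assert(cluster.isUp(member));  // can only connect to a running member
        current_ = member;
        refresh(cluster);
    }

    // Stand-in for receiving a membership update from the cluster.
    void onMembershipChange(const Cluster& cluster) {
        if (trackMembership_) refresh(cluster);
    }

    // The current connection dropped: try each known failover address once.
    // Returns false if no known address is reachable (B is stuck retrying).
    bool onLinkFailure(const Cluster& cluster) {
        for (const std::string& address : failover_) {
            if (cluster.isUp(address)) {
                connect(cluster, address);
                return true;
            }
        }
        return false;
    }

    const std::string& current() const { return current_; }

private:
    // Failover addresses are all live members except the one we are connected to.
    void refresh(const Cluster& cluster) {
        failover_ = cluster.up;
        failover_.erase(current_);
    }

    bool trackMembership_;
    std::string current_;
    std::set<std::string> failover_;
};

// Replays the scenario from the issue; returns true if B ends up on C1.
bool survivesScenario(bool trackMembership) {
    Cluster cluster;
    cluster.up.insert("C1");
    cluster.up.insert("C2");

    FederatedLink b(trackMembership);
    b.connect(cluster, "C1");                      // B learns C2 as failover

    cluster.up.erase("C1");                        // shut down C1
    b.onMembershipChange(cluster);
    if (!b.onLinkFailure(cluster)) return false;   // B reconnects to C2

    cluster.up.insert("C1");                       // C1 rejoins the cluster
    b.onMembershipChange(cluster);

    cluster.up.erase("C2");                        // shut down C2
    b.onMembershipChange(cluster);
    return b.onLinkFailure(cluster) && b.current() == "C1";
}
```

In this model, survivesScenario(false) leaves B with an empty failover list after it moves to C2, so the final reconnect fails, while survivesScenario(true) lets B find C1 again. That is the difference the issue describes between querying failover addresses once and keeping them up to date.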
