Posted to dev@qpid.apache.org by "Gordon Sim (JIRA)" <qp...@incubator.apache.org> on 2009/01/09 17:51:59 UTC

[jira] Commented: (QPID-1567) Queue replication (asynchronous) between two sites

    [ https://issues.apache.org/jira/browse/QPID-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662423#action_12662423 ] 

Gordon Sim commented on QPID-1567:
----------------------------------

==Design Notes:==

The core pattern is to represent enqueue and dequeue events as
messages on a queue that can be consumed and processed. The aim here
is to reuse as much of the messaging platform itself as possible.

A replication queue is defined on the primary site, into which we
place events representing the enqueues and dequeues as they occur. A
federation link is then established between this replication queue and
an exchange set up on the DR site, which processes these events and
performs the corresponding actions.
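
As a rough illustration of the pattern described above, here is a
minimal in-memory sketch in Python. It is not the broker code: the
names Event, PrimaryQueue and ReplicatingExchange, the message_id
field and the replicate_dequeues flag are all hypothetical, and a
plain deque stands in for both the replication queue and the
federation link.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str        # "enqueue" or "dequeue"
    queue: str       # name of the queue the event applies to
    message_id: int  # hypothetical identifier for the affected message
    body: str = ""   # payload, only meaningful for enqueue events


class PrimaryQueue:
    """Queue on the primary site that records its own enqueues/dequeues."""

    def __init__(self, name, replication_queue, replicate_dequeues=True):
        self.name = name
        self.messages = deque()
        self.replication_queue = replication_queue
        self.replicate_dequeues = replicate_dequeues  # assumed per-queue option
        self._next_id = 0

    def enqueue(self, body):
        self._next_id += 1
        self.messages.append((self._next_id, body))
        self.replication_queue.append(
            Event("enqueue", self.name, self._next_id, body))

    def dequeue(self):
        message_id, body = self.messages.popleft()
        if self.replicate_dequeues:
            self.replication_queue.append(
                Event("dequeue", self.name, message_id))
        return body


class ReplicatingExchange:
    """Stand-in for the DR-side exchange that applies events to replicas."""

    def __init__(self):
        self.replicas = {}  # queue name -> {message_id: body}

    def route(self, event):
        replica = self.replicas.setdefault(event.queue, {})
        if event.kind == "enqueue":
            replica[event.message_id] = event.body
        elif event.kind == "dequeue":
            replica.pop(event.message_id, None)


replication_queue = deque()
orders = PrimaryQueue("orders", replication_queue)
orders.enqueue("a")
orders.enqueue("b")
orders.dequeue()

dr = ReplicatingExchange()
while replication_queue:  # stands in for the federation link
    dr.route(replication_queue.popleft())
```

After the events are drained, the DR replica of "orders" holds only
the second message, matching the primary. Constructing the queue with
replicate_dequeues=False would leave both messages on the replica,
which corresponds to replicating enqueues only.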

The same approach works between clustered and unclustered brokers. In
the case of clustered brokers, a link will be terminated whenever the
node at one end of it fails and a replacement link will need to be
established.
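
A minimal sketch of how a replacement link might be chosen, assuming
the list of cluster members is known in advance. The establish_link
helper, the connect callback and the node names are hypothetical and
do not reflect any existing federation API.

```python
class LinkFailed(Exception):
    """Raised when no member of the remote cluster can be reached."""


def establish_link(members, connect):
    """Return (member, link) for the first member that accepts a link."""
    for member in members:
        try:
            return member, connect(member)
        except ConnectionError:
            continue  # this node is down, try the next cluster member
    raise LinkFailed("no cluster member reachable")


# Simulated cluster: 'alive' tracks which nodes currently accept links.
alive = {"dr-node-1", "dr-node-2"}


def connect(member):
    if member not in alive:
        raise ConnectionError(member)
    return f"link->{member}"


members = ["dr-node-1", "dr-node-2"]
member, link = establish_link(members, connect)

alive.discard("dr-node-1")  # the node at the far end of the link fails
member2, link2 = establish_link(members, connect)
```

In this sketch the second call falls through to dr-node-2 once
dr-node-1 has failed. A real implementation would also need to detect
the failure and handle loss of the node at the originating end.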

By using acknowledgments to ensure at-least-once delivery and event
sequence numbers to detect and ignore duplicate deliveries, we can
ensure that replication is reliable. (Though of course the
asynchronous nature means there is always a finite window of in-doubt
replicated events that might be lost if the primary site fails
completely.)
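
The duplicate detection can be sketched as follows. This assumes that
events arrive in sequence order and that a redelivery reuses the
original sequence number; the EventReceiver class and the event
strings are hypothetical.

```python
class EventReceiver:
    """DR-side consumer that applies each replication event at most once."""

    def __init__(self):
        self.last_applied = 0  # highest sequence number applied so far
        self.applied = []      # events actually applied, in order

    def receive(self, sequence, event):
        """Apply the event unless it is a duplicate; always acknowledge."""
        if sequence > self.last_applied:
            self.applied.append(event)
            self.last_applied = sequence
        # Duplicates are acknowledged too, so the sender stops resending.
        return sequence


receiver = EventReceiver()
deliveries = [
    (1, "enqueue m1"),
    (2, "enqueue m2"),
    (2, "enqueue m2"),  # redelivered because the first ack was lost
    (3, "dequeue m1"),
]
acks = [receiver.receive(seq, ev) for seq, ev in deliveries]
```

The redelivered event with sequence number 2 is acknowledged but not
applied a second time, so at-least-once delivery on the link results
in each event being applied once on the DR site.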

==Implementation tasks:==

* add logic to broker::Queue to add 'event' messages to a replication queue
  for each enqueue and (optionally) dequeue

* create new exchange type to process messages representing
  enqueue/dequeue events and perform the relevant actions on the
  target queues

* implement resiliency of federation links between two clusters, such that if
  the node on either end goes down the link is re-established to/from
  another member of that cluster

* implement acknowledgements over links

==Performance impact:==

Impact of the above on general cluster throughput needs to be evaluated to determine if it's acceptable.



> Queue replication (asynchronous) between two sites
> --------------------------------------------------
>
>                 Key: QPID-1567
>                 URL: https://issues.apache.org/jira/browse/QPID-1567
>             Project: Qpid
>          Issue Type: New Feature
>          Components: C++ Broker
>    Affects Versions: M4
>            Reporter: Gordon Sim
>            Assignee: Gordon Sim
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Need to replicate queue state between two clusters (or potentially two
> non-clustered brokers); one of these is the primary site, the second
> is a disaster recovery site. They will likely be connected over a WAN.
> Two modes are required to be supported concurrently on a per queue
> basis: 
> (i) only messages flowing into the queue need to be replicated; the DR
> site will have active replicas of the consumers of such queues that
> will be receiving and consuming them.
> (ii) full queue state needs to be replicated, i.e. both the enqueuing
> and dequeuing of messages on the primary site need to be reflected in
> the DR site's replica of the queue.
