Posted to dev@qpid.apache.org by "Alan Conway (JIRA)" <qp...@incubator.apache.org> on 2009/11/25 21:48:39 UTC

[jira] Updated: (QPID-2220) Assist manual recovery from a complete persistent cluster crash.

     [ https://issues.apache.org/jira/browse/QPID-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Conway updated QPID-2220:
------------------------------

    Description: 
If every member of a persistent cluster crashes then manual intervention is required to identify which store is most up-to-date, so it can be used to recover. We need to provide tools to assist in this identification.

The cluster can save a config-change counter with each config change (cluster membership change). In recovery, the broker with the highest config-change counter has the best store. However, if the last brokers in the cluster crash so close together that none can record a config change, we need an additional tiebreaker.

The store at http://qpidcomponents.org/download.html#persistence maintains a global Record Identifier (RID), a 64-bit value that is incremented for each enqueue and dequeue. If the cluster stores (config-change, RID) pairs, then in recovery we can use the difference between the store's actual RID and the RID saved at the last config change as a tiebreaker.
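The comparison described above could be sketched roughly as follows. This is only an illustration, not code from the broker: the StoreStatus struct, its field names, and the helper functions are hypothetical, and it assumes that a larger RID delta since the last recorded config change means a more up-to-date store.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical summary of what one broker's store reports during recovery.
struct StoreStatus {
    std::string broker;          // broker name, for illustration only
    uint64_t configChange;       // last recorded config-change counter
    uint64_t ridAtConfigChange;  // RID saved together with that config change
    uint64_t actualRid;          // RID found in the store at recovery time
};

// Number of store changes made after the last recorded config change.
// Unsigned subtraction is modulo 2^64, so the result is still correct
// if the RID wrapped around to 0 in between.
inline uint64_t changesSinceConfig(const StoreStatus& s) {
    return s.actualRid - s.ridAtConfigChange;
}

// True if a is a better recovery candidate than b: the higher config-change
// counter wins, and the RID delta is used only to break ties.
inline bool betterStore(const StoreStatus& a, const StoreStatus& b) {
    if (a.configChange != b.configChange)
        return a.configChange > b.configChange;
    return changesSinceConfig(a) > changesSinceConfig(b);
}

// Returns the best candidate, or nullptr if the list is empty.
inline const StoreStatus* bestStore(const std::vector<StoreStatus>& stores) {
    const StoreStatus* best = nullptr;
    for (const StoreStatus& s : stores)
        if (!best || betterStore(s, *best)) best = &s;
    return best;
}
```

Comparing deltas rather than absolute RIDs avoids assuming that every broker's store started counting from the same value.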

Proposed change to MessageStore API:
  /**
   * Returns a monotonically increasing value reflecting the number of changes to the store.
   * The value can wrap around to 0.
   * Stores need not implement this function; they can simply return 0.
   */
  uint64_t getChangeCounter();

The default implementation just returns 0, in which case the cluster must fall back to relying on config-change counts alone.
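As a rough sketch of how the default and an overriding store might look: the MessageStore class below is a simplified stand-in, not the broker's real interface, and CountingStore is a hypothetical store invented for illustration.

```cpp
#include <cstdint>

// Simplified stand-in for the MessageStore interface (assumption: the real
// class has many more members; only the proposed function is shown).
class MessageStore {
  public:
    virtual ~MessageStore() {}
    // Default: the store does not track changes, so it reports 0.
    virtual uint64_t getChangeCounter() { return 0; }
};

// Hypothetical store that increments a counter on each enqueue and dequeue,
// mirroring the RID behaviour described above.
class CountingStore : public MessageStore {
  public:
    CountingStore() : rid(0) {}
    void enqueue() { ++rid; }
    void dequeue() { ++rid; }
    uint64_t getChangeCounter() override { return rid; }
  private:
    uint64_t rid;  // wraps to 0 on overflow, as the API comment allows
};
```

If every store returns 0, the difference between the actual counter and the saved counter is 0 on every broker, so the tiebreaker cannot separate them and only the config-change counts remain.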

  was:
If every member of a persistent cluster crashes then manual intervention is required to identify which store is most up-to-date, so it can be used to recover.
We need to provide tools to assist in this identification.

The cluster can save a config-change counter with each config change. In recovery, the broker with the highest config-change counter has the best store. However if the last brokers in the cluster crash so close together that none can record a config-change we need an additional decider.

The store at http://qpidcomponents.org/download.html#persistence maintains a global counter called the RecordIdentifier (RID) that is incremented for each enqueue and dequeue. If the cluster stores  (config-change,RID) pairs then in recovery we can use actual-RID - RID at config-change as a tiebreaker.

Is it reasonable to provide access to this counter in the generic MessageStore API? Stores that don't implement it can simply return 0, and the cluster must fall back to relying on config-change counts.


> Assist manual recovery from a complete persistent cluster crash.
> ----------------------------------------------------------------
>
>                 Key: QPID-2220
>                 URL: https://issues.apache.org/jira/browse/QPID-2220
>             Project: Qpid
>          Issue Type: Improvement
>          Components: C++ Broker
>    Affects Versions: 0.5
>            Reporter: Alan Conway
>            Assignee: Alan Conway

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org