You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by Paul Prescod <pr...@gmail.com> on 2010/04/06 09:11:04 UTC

How do vector clocks and conflicts work?

This may be the blind leading the blind...

On Mon, Apr 5, 2010 at 11:54 PM, Tatu Saloranta <ts...@gmail.com>wrote:
>...


> I think the key is that this is not automatic -- there is no general
> mechanism for aggregating distinct modifications. Point being that you
> could choose one amongst right answers, but not what to do with
> concurrent modifications. So what is done instead is have
> application-specific resolution strategy which makes use of semantics
> of operations, to know how to combine such concurrent modifications
> into "correct" answer.


I agree with all of that.


> I don't know if this is trivial for case of
> counter increments: especially since two concurrent increments give
> same new value; yet correct combined result would be one higher (both
> used base, added one).
>

As long as the conflict resolver knows that two writers each tried to
increment, then it can increment twice. The conflict resolver must know
about the semantics of "increment" or "decrement" or "string append" or
"binary patch" or whatever other merge strategy you choose. You'll register
your "strategy" with Cassandra and it will apply it. Presumably it will also
maintain enough context about what you were trying to accomplish to allow
the merge strategy plugin to do it properly.


> That is to say, my understanding was that vector clocks would be
> required but not sufficient for reconciliation of concurrent value
> updates.
>

I agree. They are necessary, but not sufficient. The other half is the
"merge strategy plugin" thing, which is analogous to custom comparators in
Cassandra today.

In CASSANDRA-580, Pedro Gomes talks about the plugins like this: "I suppose
for the beginning of the discussion that some sort of interface will be
implemented to allow pluggable logic to be added to the server, personalized
scripts were an idea, I have heard. "

Kevin Kakugawa replies that they'll just use Java class libraries as a first
pass.

 Paul Prescod

Re: How do vector clocks and conflicts work?

Posted by gabriele renzi <rf...@gmail.com>.
On Tue, Apr 6, 2010 at 9:11 AM, Paul Prescod <pr...@gmail.com> wrote:
> This may be the blind leading the blind...
> On Mon, Apr 5, 2010 at 11:54 PM, Tatu Saloranta <ts...@gmail.com>
> wrote:
>>...
>
>>
>> I think the key is that this is not automatic -- there is no general
>> mechanism for aggregating distinct modifications. Point being that you
>> could choose one amongst right answers, but not what to do with
>> concurrent modifications. So what is done instead is have
>> application-specific resolution strategy which makes use of semantics
>> of operations, to know how to combine such concurrent modifications
>> into "correct" answer.
>
> I agree with all of that.
>
>>
>> I don't know if this is trivial for case of
>> counter increments: especially since two concurrent increments give
>> same new value; yet correct combined result would be one higher (both
>> used base, added one).
>
> As long as the conflict resolver knows that two writers each tried to
> increment, then it can increment twice. The conflict resolver must know
> about the semantics of "increment" or "decrement" or "string append" or
> "binary patch" or whatever other merge strategy you choose. You'll register
> your "strategy" with Cassandra and it will apply it. Presumably it will also
> maintain enough context about what you were trying to accomplish to allow
> the merge strategy plugin to do it properly.

as long as operations are commutative, isn't the conflict resolution
simply "apply all" ? A large number of useful operations can be
implemented this way (numeric incr/decr, set ops etc)

Re: How do vector clocks and conflicts work?

Posted by Mike Malone <mi...@simplegeo.com>.
On Tue, Apr 6, 2010 at 11:03 AM, Tatu Saloranta <ts...@gmail.com>wrote:

> On Tue, Apr 6, 2010 at 8:45 AM, Mike Malone <mi...@simplegeo.com> wrote:
> >> As long as the conflict resolver knows that two writers each tried to
> >> increment, then it can increment twice. The conflict resolver must know
> >> about the semantics of "increment" or "decrement" or "string append" or
> >> "binary patch" or whatever other merge strategy you choose. You'll
> register
> >> your "strategy" with Cassandra and it will apply it. Presumably it will
> also
> >> maintain enough context about what you were trying to accomplish to
> allow
> >> the merge strategy plugin to do it properly.
> >>
> >>>
> >>> That is to say, my understanding was that vector clocks would be
> >>> required but not sufficient for reconciliation of concurrent value
> >>> updates.
> >
> > The way I envisioned "eventually consistent counters" working would
> require
> > something slightly more sophisticated... but not too bad. As incr/decr
> > operations happen on distributed nodes, each node would keep a (vector
> > clock, delta) tuple for that node's local changes. When a client fetched
> the
> > value of the counter the vector clock deltas and the reconciled count
> would
> > be combined into a single result. Similarly, when a replication /
> > hinted-handoff / read-repair reconciliation occurred the counts would be
> > merged into a single (vector clock, count) tuple.
> > Maybe there's a more elegant solution, but that's how I had been thinking
> > about this particular problem.
>
> I doubt there is any simple and elegant solution -- if there was, it
> would have been invented in 50s if there was. :-)
>
> Given this, yes, something along these lines sounds realistic. It also
> sounds like implementation would greatly benefit (if not require)
> foundational support from core, as opposed to being done outside of
> Cassandra (which I understand you are suggesting). I wasn't sure if
> the idea was to try to do this completely separate (aside from vector
> clock support).
>

I'd probably put it in core. Or at least put some more generic support for
this sort of conflict resolution in core. I'm looking forward to seeing
Digg's patch for this stuff.

Mike

Re: How do vector clocks and conflicts work?

Posted by Tatu Saloranta <ts...@gmail.com>.
On Tue, Apr 6, 2010 at 8:45 AM, Mike Malone <mi...@simplegeo.com> wrote:
>> As long as the conflict resolver knows that two writers each tried to
>> increment, then it can increment twice. The conflict resolver must know
>> about the semantics of "increment" or "decrement" or "string append" or
>> "binary patch" or whatever other merge strategy you choose. You'll register
>> your "strategy" with Cassandra and it will apply it. Presumably it will also
>> maintain enough context about what you were trying to accomplish to allow
>> the merge strategy plugin to do it properly.
>>
>>>
>>> That is to say, my understanding was that vector clocks would be
>>> required but not sufficient for reconciliation of concurrent value
>>> updates.
>
> The way I envisioned "eventually consistent counters" working would require
> something slightly more sophisticated... but not too bad. As incr/decr
> operations happen on distributed nodes, each node would keep a (vector
> clock, delta) tuple for that node's local changes. When a client fetched the
> value of the counter the vector clock deltas and the reconciled count would
> be combined into a single result. Similarly, when a replication /
> hinted-handoff / read-repair reconciliation occurred the counts would be
> merged into a single (vector clock, count) tuple.
> Maybe there's a more elegant solution, but that's how I had been thinking
> about this particular problem.

I doubt there is any simple and elegant solution -- if there was, it
would have been invented in 50s if there was. :-)

Given this, yes, something along these lines sounds realistic. It also
sounds like implementation would greatly benefit (if not require)
foundational support from core, as opposed to being done outside of
Cassandra (which I understand you are suggesting). I wasn't sure if
the idea was to try to do this completely separate (aside from vector
clock support).

-+ Tatu +-

Re: How do vector clocks and conflicts work?

Posted by Mike Malone <mi...@simplegeo.com>.
>
> As long as the conflict resolver knows that two writers each tried to
> increment, then it can increment twice. The conflict resolver must know
> about the semantics of "increment" or "decrement" or "string append" or
> "binary patch" or whatever other merge strategy you choose. You'll register
> your "strategy" with Cassandra and it will apply it. Presumably it will also
> maintain enough context about what you were trying to accomplish to allow
> the merge strategy plugin to do it properly.
>
>
>> That is to say, my understanding was that vector clocks would be
>> required but not sufficient for reconciliation of concurrent value
>> updates.
>
>
The way I envisioned "eventually consistent counters" working would require
something slightly more sophisticated... but not too bad. As incr/decr
operations happen on distributed nodes, each node would keep a (vector
clock, delta) tuple for that node's local changes. When a client fetched the
value of the counter the vector clock deltas and the reconciled count would
be combined into a single result. Similarly, when a replication /
hinted-handoff / read-repair reconciliation occurred the counts would be
merged into a single (vector clock, count) tuple.

Maybe there's a more elegant solution, but that's how I had been thinking
about this particular problem.

Mike