Posted to commits@cassandra.apache.org by "Kelvin Kakugawa (JIRA)" <ji...@apache.org> on 2009/11/24 21:09:39 UTC

[jira] Created: (CASSANDRA-580) vector clock support

vector clock support
--------------------

                 Key: CASSANDRA-580
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
             Project: Cassandra
          Issue Type: New Feature
          Components: Core
         Environment: N/A
            Reporter: Kelvin Kakugawa


Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment:     (was: 580-interface-1-add-vector-clock.diff)

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7
>
>         Attachments: 580-context-v4.patch, 580-thrift-v3.patch, 580-thrift-v6.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Issue Comment Edited: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793938#action_12793938 ] 

Kelvin Kakugawa edited comment on CASSANDRA-580 at 12/23/09 5:56 AM:
---------------------------------------------------------------------

The patch adds a new package:
db/context

which contains:
IContext
VersionVectorContext

IContext is just a general interface to manipulate context byte[].  Basically, create(), update() and reconcile().

create(): create a new/empty context.
update(): update this context w/ the local node's id.
reconcile(): pass in a list of context-value pairs and it'll return a merged context (that supersedes all the passed in contexts) and a list of values that it couldn't automatically reconcile.
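
To make the three operations above concrete, here is a minimal Java sketch of what such an interface could look like. It is not the code from the patch: the names ContextSketch and ReconcileResult, the localNodeId parameter, and the use of Map.Entry for context-value pairs are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;

// Hypothetical shape of a context interface; the real IContext may differ.
interface ContextSketch {
    // Returns a new, empty context.
    byte[] create();

    // Returns a context that records one more update by the local node.
    byte[] update(byte[] context, byte[] localNodeId);

    // Takes (context, value) pairs and returns the merged context plus
    // the values that could not be reconciled automatically.
    ReconcileResult reconcile(List<Map.Entry<byte[], byte[]>> contextValuePairs);
}

// Hypothetical holder for the two things reconcile() hands back.
final class ReconcileResult {
    final byte[] mergedContext;
    final List<byte[]> unresolvedValues;

    ReconcileResult(byte[] mergedContext, List<byte[]> unresolvedValues) {
        this.mergedContext = mergedContext;
        this.unresolvedValues = unresolvedValues;
    }
}
```

Keeping the context as an opaque byte[] means callers never need to know which versioning scheme sits behind the interface.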

VersionVectorContext is a version vector implementation.

The VV format is a concatenated list of node id (IPv4's 4 bytes), count (int), and timestamp (long) tuples in a byte[].
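
As a rough illustration of that layout, the sketch below packs (node id, count, timestamp) tuples into a byte[] at 16 bytes per tuple. The class and method names are hypothetical, and big-endian byte order (the java.nio.ByteBuffer default) is an assumption; the comment above does not say which order the patch uses.

```java
import java.nio.ByteBuffer;

// Hypothetical helper showing the 4 + 4 + 8 byte tuple layout.
final class VvLayoutSketch {
    static final int ID_LENGTH = 4;        // IPv4 address bytes
    static final int COUNT_LENGTH = 4;     // int
    static final int TIMESTAMP_LENGTH = 8; // long
    static final int STEP_LENGTH = ID_LENGTH + COUNT_LENGTH + TIMESTAMP_LENGTH;

    // Concatenates one tuple per node, in the order given.
    static byte[] encode(byte[][] ids, int[] counts, long[] timestamps) {
        ByteBuffer buf = ByteBuffer.allocate(ids.length * STEP_LENGTH);
        for (int i = 0; i < ids.length; i++) {
            buf.put(ids[i], 0, ID_LENGTH).putInt(counts[i]).putLong(timestamps[i]);
        }
        return buf.array();
    }

    static int tupleCount(byte[] context) {
        return context.length / STEP_LENGTH;
    }

    // Reads the count field of the tuple at the given index.
    static int countAt(byte[] context, int index) {
        return ByteBuffer.wrap(context).getInt(index * STEP_LENGTH + ID_LENGTH);
    }

    // Reads the timestamp field of the tuple at the given index.
    static long timestampAt(byte[] context, int index) {
        return ByteBuffer.wrap(context)
                .getLong(index * STEP_LENGTH + ID_LENGTH + COUNT_LENGTH);
    }
}
```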

create(): returns an empty byte[].
update(): will look for the local node's tuple in the byte[], increment its count, then prepend it to the front of the byte[] w/ an updated timestamp, so that the byte[] is always in timestamp descending order.
reconcile(): looks for all disjoint (incompatible) VVs and collapses all VVs that are a subset of another VV in the list.  (implementation note: if 2 VVs are equal, but their values are not equivalent, both values will be added to the set of values that need to be manually reconciled.  It seems inefficient, though, so when I go through the rest of the system, I'm going to see if I can avoid this check, since it's a problem that can only happen on the local node.)
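
The update() step described above could be sketched as follows. This is only an illustration under stated assumptions: the class name, the explicit "now" parameter, and the big-endian encoding are hypothetical, and the timestamp-descending order only holds if "now" is newer than every timestamp already in the context.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch of update(): bump the local node's count and move
// its tuple to the front with a fresh timestamp.
final class VvUpdateSketch {
    static final int ID_LENGTH = 4;
    static final int STEP_LENGTH = 16; // 4-byte id + int count + long timestamp

    static byte[] update(byte[] context, byte[] localId, long now) {
        // Look for the local node's tuple.
        int found = -1;
        for (int off = 0; off < context.length; off += STEP_LENGTH) {
            byte[] id = Arrays.copyOfRange(context, off, off + ID_LENGTH);
            if (Arrays.equals(id, localId)) {
                found = off;
                break;
            }
        }
        int count = (found < 0) ? 0 : ByteBuffer.wrap(context).getInt(found + ID_LENGTH);

        // The context only grows when the local node was not present yet.
        int newLength = context.length + (found < 0 ? STEP_LENGTH : 0);
        ByteBuffer out = ByteBuffer.allocate(newLength);

        // Prepend the refreshed tuple.
        out.put(localId, 0, ID_LENGTH).putInt(count + 1).putLong(now);

        // Copy the remaining tuples, skipping the old local tuple if any.
        if (found < 0) {
            out.put(context);
        } else {
            out.put(context, 0, found);
            out.put(context, found + STEP_LENGTH, context.length - found - STEP_LENGTH);
        }
        return out.array();
    }
}
```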

VersionVectorContext helper methods of interest:
compareContexts(): sorts contexts by id, then steps through both contexts to determine pairwise: equality, superset, subset, disjoint.
mergeContexts(): creates a map from node id to count-timestamp pairs, then creates a timestamp-sorted array and pulls off up to the max entries to form the new merged context.
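
The four pairwise outcomes of compareContexts() can be illustrated with a small Java sketch. For brevity it works on maps from node id to count rather than sorting the byte[] by id and stepping through both, so it is a different technique than the one described above; the names are hypothetical, and a node missing from one side is assumed to have a count of 0.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical map-based comparison of two version vectors.
final class VvCompareSketch {
    enum Relationship { EQUAL, SUPERSET, SUBSET, DISJOINT }

    static Relationship compare(Map<String, Integer> left, Map<String, Integer> right) {
        boolean leftAhead = false;
        boolean rightAhead = false;

        // Visit every node id that appears on either side.
        Set<String> ids = new HashSet<>(left.keySet());
        ids.addAll(right.keySet());
        for (String id : ids) {
            int l = left.getOrDefault(id, 0);
            int r = right.getOrDefault(id, 0);
            if (l > r) {
                leftAhead = true;
            } else if (r > l) {
                rightAhead = true;
            }
        }

        if (leftAhead && rightAhead) {
            return Relationship.DISJOINT; // each side has updates the other lacks
        }
        if (leftAhead) {
            return Relationship.SUPERSET; // left supersedes right
        }
        if (rightAhead) {
            return Relationship.SUBSET;   // right supersedes left
        }
        return Relationship.EQUAL;
    }
}
```

In reconcile(), a SUBSET or SUPERSET result means one version can be dropped, while DISJOINT means both have to be kept for conflict resolution.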




[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784497#action_12784497 ] 

Jonathan Ellis commented on CASSANDRA-580:
------------------------------------------

That makes sense, thanks.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment: 580-interface-1-add-vector-clock.diff

proposed vector clock interface

The vector clock is returned by gets and required for insert/remove.  I assume that server-side conflict resolution will be implemented, so I only pass back the definitive version of the row.  If client-side conflict resolution will be implemented in the future, another data structure will be needed that encapsulates a list of conflicts and, maybe, a summary vector clock to be used for an insert/remove operation that resolves the conflict.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment:     (was: 580-interface-2-add-vector-clock.diff)



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796416#action_12796416 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

np, I'll clean up the API.  i.e. make internal methods private and move public methods to FBUtilities.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment: 580-thrift-v4.patch

modified interface to use opaque context for versioning



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804753#action_12804753 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

np, I'll remove the historical patches.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Pedro Gomes (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796756#action_12796756 ] 

Pedro Gomes commented on CASSANDRA-580:
---------------------------------------

I have been following this issue for some time now and I'm curious how you will deal with reconciliation.
To start the discussion, I suppose some sort of interface will be implemented to allow pluggable logic to be added to the server; I have heard that custom scripts were one idea.

Is something planned? As a suggestion, how about the Java Scripting API with scripts stored in Cassandra?
Sorry if I seem hasty, but I will probably work on some dummy implementation as I need this for my work, and I don't want to diverge from a future release of Cassandra.
If you could shed some light on this, I would be grateful.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784493#action_12784493 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

Ah, I see what you were going after.  That's a more concise interface change.

My understanding is that the timestamp is included along w/ the logical clock to limit the potential growth of the vector.  Basically, whenever a new node updates a value, the vector size grows by one.  The problem is that if many different nodes happen to update a given value (for various reasons--failure scenarios, etc.), the vector could grow to an unmanageable length and it would keep that length forever.  So, the Dynamo authors chose to tag each update with a timestamp, so they could truncate the vector to only the last 10 nodes to update the value.  There is a possibility that an inconsistency could arise because of the truncation.  However, the paper said that, in practice, it was a non-issue.

In summary, the timestamp is not there to help resolve consistency problems, it's there to make the vector more manageable.
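
The truncation idea can be sketched in a few lines of Java: keep only the entries with the most recent timestamps, up to a fixed limit. The Entry class, the method name, and the maxEntries parameter are hypothetical and chosen for illustration; the limit is a parameter here rather than a fixed 10.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of timestamp-based truncation of a version vector.
final class VvTruncateSketch {
    static final class Entry {
        final String nodeId;
        final int count;
        final long timestamp;

        Entry(String nodeId, int count, long timestamp) {
            this.nodeId = nodeId;
            this.count = count;
            this.timestamp = timestamp;
        }
    }

    // Returns the maxEntries most recently updated entries, newest first.
    static List<Entry> truncate(List<Entry> entries, int maxEntries) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingLong((Entry e) -> e.timestamp).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(maxEntries, sorted.size())));
    }
}
```

The dropped entries are where the possible inconsistency comes from: once a node's count is gone, a later comparison can no longer tell whether that node's update was already seen.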



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784508#action_12784508 ] 

Stu Hood commented on CASSANDRA-580:
------------------------------------

Regarding the timestamp being necessary in a version vector: have you looked at Interval Tree Clocks? The paper is slightly over my head, but the algorithm is supposed to be a generalization of version vectors / vector clocks, and it has a natural solution for changing members: http://en.wikipedia.org/wiki/Version_vector#cite_note-5



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment:     (was: 580-thrift-v4.patch)



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment: 580-context-v4.patch

Moved utility methods in VersionVectorContext to FBUtilities.
Modified impl to use System.arraycopy().
Modified internal methods to be protected/private (depending on whether necessary for testing).
Removed now unused methods.

Note: FBUtilities.compareByteSubArrays() will throw an IllegalArgumentException if a length is passed in that extends past either array.
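
For readers who want to see the kind of bounds check that note refers to, here is a stand-alone Java sketch of a sub-array comparison. It is not the code from the patch: the class name, the parameter list, and the choice to compare bytes as unsigned values are assumptions.

```java
// Hypothetical sub-array comparison with an explicit bounds check.
final class ByteSubArraySketch {
    // Compares `length` bytes of each array starting at the given offsets.
    // Returns a negative number, zero, or a positive number.
    static int compareSubArrays(byte[] left, int leftOffset,
                                byte[] right, int rightOffset, int length) {
        if (length < 0 || leftOffset < 0 || rightOffset < 0
                || leftOffset + length > left.length
                || rightOffset + length > right.length) {
            throw new IllegalArgumentException(
                    "requested range extends past the end of an array");
        }
        for (int i = 0; i < length; i++) {
            // Mask to compare bytes as unsigned values (an assumption).
            int l = left[leftOffset + i] & 0xFF;
            int r = right[rightOffset + i] & 0xFF;
            if (l != r) {
                return l < r ? -1 : 1;
            }
        }
        return 0;
    }
}
```

Failing fast with an exception, rather than silently comparing a shorter range, makes a malformed context easier to spot.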



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796805#action_12796805 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

For the initial server-side reconciliation, we'll probably adopt an approach similar to the way custom ColumnFamily comparators are specified.

It might help to refer to:
http://wiki.apache.org/cassandra/StorageConfiguration

In particular, the section called, "Keyspaces and ColumnFamilies," where they discuss compareWith and how you can extend org.apache.cassandra.db.marshal.AbstractType to use your own ColumnFamily comparator.

For the initial implementation to get this out the door, we won't support a javascript API.  However, I'm open to suggestions.  It probably would be nice to support a high-level scripting language for certain components.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782286#action_12782286 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

Right now, I'm leaning towards client-side conflict resolution.

Basically, all updates are written out and conflict resolution is handled at read time.  The exception is a version in the Memtable that can be resolved syntactically.  However, this approach would require more copies of the data and a more complex API.  It would make the storage system more flexible for end users, though, since they wouldn't have to write server-side logic.  However, they would have to parse a list of conflicting versions and pass back a context/summary version vector of the merged conflict.

My reasoning is that Cassandra is write-optimized, so we should shift the burden to reads rather than writes.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782290#action_12782290 ] 

Jonathan Ellis commented on CASSANDRA-580:
------------------------------------------

As Stu points out, if the client has to resolve writes, you can no longer compact without involving the client.  This is a big big lose.  +1 pluggable server-side conflict resolution from me.

(This doesn't have to be complicated; just allow a class name to be specified per-CF like we do for CompareWith.)
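
A rough Java sketch of that idea: the configuration names a class, and the server instantiates it by reflection. Everything here is hypothetical (the ReconcilerSketch interface, the FirstValueReconciler example, and the loader); it only shows the general mechanism of loading a pluggable class by name, not how Cassandra wires up CompareWith.

```java
import java.util.List;

// Hypothetical plug-in point for server-side conflict resolution.
interface ReconcilerSketch {
    byte[] reconcile(List<byte[]> conflictingValues);
}

// Trivial example implementation: keep the first conflicting value.
final class FirstValueReconciler implements ReconcilerSketch {
    public byte[] reconcile(List<byte[]> conflictingValues) {
        return conflictingValues.get(0);
    }
}

// Instantiates a reconciler from a class name, e.g. one read from
// a per-ColumnFamily configuration attribute.
final class ReconcilerLoader {
    static ReconcilerSketch load(String className) throws ReflectiveOperationException {
        Class<? extends ReconcilerSketch> cls =
                Class.forName(className).asSubclass(ReconcilerSketch.class);
        return cls.getDeclaredConstructor().newInstance();
    }
}
```

Because the reconciler runs on the server, compaction can merge conflicting versions on its own, which is the point being made above.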

Also, I think you can make a good case that this is a better stylistic fit for Cassandra, which tries to support "dumb" clients more than Dynamo did.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment: 580-interface-2-add-vector-clock.diff

rename vector remove to vector_remove (for consistency).



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment: 580-thrift-v6.patch

Modify Deletion to use Clock, instead of i64 timestamp.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment: 580-context-v2.patch

minor modification:
added toString() method in IContext to create a human-readable string from a given context.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-context-v1.patch, 580-context-v2.patch, 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment:     (was: 580-context-v3.patch)

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7
>
>         Attachments: 580-context-v4.patch, 580-thrift-v3.patch, 580-thrift-v6.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782294#action_12782294 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

I'll take Stu and Jonathan's advice and restrict the scope of this ticket to server-side conflict resolution.

If client-side resolution is interesting, we can pursue it in a later ticket.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784638#action_12784638 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

Thanks Stu for the lead.

I studied the Interval Tree Clocks paper and it looks promising.  Let me work out the algorithm, so that we can investigate its implementation in Cassandra.

Having said the above, I think we should re-do the "clock" aspect of the interface and make it an opaque context object (like Dynamo).  We pass it out to clients on a read, and when they update a given value they pass back the context that they're updating.  I'm not sure if we want to extend the concept so far as to make existing timestamps just a special case of an opaque context, though.
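
The read/update round trip described above could look roughly like the sketch below. Every name here (OpaqueContextStore, ReadResult, read, write) is hypothetical and not taken from the patch, and the context encoding is a stand-in 4-byte counter rather than a real version vector. Rejecting a stale context is also a simplification: a Dynamo-style store would more likely keep both versions for later resolution.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory store; none of these names come from the patch. */
final class OpaqueContextStore {
    /** What a read hands back: the value plus a context the client must not interpret. */
    static final class ReadResult {
        final String value;
        final byte[] context;

        ReadResult(String value, byte[] context) {
            this.value = value;
            this.context = context;
        }
    }

    private final Map<String, ReadResult> data = new HashMap<>();

    /** Returns the current value and its context, or an empty context for a missing key. */
    ReadResult read(String key) {
        ReadResult result = data.get(key);
        return result != null ? result : new ReadResult(null, new byte[0]);
    }

    /**
     * Accepts the write only if the client passes back the context it read.
     * Assumption: the context is a 4-byte big-endian counter, purely for illustration.
     */
    boolean write(String key, String value, byte[] contextFromRead) {
        ReadResult current = read(key);
        if (!Arrays.equals(current.context, contextFromRead)) {
            return false; // stale context: the caller was updating an older version
        }
        int next = current.context.length == 0
                ? 1
                : ByteBuffer.wrap(current.context).getInt() + 1;
        data.put(key, new ReadResult(value, ByteBuffer.allocate(4).putInt(next).array()));
        return true;
    }
}
```

The client only ever stores and returns the byte[]; it never needs to know what is inside, which is what keeps the server free to change the encoding later.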

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment:     (was: 580-context-v1.patch)

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7
>
>         Attachments: 580-context-v4.patch, 580-thrift-v3.patch, 580-thrift-v6.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784639#action_12784639 ] 

Jonathan Ellis commented on CASSANDRA-580:
------------------------------------------

making it a byte[] context makes sense.

replacing the existing timestamps with that does not, since they are client-provided by design, which is the opposite of the VC context.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784498#action_12784498 ] 

Jonathan Ellis commented on CASSANDRA-580:
------------------------------------------

... but shouldn't that timestamp be internal to the server?  we definitely don't want the client to specify what amounts to an implementation detail, and i don't think there is any reason to send it to the client on a read, either.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Jaakko Laine (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796120#action_12796120 ] 

Jaakko Laine commented on CASSANDRA-580:
----------------------------------------

I read through the patch. Comments and questions follow: 

(1) Utility methods should be moved to FBUtilities. 

(2) compareByteSubArrays seems to have the following issues: (a) it does not handle the case where bytes2 is null and bytes1 is non-null; (b) it does not handle the case where the parameter 'length' is bigger than the byte array length. 

(3) Bytes are copied in manual for loops. Is there a reason for not using System.arraycopy? It would perhaps make the code slightly cleaner and faster (I don't know about the latter, though, as this involves copying elements within the same array). 

(4) Would it make sense to use some level of data abstraction? Sorting and comparisons always involve copying and handling individual bytes, which makes the code slightly cumbersome to read and might be less efficient than using object references (again, I'm not sure about this and have to do some research). 
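
To make points (2) and (3) concrete, here is a minimal sketch of how the comparison and the copy might look. The patch's actual signature is not shown in this thread, so the offset parameters, the choice to sort null before non-null, the unsigned byte ordering, and the class name ByteArraySketch are all assumptions. System.arraycopy itself is standard, and its documentation states that copying between overlapping regions of the same array behaves as if a temporary copy were made first.

```java
/** Hypothetical helper class; the signatures are assumptions, not the patch's code. */
final class ByteArraySketch {
    /**
     * Compares 'length' bytes of each array, starting at the given offsets.
     * Assumption: null sorts before any non-null array, and bytes compare as unsigned.
     */
    static int compareByteSubArrays(byte[] bytes1, int offset1,
                                    byte[] bytes2, int offset2, int length) {
        if (bytes1 == null) {
            return bytes2 == null ? 0 : -1;
        }
        if (bytes2 == null) {
            return 1; // the case noted in (2a)
        }
        // The case noted in (2b): reject ranges that run past either array.
        if (length < 0 || offset1 < 0 || offset2 < 0
                || offset1 > bytes1.length - length
                || offset2 > bytes2.length - length) {
            throw new IllegalArgumentException("sub-array range exceeds array bounds");
        }
        for (int i = 0; i < length; i++) {
            int b1 = bytes1[offset1 + i] & 0xff;
            int b2 = bytes2[offset2 + i] & 0xff;
            if (b1 != b2) {
                return b1 < b2 ? -1 : 1;
            }
        }
        return 0;
    }

    /** Shifts 'count' bytes starting at 'from' to the right by 'distance' within one array. */
    static void shiftRight(byte[] array, int from, int count, int distance) {
        System.arraycopy(array, from, array, from + distance, count);
    }
}
```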


> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-context-v1.patch, 580-context-v2.patch, 580-context-v3.patch, 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch, 580-thrift-v6.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment:     (was: 580-thrift-v5.patch)

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7
>
>         Attachments: 580-context-v4.patch, 580-thrift-v3.patch, 580-thrift-v6.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787107#action_12787107 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

Honestly, I'm starting to lean that way, as well.  If anything, we can make it a configuration setting, in the future.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Assigned: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa reassigned CASSANDRA-580:
-----------------------------------------

    Assignee: Kelvin Kakugawa

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782276#action_12782276 ] 

Stu Hood commented on CASSANDRA-580:
------------------------------------

One important difference between Cassandra and some other systems utilizing vector clocks is that in Cassandra, writes do not read from disk any old versions of the value that is being written to. This means that conflict resolution currently happens in 3 different places:
 1. At read time - Two nodes have different versions,
 2. At write time - The version being written is older than the version a node has in a Memtable,
 3. At compaction time - Versions of values persisted to different SSTables disagree.

NB: For the purposes of this ticket, I think that all resolution should be handled server side, deterministically, and that one of the following options should be implemented as part of a separate ticket.

But before too much progress is made, we will probably want to decide whether we want to support:
 a) Client side conflict resolution (logic implemented on the client side),
 b) Server side resolution (pluggable logic on the server),
 c) A hybrid (pluggable resolution, which can optionally send the versions to the client at resolution time #1 or #2)

If we decide to implement client-side resolution (a), then we will need to remove resolution at steps #2 and #3 (or make it optional for option (c)), and keep more copies of the data. For #3, a Memtable could store conflicting versions in memory until they are resolved by a read or flushed to disk. For #2, SSTables will need to be able to store multiple versions of a row/cf until they are resolved by a read.
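
One reason deterministic server-side resolution matters with three resolution points is that the same versions may be merged in different orders at read, write, and compaction time, so the result should not depend on order. The sketch below illustrates that property with a timestamp-based rule. The names (DeterministicResolverSketch, Version, resolve) are hypothetical, and breaking timestamp ties by comparing values is an assumption made here for illustration, not something stated in this ticket.

```java
/** Hypothetical order-independent resolver; not taken from the Cassandra code base. */
final class DeterministicResolverSketch {
    static final class Version {
        final long timestamp;
        final String value;

        Version(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Picks a winner that does not depend on argument order:
     * the newer timestamp wins, and ties go to the lexicographically larger value.
     */
    static Version resolve(Version a, Version b) {
        if (a.timestamp != b.timestamp) {
            return a.timestamp > b.timestamp ? a : b;
        }
        return a.value.compareTo(b.value) >= 0 ? a : b;
    }
}
```

Because the rule picks the maximum under a total order on (timestamp, value), resolving at read time, in a Memtable, or during compaction yields the same surviving value regardless of which pair is merged first.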

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793940#action_12793940 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

Nomenclature: When I use the terms superset/subset/disjoint, I really mean: dominates, dominated, neither dominated nor dominates.  However, it's easier for me to read the code w/ standard set terms.
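
As a rough illustration of that mapping, the sketch below compares two version vectors held as maps from node id to count. The class and enum names are hypothetical, and a missing node is treated as count 0, which is an assumption rather than something the patch confirms.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical comparison of two version vectors; names are illustrative only. */
final class DominanceSketch {
    enum Relation { EQUAL, SUPERSET, SUBSET, DISJOINT }

    /**
     * SUPERSET means 'left' dominates 'right', SUBSET means 'left' is dominated,
     * and DISJOINT means neither dominates (the versions are concurrent).
     */
    static Relation compare(Map<String, Integer> left, Map<String, Integer> right) {
        boolean leftAhead = false;
        boolean rightAhead = false;
        Set<String> nodes = new HashSet<>(left.keySet());
        nodes.addAll(right.keySet());
        for (String node : nodes) {
            int l = left.getOrDefault(node, 0);  // assumption: absent node counts as 0
            int r = right.getOrDefault(node, 0);
            if (l > r) {
                leftAhead = true;
            }
            if (r > l) {
                rightAhead = true;
            }
        }
        if (leftAhead && rightAhead) {
            return Relation.DISJOINT;
        }
        if (leftAhead) {
            return Relation.SUPERSET;
        }
        if (rightAhead) {
            return Relation.SUBSET;
        }
        return Relation.EQUAL;
    }
}
```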

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-context-v1.patch, 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment: 580-thrift-v5.patch

use binary for context

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment:     (was: 580-context-v2.patch)

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7
>
>         Attachments: 580-context-v4.patch, 580-thrift-v3.patch, 580-thrift-v6.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment: 580-context-v3.patch

minor modification:
add comments to reconcile()'s param and return value

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-context-v1.patch, 580-context-v2.patch, 580-context-v3.patch, 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787092#action_12787092 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

I've been talking w/ the authors of the interval tree clocks (ITC) paper about how to apply ITC to Cassandra, and it looks like we may need to modify the ITC algorithm for our use-case.

The crux of the matter is Cassandra's hinted hand-off feature.  The ITC algorithm composes an id-tree and event-tree to represent the version of a given value.  The id-tree is a nice way to create unique ids on-the-fly for any node (by splitting the id-tree, as necessary) and the event-tree represents causality.  However, the problem is that for a node to update the event-tree for a value, it has to be assigned a part of the id-tree beforehand.

A short example follows:
Suppose a node tries to forward a value but, because of a failure, has to store the value locally.  It wouldn't be able to update the version of the value unless it had been assigned a part of the id-tree beforehand from the set of nodes responsible for the value.

The authors have a couple of solutions:
1) Split the id-tree between all nodes in the cluster from the very start.  This solves the problem, but it mutes the attractive benefits of ITC over traditional version vectors, i.e. dynamically partitioning the id space at run-time, and only to the extent necessary, to conserve space.
2) On client reads, doing a "fork" instead of a "peek" and sharing the id-tree w/ the client.  However, this is a more complicated approach that may need to be worked out some more.

In any case, since we're using an opaque context, these decisions won't affect the interface.  However, it's an interesting implementation concern.  Depending on the average size of a Cassandra cluster, it may or may not be worth pre-forking the id-tree to all nodes from the very start.
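
For readers unfamiliar with the id-tree, the sketch below shows only the id split operation as I understand it from the ITC paper: an id is either a leaf (0 or 1) or a pair of sub-ids, and splitting hands each side a disjoint share. The class name IdTree and its layout are hypothetical. The event-tree, and therefore a full "fork", is deliberately left out.

```java
/** Hypothetical ITC id-tree with the split operation only; the event-tree is omitted. */
final class IdTree {
    static final IdTree ZERO = new IdTree(0, null, null);
    static final IdTree ONE = new IdTree(1, null, null);

    final int leafValue;      // meaningful only for leaves: 0 or 1
    final IdTree left, right; // both null for a leaf

    private IdTree(int leafValue, IdTree left, IdTree right) {
        this.leafValue = leafValue;
        this.left = left;
        this.right = right;
    }

    static IdTree node(IdTree left, IdTree right) {
        return new IdTree(-1, left, right);
    }

    boolean isLeaf() {
        return left == null;
    }

    boolean isZero() {
        return isLeaf() && leafValue == 0;
    }

    /** Splits an id into two ids that own disjoint parts of the original. */
    static IdTree[] split(IdTree id) {
        if (id.isLeaf()) {
            if (id.leafValue == 0) {
                return new IdTree[] { ZERO, ZERO };       // nothing to share
            }
            return new IdTree[] { node(ONE, ZERO), node(ZERO, ONE) };
        }
        if (id.left.isZero()) {                           // only the right half is owned
            IdTree[] parts = split(id.right);
            return new IdTree[] { node(ZERO, parts[0]), node(ZERO, parts[1]) };
        }
        if (id.right.isZero()) {                          // only the left half is owned
            IdTree[] parts = split(id.left);
            return new IdTree[] { node(parts[0], ZERO), node(parts[1], ZERO) };
        }
        return new IdTree[] { node(id.left, ZERO), node(ZERO, id.right) };
    }

    @Override
    public String toString() {
        return isLeaf() ? Integer.toString(leafValue) : "(" + left + "," + right + ")";
    }
}
```

Pre-forking for a whole cluster would amount to calling split repeatedly on the seed id 1 until every node holds a share, which is where the id space stops being partitioned only on demand.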

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784502#action_12784502 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

I think you're right about the timestamps not necessarily being useful to a client.  It shouldn't be part of the interface, since it's really an implementation detail to manage the size of the vector.  I'll make it internal to the server.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Issue Comment Edited: (CASSANDRA-580) vector clock support

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782276#action_12782276 ] 

Stu Hood edited comment on CASSANDRA-580 at 11/25/09 2:06 AM:
--------------------------------------------------------------

One important difference between Cassandra and some other systems utilizing vector clocks is that in Cassandra, writes do not read from disk any old versions of the value that is being written to. This means that conflict resolution currently happens in 3 different places:
 1. At read time - Two nodes have different versions,
 2. At write time - The version being written is older than the version a node has in a Memtable,
 3. At compaction time - Versions of values persisted to different SSTables disagree.

NB: For the purposes of this ticket, I think that all resolution should be handled server side, deterministically, and that one of the following options should be implemented as part of a separate ticket.

But before too much progress is made, we will probably want to decide whether we want to support:
 a) Client side conflict resolution (logic implemented on the client side),
 b) Server side resolution (pluggable logic on the server),
 c) A hybrid (pluggable resolution, which can optionally send the versions to the client at resolution time #1 or #2)

If we decide to implement client-side resolution (a), then we will need to remove resolution at steps #2 and #3 (or make it optional for option (c)), and keep more copies of the data. For #2, a Memtable could store conflicting versions in memory until they are resolved by a read or flushed to disk. For #3, SSTables will need to be able to store multiple versions of a row/cf until they are resolved by a read.

      was (Author: stuhood):
    One important difference between Cassandra and some other systems utilizing vector clocks is that in Cassandra, writes do not read from disk any old versions of the value that is being written to. This means that conflict resolution currently happens in 3 different places:
 1. At read time - Two nodes have different versions,
 2. At write time - The version being written is older than the version a node has in a Memtable,
 3. At compaction time - Versions of values persisted to different SSTables disagree.

NB: For the purposes of this ticket, I think that all resolution should be handled server side, deterministically, and that one of the following options should be implemented as part of a separate ticket.

But before too much progress is made, we will probably want to decide whether we want to support:
 a) Client side conflict resolution (logic implemented on the client side),
 b) Server side resolution (pluggable logic on the server),
 c) A hybrid (pluggable resolution, which can optionally sends the versions to the client at resolution time #1 or #2)

If we decide to implement client-side resolution (a), then we will need to remove resolution at steps #2 and #3 (or make it optional for option (c)), and keep more copies of the data. For #3, a Memtable could store conflicting versions in memory until they are resolved by a read or flushed to disk. For #2, SSTables will need to be able to store multiple versions of a row/cf until they are resolved by a read.
  
> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796413#action_12796413 ] 

Jonathan Ellis commented on CASSANDRA-580:
------------------------------------------

(It's okay to have helper methods that are only used in one class local to that class, but in that case they should be private.)

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-context-v1.patch, 580-context-v2.patch, 580-context-v3.patch, 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch, 580-thrift-v6.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Issue Comment Edited: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793938#action_12793938 ] 

Kelvin Kakugawa edited comment on CASSANDRA-580 at 12/23/09 5:55 AM:
---------------------------------------------------------------------

The patch adds a new package:
db/context

which contains:
IContext
VersionVectorContext

IContext is just a general interface to manipulate context byte[].  Basically, create(), update() and reconcile().

create(): create a new/empty context.
update(): update this context w/ the local node's id.
reconcile(): pass in a list of context-value pairs and it'll return a merged context (that supersedes all the passed in contexts) and a list of values that it couldn't automatically reconcile.

VersionVectorContext is a version vector implementation.

The VV format is a concatenated list of node id (IPv4's 4 bytes), count (int), and timestamp (long) tuples in a byte[].

create(): returns an empty byte[].
update(): will look for the local node's tuple in the byte[], increment its count, then prepend it to the front of the byte[] w/ an updated timestamp, so that the byte[] is always in timestamp-descending order.
reconcile(): looks for all disjoint (incompatible) VVs and collapses all VVs that are a subset of another VV in the list.  (implementation note: if 2 VVs are equal, but their values are not equivalent, both values will be added to the set of values that need to be manually reconciled.  It seems inefficient, though, so when I go through the rest of the system, I'm going to see if I can avoid this check, since it's a problem that can only happen on the local node.)

VersionVectorContext helper methods of interest:
compareContexts(): sorts contexts by id, then steps through both contexts to determine pairwise: equality, superset, subset, disjoint.
mergeContexts(): creates a map from node id to count-timestamp pairs, then creates a timestamp-sorted array and pulls off up to the max number of entries to form the new merged context.
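
A minimal sketch of the create()/update() behaviour described above, assuming each tuple is laid out as a 4-byte node id, a 4-byte int count, and an 8-byte long timestamp in big-endian order.  The class name, method signatures, and byte order are illustrative assumptions; they are not taken from the attached patch.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of a version vector stored as an opaque byte[].
final class VersionVectorUpdateSketch {
    static final int ID_LEN = 4;                  // IPv4 address bytes
    static final int TUPLE_LEN = ID_LEN + 4 + 8;  // id + int count + long timestamp

    // create(): an empty context is an empty byte[].
    static byte[] create() {
        return new byte[0];
    }

    // update(): bump the node's count and move its tuple to the front.
    static byte[] update(byte[] context, byte[] nodeId, long timestamp) {
        if (nodeId.length != ID_LEN || context.length % TUPLE_LEN != 0)
            throw new IllegalArgumentException("malformed node id or context");
        int found = indexOf(context, nodeId);
        int count = found < 0 ? 0 : ByteBuffer.wrap(context).getInt(found + ID_LEN);
        int newLen = found < 0 ? context.length + TUPLE_LEN : context.length;
        ByteBuffer out = ByteBuffer.allocate(newLen);
        out.put(nodeId).putInt(count + 1).putLong(timestamp);
        if (found < 0) {
            out.put(context);
        } else {
            // Copy everything except the node's old tuple.
            out.put(context, 0, found);
            out.put(context, found + TUPLE_LEN, context.length - found - TUPLE_LEN);
        }
        return out.array();
    }

    // Returns the node's count, or 0 if the node has no tuple yet.
    static int countOf(byte[] context, byte[] nodeId) {
        int found = indexOf(context, nodeId);
        return found < 0 ? 0 : ByteBuffer.wrap(context).getInt(found + ID_LEN);
    }

    // Byte offset of the node's tuple, or -1 if absent.
    private static int indexOf(byte[] context, byte[] nodeId) {
        for (int off = 0; off < context.length; off += TUPLE_LEN) {
            boolean same = true;
            for (int i = 0; i < ID_LEN; i++)
                if (context[off + i] != nodeId[i]) { same = false; break; }
            if (same) return off;
        }
        return -1;
    }
}
```

As long as callers pass non-decreasing timestamps, prepending keeps the tuples in timestamp-descending order without a separate sort.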


> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-context-v1.patch, 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793938#action_12793938 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

The patch adds a new package:
db/context

which contains:
IContext
VersionVectorContext

IContext is just a general interface to manipulate context byte[].  Basically, create(), update() and reconcile().

create(): create a new/empty context.
update(): update this context w/ the local node's id.
reconcile(): pass in a list of context-value pairs and it'll return a merged context (that supersedes all the passed in contexts) and a list of values that it couldn't automatically reconcile.

VersionVectorContext is a version vector implementation.

The VV format is a concatenated list of node id (IPv4's 4 bytes), count (int), and timestamp (long) tuples in a byte[].

create(): returns an empty byte[].
update(): will look for the local node's tuple in the byte[], increment its count, then prepend it to the front of the byte[] w/ an updated timestamp, so that the byte[] is always in timestamp-descending order.
reconcile(): looks for all disjoint (incompatible) VVs and collapses all VVs that are a subset of another VV in the list.  (implementation note: if 2 VVs are equal, but their values are not equivalent, both values will be added to the set of values that need to be manually reconciled.  It seems inefficient, though, so when I go through the rest of the system, I'm going to see if I can avoid this check, since it's a problem that can only happen on the local node.)

VersionVectorContext helper methods of interest:
compareContexts(): sorts contexts by id, then steps through both contexts to determine pairwise: equality, superset, subset, disjoint.
mergeContexts(): creates a map from node id to count-timestamp pairs, then creates a timestamp-sorted array and pulls off up to the max number of entries to form the new merged context.
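
The pairwise comparison in compareContexts() can be sketched roughly as follows.  For readability this sketch works on already-decoded {nodeId, count} pairs rather than the raw byte[]; the class name, the Relationship enum, and treating a missing node as "behind" are assumptions for illustration, not the patch's actual code.

```java
// Hypothetical sketch: classify two version vectors by stepping through
// both in node-id order.
final class CompareContextsSketch {
    enum Relationship { EQUAL, SUPERSET, SUBSET, DISJOINT }

    // Each entry is {nodeId, count}; counts are assumed to be >= 1.
    static Relationship compare(int[][] left, int[][] right) {
        int[][] l = sortedById(left);
        int[][] r = sortedById(right);
        boolean leftAhead = false;   // left has seen an update right has not
        boolean rightAhead = false;  // right has seen an update left has not
        int i = 0, j = 0;
        while (i < l.length && j < r.length) {
            int cmp = Integer.compare(l[i][0], r[j][0]);
            if (cmp == 0) {
                if (l[i][1] > r[j][1]) leftAhead = true;
                else if (l[i][1] < r[j][1]) rightAhead = true;
                i++;
                j++;
            } else if (cmp < 0) {
                leftAhead = true;    // node only present on the left
                i++;
            } else {
                rightAhead = true;   // node only present on the right
                j++;
            }
        }
        if (i < l.length) leftAhead = true;
        if (j < r.length) rightAhead = true;

        if (leftAhead && rightAhead) return Relationship.DISJOINT;
        if (leftAhead) return Relationship.SUPERSET;
        if (rightAhead) return Relationship.SUBSET;
        return Relationship.EQUAL;
    }

    // Sort a copy so the caller's array is left untouched.
    private static int[][] sortedById(int[][] entries) {
        int[][] copy = entries.clone();
        java.util.Arrays.sort(copy, (x, y) -> Integer.compare(x[0], y[0]));
        return copy;
    }
}
```

Under this classification, reconcile() would drop any context that is a SUBSET of another and keep the values of DISJOINT contexts for manual resolution.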


> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-context-v1.patch, 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-580:
--------------------------------------

    Attachment: 580-context-v1.patch

first-pass at version vector context.

comments appreciated.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-context-v1.patch, 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785639#action_12785639 ] 

Jonathan Ellis commented on CASSANDRA-580:
------------------------------------------

just use "binary" instead of list<byte>

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803851#action_12803851 ] 

Jonathan Ellis commented on CASSANDRA-580:
------------------------------------------

kelvin, can you r/m patches from this that are of historical interest only, for the benefit of those who want to see what the current (or at least, most-recently-posted) patchset is?

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7
>
>         Attachments: 580-context-v1.patch, 580-context-v2.patch, 580-context-v3.patch, 580-context-v4.patch, 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch, 580-thrift-v6.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-580:
-------------------------------------

    Attachment: 580-thrift-v3.patch

The reason I switched to favoring server-side-only is because it lets us get by with much smaller API changes, as in the attached.

Also, why do we need a timestamp in LogicalClock when we have the counter?  I thought the whole point was to get away from the problems posed by timestamps.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793941#action_12793941 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

re: the node id format.
If Cassandra supports IPv6, then it might be advantageous to just use the IPv4 portion of the address, if that's possible.  Otherwise, IPv6's 16 bytes per node id is kind of rough.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-context-v1.patch, 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Updated: (CASSANDRA-580) vector clock support

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-580:
-------------------------------------

    Fix Version/s: 0.7

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>             Fix For: 0.7
>
>         Attachments: 580-context-v1.patch, 580-context-v2.patch, 580-context-v3.patch, 580-context-v4.patch, 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch, 580-thrift-v6.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794444#action_12794444 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

potential incompatibility w/ IColumn and IColumnContainer interfaces.

The key issue is that Column and SuperColumn rely on timestamps to indicate when deletion should occur.  In particular, through these methods:
getMarkedForDeleteAt()
mostRecentLiveChangeAt()

A separate, less interesting, issue is that SuperColumn auto-assumes Column for its sub-column serializer.

Right now, I've put together a new class, VersionColumn, that's context-based.  I've added context() methods to IColumn and added some instanceof checks.  However, I'm still exploring the implementation for the delete logic.

atm, I'm trying to avoid having to add a SuperVersionColumn class.  However, what is an interesting way to define the version "context" of a SuperColumn?  An aggregated context of all the sub-columns may be worth exploring.  Alternatively, since all the updates are timestamped, we could use that as a rough approximation for mostRecentLiveChangeAt(), but that seems to break the spirit of versioned contexts.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-context-v1.patch, 580-context-v2.patch, 580-context-v3.patch, 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787100#action_12787100 ] 

Jonathan Ellis commented on CASSANDRA-580:
------------------------------------------

If it's going to be complex -- and it sounds like it is :) -- I'd be inclined to prefer the "timestamped vector clock, with truncation" approach.  (And since it's opaque to the client this can be changed later if we determine that the truncation actually is a problem in practice.)
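
For reference, a rough sketch of what "timestamped vector clock, with truncation" could mean when merging: keep the highest count per node, order the entries newest-first, and drop the oldest beyond a cap.  The Entry class, the tie-breaking rule, and the maxEntries parameter are assumptions made for illustration, not the patch's actual code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of merging timestamped version vectors with truncation.
final class MergeContextsSketch {
    static final class Entry {
        final int nodeId;
        final int count;
        final long timestamp;

        Entry(int nodeId, int count, long timestamp) {
            this.nodeId = nodeId;
            this.count = count;
            this.timestamp = timestamp;
        }
    }

    static List<Entry> merge(List<List<Entry>> contexts, int maxEntries) {
        // Map from node id to the winning (count, timestamp) entry.
        Map<Integer, Entry> byNode = new HashMap<>();
        for (List<Entry> context : contexts) {
            for (Entry e : context) {
                Entry cur = byNode.get(e.nodeId);
                // Assumed rule: higher count wins; on a tie, the later timestamp.
                if (cur == null || e.count > cur.count
                        || (e.count == cur.count && e.timestamp > cur.timestamp)) {
                    byNode.put(e.nodeId, e);
                }
            }
        }
        // Newest first, then truncate to at most maxEntries.
        List<Entry> merged = new ArrayList<>(byNode.values());
        merged.sort((a, b) -> Long.compare(b.timestamp, a.timestamp));
        return new ArrayList<>(merged.subList(0, Math.min(maxEntries, merged.size())));
    }
}
```

Truncation is lossy: once a node's entry is dropped, a later comparison can no longer tell whether that node's updates were seen, which is the trade-off being accepted here.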

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782256#action_12782256 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

Thanks Mateusz, I believe you are right that we want the version vector variant.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Kelvin Kakugawa (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796405#action_12796405 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

Thanks for your comments.

(1) I thought about putting the utility methods in FBUtilities.  However, I left them in VVC for this pass.  I can move them over.

(2) You're right about cBSA.  I didn't implement the cases that were already handled by VVC.  However, if I move it to FBUtilities, I'll look for all the missing cases.

(3) Thanks for the heads up.  I'll look into replacing those loops w/ arraycopy.

(4) My goal is to keep the context an opaque array, because I want to be able to support other version implementations, e.g. interval tree clocks, which have a different format.  So, if I wanted to use an object representation, VVC would have to internally inflate the opaque context.  However, the manual byte manipulation isn't as easy to read as an object-based implementation, and this was a concern of mine.
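
As a small illustration of point (3), a byte-by-byte copy loop can be replaced with System.arraycopy.  The helper below is a hypothetical example of prepending one tuple to a context; its name and signature are not from the patch.

```java
// Hypothetical helper: prepend a tuple to a context using bulk copies
// instead of hand-written byte loops.
final class TupleCopySketch {
    static byte[] prependTuple(byte[] tuple, byte[] context) {
        byte[] out = new byte[tuple.length + context.length];
        // arraycopy(src, srcPos, dest, destPos, length)
        System.arraycopy(tuple, 0, out, 0, tuple.length);
        System.arraycopy(context, 0, out, tuple.length, context.length);
        return out;
    }
}
```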

Right now, for the reconcile() method, I'm probably going to modify its interface.  Instead of using a List of Pair objects, I have a new IColumn impl, VectorColumn, that would probably be more appropriate than Pair.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-context-v1.patch, 580-context-v2.patch, 580-context-v3.patch, 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch, 580-thrift-v6.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782295#action_12782295 ] 

Stu Hood commented on CASSANDRA-580:
------------------------------------

> if the client has to resolve writes, you can no longer compact without involving the client.
Kind of: you can still compact SSTables, but you need to keep all conflicting versions, which is why options (a) and (c) are still feasible.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Mateusz Berezecki (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782197#action_12782197 ] 

Mateusz Berezecki commented on CASSANDRA-580:
---------------------------------------------

you probably want this: http://en.wikipedia.org/wiki/Version_vector

instead of : http://en.wikipedia.org/wiki/Vector_clock

see the section on different update rules.

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785696#action_12785696 ] 

Jonathan Ellis commented on CASSANDRA-580:
------------------------------------------

LGTM

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-interface-1-add-vector-clock.diff, 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.



[jira] Commented: (CASSANDRA-580) vector clock support

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782296#action_12782296 ] 

Jonathan Ellis commented on CASSANDRA-580:
------------------------------------------

Yes, that's how you'd have to change it.  I'd rather not; it would get messy.

(If we do server-side resolution we could still conceivably support both "classic" columnfamilies and vector clocked ones in the same Column and SuperColumn objects, just differing in their clock/timestamp field.  But if we have to potentially store multiple versions of a column in a single row for the vector clock version, then I think that is diverging too far and we'd have to split the implementation.  Remember that a classic ColumnFamily object just has a hash of its columns by name.)

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long timestamps.  Purpose: enable incr/decr; flexible conflict resolution.
