Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2009/03/12 00:10:53 UTC

[jira] Created: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Allow non-hash-based partitioning schemes to allow truly order-preserving storage
---------------------------------------------------------------------------------

                 Key: CASSANDRA-3
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3
             Project: Cassandra
          Issue Type: New Feature
            Reporter: Jonathan Ellis
         Attachments: partition-1.patch, partition-2.patch

An order-preserving hash has too many limitations to be useful in production where key lengths tend to have low variance.  We need to make Cassandra more flexible and define a partitioner as responsible for String -> EndPoint instead of String -> BigInteger.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689182#action_12689182 ] 

Jun Rao commented on CASSANDRA-3:
---------------------------------

Jonathan,

Do you imagine that one can implement some sort of range partitioning using your patch? The thing with range partitioning is that the partitioner has to keep some state. I am not sure how to ensure that the partitioners on different nodes maintain the same state. Do you require the partitioner to be stateless?


[jira] Commented: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689319#action_12689319 ] 

Jonathan Ellis commented on CASSANDRA-3:
----------------------------------------

Well, in the most generic sense, OrderPreservingPartitioner already implements range partitioning.  Each node gets a String as a token and all keys between its token and the next on the lexicographical ring get stored on that node.  But I'm not sure if that answers your question. :)
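
A minimal sketch of the lookup described above, for illustration only. The class and method names (StringTokenRing, addNode, nodeFor) are hypothetical and not taken from the patch, and treating each node's range as starting at its own token (inclusive) and wrapping around at the end of the ring is an assumption.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; these names do not come from the Cassandra source.
final class StringTokenRing {
    // Maps each node's String token to the node's name, sorted lexicographically.
    private final TreeMap<String, String> tokenToNode = new TreeMap<>();

    void addNode(String token, String node) {
        tokenToNode.put(token, node);
    }

    // Returns the node whose token is the greatest token <= key.
    // Keys that sort before every token wrap around to the node with the largest token.
    String nodeFor(String key) {
        if (tokenToNode.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<String, String> owner = tokenToNode.floorEntry(key);
        if (owner == null) {
            owner = tokenToNode.lastEntry();
        }
        return owner.getValue();
    }
}
```

With tokens "g", "n" and "t", a key such as "hello" lands on the node holding "g", and a key such as "apple" wraps around to the node holding "t".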


[jira] Updated: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3:
-----------------------------------

    Attachment: partition-3.patch

partition-3 consolidates partition behavior in IPartitioner, so creating a new partitioner should be only a matter of implementing that interface.  All the external switch statements on PartitionerType have been folded into it.

SSTable is now the only part of the code that cares about the distinction between a "raw" key and a "decorated" key.  Variables in that class have been named clientKey or decoratedKey to show which is which.  Other classes don't care, either because they only deal with decorated keys (SequenceFile) or only with client keys (everyone else).  As part of this, I've merged some overloaded methods with substantially duplicated code to simplify auditing these changes.
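
To make the raw/decorated distinction concrete, here is a rough sketch. Everything in it is an assumption for illustration: the interface name KeyDecorator, the methods decorateKey and undecorateKey, and the "hash:key" layout are hypothetical and are not claimed to match what IPartitioner in the patch actually looks like.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical interface: converts between the key the client supplies
// and the form that is written to disk.
interface KeyDecorator {
    String decorateKey(String clientKey);
    String undecorateKey(String decoratedKey);
}

// Assumed hash-based scheme: prefix the client key with its MD5 value so that
// decorated keys sort by hash rather than by the client's ordering.
final class HashPrefixDecorator implements KeyDecorator {
    private static final char SEPARATOR = ':';

    public String decorateKey(String clientKey) {
        return md5(clientKey).toString() + SEPARATOR + clientKey;
    }

    public String undecorateKey(String decoratedKey) {
        // The hash prefix contains only decimal digits, so the first
        // separator always marks the end of the prefix.
        return decoratedKey.substring(decoratedKey.indexOf(SEPARATOR) + 1);
    }

    private static BigInteger md5(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return new BigInteger(1, md.digest(s.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}

// Assumed order-preserving scheme: the decorated key is the client key itself.
final class IdentityDecorator implements KeyDecorator {
    public String decorateKey(String clientKey) { return clientKey; }
    public String undecorateKey(String decoratedKey) { return decoratedKey; }
}
```

In a design like this, only the storage layer would call decorateKey and undecorateKey; callers above it would keep working with client keys.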


[jira] Updated: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3:
-----------------------------------

    Attachment: partition-2.patch

partition-2 removes unused BigInteger imports.


[jira] Resolved: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-3.
------------------------------------

    Resolution: Fixed

committed


[jira] Updated: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3:
-----------------------------------

    Attachment: partition-4.patch

Remove unused code dealing with Ranges and tokens, to make generalizing that code easier.


[jira] Commented: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689563#action_12689563 ] 

Jonathan Ellis commented on CASSANDRA-3:
----------------------------------------

Right.  Neither does the random partitioner.  In both cases, IMO, the "right" solution is to adjust the token ranges through load balancing, which is a separate feature (on my to-do list, but not as high a priority; on FB's list too, I think).


[jira] Reopened: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reopened CASSANDRA-3:
------------------------------------


Avinash thinks he has a better solution involving a working OPHF (order-preserving hash function).  This code has been reverted pending that.


[jira] Updated: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3:
-----------------------------------

    Attachment: partition-5.patch

Migrate from BigInteger to an abstract Token, with BigIntegerToken and StringToken subclasses controlled by the Random and OrderPreserving partitioners, respectively.
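
One way such a hierarchy could look, as a sketch only. The class names Token, BigIntegerToken and StringToken come from the description above; the generic parameter, the fields and the comparison logic are assumptions and may differ from the patch.

```java
import java.math.BigInteger;

// Assumed shape: a token wraps a comparable value, so ring code can order
// tokens without knowing which partitioner produced them.
abstract class Token<T extends Comparable<T>> implements Comparable<Token<T>> {
    protected final T value;

    protected Token(T value) {
        this.value = value;
    }

    @Override
    public int compareTo(Token<T> other) {
        return value.compareTo(other.value);
    }

    @Override
    public String toString() {
        return value.toString();
    }
}

// Token type a hash-based (random) partitioner would hand out; ordered numerically.
final class BigIntegerToken extends Token<BigInteger> {
    BigIntegerToken(BigInteger value) {
        super(value);
    }
}

// Token type an order-preserving partitioner would hand out; ordered lexicographically.
final class StringToken extends Token<String> {
    StringToken(String value) {
        super(value);
    }
}
```

The two subclasses order their values differently: as StringTokens, "10" sorts before "2", while as BigIntegerTokens, 10 sorts after 2. The type parameter also keeps the compiler from comparing a StringToken with a BigIntegerToken.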


[jira] Updated: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3:
-----------------------------------

    Attachment: partition-6.patch

Fix NPE


[jira] Updated: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3:
-----------------------------------

    Attachment: partition-1.patch

partition-1 removes unused code dealing with tokens and hashes to make the upgrade simpler.


[jira] Assigned: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-3:
--------------------------------------

    Assignee: Jonathan Ellis


[jira] Commented: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689521#action_12689521 ] 

Jun Rao commented on CASSANDRA-3:
---------------------------------

The problem with OrderPreservingPartitioner is that it doesn't guarantee even distribution for arbitrary keys.
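
A small, hypothetical illustration of that concern. SkewDemo and countPerToken are made-up names, and the ownership rule (a token owns keys from itself up to the next token, wrapping around) is an assumption rather than something taken from the patch.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts how many keys fall to each String token when a
// token owns every key from itself (inclusive) up to the next token.
final class SkewDemo {
    static Map<String, Integer> countPerToken(String[] tokens, String[] keys) {
        if (tokens.length == 0) {
            throw new IllegalArgumentException("at least one token is required");
        }
        TreeMap<String, Integer> counts = new TreeMap<>();
        for (String token : tokens) {
            counts.put(token, 0);
        }
        for (String key : keys) {
            String owner = counts.floorKey(key);
            if (owner == null) {
                owner = counts.lastKey(); // wrap around the ring
            }
            counts.merge(owner, 1, Integer::sum);
        }
        return counts;
    }
}
```

With tokens "a", "h" and "p" and keys that all start with "user", every key lands on the token "p" and the other two tokens receive nothing.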


[jira] Resolved: (CASSANDRA-3) Allow non-hash-based partitioning schemes to allow truly order-preserving storage

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-3.
------------------------------------

    Resolution: Duplicate

This was resubmitted in pieces for review, culminating in CASSANDRA-71.
