Posted to commits@cassandra.apache.org by "C. Scott Andreas (JIRA)" <ji...@apache.org> on 2018/11/18 18:11:00 UTC

[jira] [Updated] (CASSANDRA-9796) Give 8099's like treatment to partition keys

     [ https://issues.apache.org/jira/browse/CASSANDRA-9796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

C. Scott Andreas updated CASSANDRA-9796:
----------------------------------------
    Component/s: Local Write-Read Paths

> Give 8099's like treatment to partition keys
> --------------------------------------------
>
>                 Key: CASSANDRA-9796
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9796
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths
>            Reporter: Sylvain Lebresne
>            Priority: Major
>             Fix For: 4.x
>
>
> Post-8099, we properly distinguish clustering columns at the engine level, which allows a somewhat more efficient encoding: we don't write the size of values of fixed-width types, and we can properly store null values (which will likely prove useful for CASSANDRA-6477, for instance).
> Partition keys, however, have had no such love: the storage engine still manipulates them as a single blob, and their encoding is not terribly efficient. We always store the size of every value (even fixed-width ones), and for compound values we even store the size of the full partition key, even though it's redundant with the individual value sizes. The encoding also doesn't allow nulls, which is inconvenient at least for CASSANDRA-6477.
> So I'd like to improve on this by:
> # making the {{DecoratedKey}} API (which I'd personally rename to {{PartitionKey}}) expose the fact that we can have more than one value, typically by adding {{size()}} and {{get(i)}} methods like those of {{Clustering}}.  In particular, this would simplify a couple of places in the code where we still manually decompose such values.
> # improving their encoding. An easy/consistent solution would be to reuse the same encoding as for {{Clustering}} (they are the same kind of beast), though I'm open to other options.
> One small subtlety to be aware of is that whatever we do to the internal encoding/implementation, we must make sure we still compute the same tokens.  That shouldn't be particularly hard, though.
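
To make the two proposals above concrete, here is a minimal Java sketch. Everything in it is an assumption for illustration, not Cassandra's actual code: the {{PartitionKey}} interface only mirrors the {{size()}}/{{get(i)}} idea from the description, {{encodeBlobStyle}} models a layout where every component carries a 2-byte length plus a trailing byte, and {{encodeCompact}} is one possible layout that omits the length for fixed-width components. Null handling and the outer full-key length are left out.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

final class PartitionKeySketch {

    /** Hypothetical multi-value view of a partition key (not an existing Cassandra interface). */
    interface PartitionKey {
        int size();
        ByteBuffer get(int i);
    }

    /** Builds a key from its component values; each get(i) returns an independent view. */
    static PartitionKey of(ByteBuffer... values) {
        final List<ByteBuffer> components = Arrays.asList(values.clone());
        return new PartitionKey() {
            public int size() { return components.size(); }
            public ByteBuffer get(int i) { return components.get(i).duplicate(); }
        };
    }

    /** Assumed blob-style layout: every component carries a 2-byte length and a trailing byte. */
    static ByteBuffer encodeBlobStyle(PartitionKey key) {
        int total = 0;
        for (int i = 0; i < key.size(); i++)
            total += 2 + key.get(i).remaining() + 1;
        ByteBuffer out = ByteBuffer.allocate(total);
        for (int i = 0; i < key.size(); i++) {
            ByteBuffer v = key.get(i);
            out.putShort((short) v.remaining());
            out.put(v);
            out.put((byte) 0);
        }
        out.flip();
        return out;
    }

    /**
     * Illustrative compact layout: fixedWidths[i] is the byte width of a fixed-width
     * component, or -1 for a variable-width one, which keeps its 2-byte length.
     */
    static ByteBuffer encodeCompact(PartitionKey key, int[] fixedWidths) {
        int total = 0;
        for (int i = 0; i < key.size(); i++) {
            int len = key.get(i).remaining();
            if (fixedWidths[i] >= 0 && fixedWidths[i] != len)
                throw new IllegalArgumentException("component " + i + " has unexpected width " + len);
            total += (fixedWidths[i] >= 0 ? 0 : 2) + len;
        }
        ByteBuffer out = ByteBuffer.allocate(total);
        for (int i = 0; i < key.size(); i++) {
            ByteBuffer v = key.get(i);
            if (fixedWidths[i] < 0)
                out.putShort((short) v.remaining());
            out.put(v);
        }
        out.flip();
        return out;
    }

    public static void main(String[] args) {
        // A two-component key: a 4-byte int followed by a 3-byte text value.
        PartitionKey key = of(ByteBuffer.allocate(4).putInt(0, 42),
                              ByteBuffer.wrap(new byte[] { 'a', 'b', 'c' }));
        System.out.println(encodeBlobStyle(key).remaining());                    // 13
        System.out.println(encodeCompact(key, new int[] { 4, -1 }).remaining()); // 9
    }
}
```

In this sketch the compact form saves the per-component overhead for the fixed-width int. To keep tokens stable, one option (again an assumption) would be to keep hashing the blob-style bytes regardless of which layout is stored.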



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
