You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2015/01/11 22:18:36 UTC

[jira] [Commented] (CASSANDRA-7282) Faster Memtable map

    [ https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273063#comment-14273063 ] 

Benedict commented on CASSANDRA-7282:
-------------------------------------

Just a note for the future (I haven't yet addressed your comments): It may be we want to harden the data structure against determined attackers that produce a lot of colliding tokens with different partition keys. 

It's not a terribly complex modification to make this a skip-list, instead of a linked-list (since it is add-only, maintaining a skip-list is just applying the current linked-list logic N times, and the majority of complexity here is for the hash index). The hash index would remain an _optimisation_ for expected constant time access into a logarithmic structure, instead of a linear one, giving us best- and expected- case constant time, worst case logarithmic. The skip-list index levels could perhaps be dropped as the hash index extends to overlap them, although it may be cheaper to encode them all in the nodes themselves. This should probably be a follow up ticket, but I wanted to log the thought as it occurred to me.

> Faster Memtable map
> -------------------
>
>                 Key: CASSANDRA-7282
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: jasobrown-sample-run.txt, profile.yaml, reads.svg, run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in our memtables. Maintaining this is an O(lg(n)) operation; since the vast majority of users use a hash partitioner, it occurs to me we could maintain a hybrid ordered list / hash map. The list would impose the normal order on the collection, but a hash index would live alongside as part of the same data structure, simply mapping into the list and permitting O(1) lookups and inserts.
> I've chosen to implement this initial version as a linked-list node per item, but we can optimise this in future by storing fatter nodes that permit a cache-line's worth of hashes to be checked at once,  further reducing the constant factor costs for lookups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)