You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2015/06/14 11:19:01 UTC

[jira] [Commented] (CASSANDRA-9471) Columns should be backed by a BTree, not an array

    [ https://issues.apache.org/jira/browse/CASSANDRA-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585005#comment-14585005 ] 

Benedict commented on CASSANDRA-9471:
-------------------------------------

I've pushed a branch [here|https://github.com/belliottsmith/cassandra/tree/9471]

This patch does quite a few things, since I thought we had better get them out of the way sooner than later:

# Introduces "indexing" to the BTree class, permitting lookup by positional index, and binarySearch semantics (so inequality lookups yield the position a binarySearch of the equivalent array would)
#* The size modulo remainder of a Leaf/Branch are flipped, so that a Branch has a spare slot at the end for an int[] index:
#** if missing, this occupies at most 1 bit per entry in the set (often 1/32), if present, it occupies at most 2.5 bits, and typically around 1 bit. Sets with <= 32 items incur no cost.
#* This index just contains the cumulative size of the tree preceding each branch key, rooted at the node in question
# Reintroduces BTreeSet, with extra methods for accessing by index\*, and simple implementations of missing NavigableMap methods using this new functionality
# Introduces a simple BTreeSet.Builder, and replaces all suitable uses of TreeMap in the codebase with a BTreeSet.Builder -> BTreeSet
# Rewrites btree.Cursor to make it far easier to understand, and to introduced IndexedSearchIterator, exposing the new indexing
# Reintroduces LongBTreeTest, with cleaned up more exhaustive coverage of iterator functionality, and new NavigableMap methods

It's more code than I expected, as I didn't plan on rewriting btree.Cursor, but it was frankly a bit of a mess. 8099 also extends it to cover descending ranges, but with no test coverage, and nor did I want to reason about (my own!) code when inverted in this way to corroborate its behaviour. The new implementation is far more declarative, and I hope should be easy for just about anybody to read and review, and crucially to extend/modify in future.

The Cursor rewrite was also sparked by the introduction of IndexedSearchIterator, which was designed to drop into SimpleRowDataBlock.CellWriter. Unfortunately, the comments here _suggesting_ columns would be provided in-order was optimistic. It seems that the call-sites would need to at least specify if this is the case. This means that the Cursor rewrite is not helpful for this _yet_, but hopefully will be soon, and I think it was warranted by our dependence on new functionality and by its existing ugliness.

It is possible we could split this ticket into (1, 5.1); (2, 5.2); 3; (4, 5.3), but this all seemed naturally grouped together, and given it's a follow on to 8099...

I was hoping [~blambov] could review the changes to BTree, BTreeSet and Cursor in particular? If [~slebresne] or [~blerer] could review the more modest places it touches the 8099 code that would be great.

\*A simple follow up to this ticket would be to make BTreeSet implement both {{Set}} _and_ {{List}}

> Columns should be backed by a BTree, not an array
> -------------------------------------------------
>
>                 Key: CASSANDRA-9471
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9471
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>             Fix For: 3.0 beta 1
>
>
> Follow up to 8099. 
> We have pretty terrible lookup performance as the number of columns grows (linear). In at least one location, this results in quadratic performance. 
> We don't however want this structure to be either any more expensive to build, nor to store. Some small modifications to BTree will permit it to serve here, by permitting efficient lookup by index, and calculation _of_ index for a given key.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)