Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2012/12/10 17:25:21 UTC

[jira] [Commented] (CASSANDRA-5021) Full partition/cell index integration

    [ https://issues.apache.org/jira/browse/CASSANDRA-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528041#comment-13528041 ] 

Sylvain Lebresne commented on CASSANDRA-5021:
---------------------------------------------

For what it's worth, I advise that before getting to this we start with CASSANDRA-4478. Whatever the exact implementation of this ticket is, it will break the current assumption that the in-memory index summary is composed of fixed intervals of keys. So the difficulties (related to key estimates) discussed in CASSANDRA-4478 will be the same, and doing that ticket first won't be a waste of time imo.

That aside, I think this ticket involves:
* Merging RowIndexEntry and IndexHelper. I see two options: either the resulting entry looks like:
   {{(start key, start cell name) -> (end key, end cell name)}}
 which has the benefit that for skinny rows, we can have more than one row per index entry. But I suspect this would complicate the implementation quite a bit, because it would mean the index wouldn't contain all row keys, and I think this would require more changes (for row iteration, scrubs, ...). Otherwise, we can keep the constraint that an entry can't hold more than one key and have an index entry be:
   {{key + (start cell name -> end cell name)}}
* Merging ColumnIndex.Builder and SSTableWriter.IndexWriter to write the new merged index entries.
* SSTableReader.getPosition() will need to be changed to take the start cell name as argument, not just the key, and return a new style index entry.
* The consumers of SSTableReader.getPosition() will need to be updated accordingly.
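
To make the second entry layout a bit more concrete, here is a rough sketch of what a lookup over {{key + (start cell name -> end cell name)}} entries could look like. This is only an illustration under simplifying assumptions, not a proposal for the actual code: keys and cell names are plain Strings compared lexicographically (the real code would use decorated keys and the column family comparator), the whole index is held in a sorted in-memory list, and the names MergedIndexSketch, Entry and this getPosition signature are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class MergedIndexSketch {
    /** Hypothetical merged entry: key + (start cell name -> end cell name), plus a data file offset. */
    static final class Entry {
        final String key;
        final String startCell;
        final String endCell;
        final long position;

        Entry(String key, String startCell, String endCell, long position) {
            this.key = key;
            this.startCell = startCell;
            this.endCell = endCell;
            this.position = position;
        }
    }

    /** Orders (key, cell name) pairs by key first, then by cell name. */
    static int compare(String key1, String cell1, String key2, String cell2) {
        int c = key1.compareTo(key2);
        return c != 0 ? c : cell1.compareTo(cell2);
    }

    /**
     * Single binary search over entries sorted by (key, startCell).
     * Returns the position of the block covering (key, cell), or -1 if no block covers it.
     */
    static long getPosition(List<Entry> index, String key, String cell) {
        int lo = 0;
        int hi = index.size() - 1;
        int found = -1;
        // Find the last entry whose (key, startCell) is <= (key, cell).
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            Entry e = index.get(mid);
            if (compare(e.key, e.startCell, key, cell) <= 0) {
                found = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (found < 0)
            return -1;
        Entry e = index.get(found);
        // The candidate must belong to the same key and its range must reach the cell.
        if (!e.key.equals(key) || cell.compareTo(e.endCell) > 0)
            return -1;
        return e.position;
    }

    public static void main(String[] args) {
        List<Entry> index = new ArrayList<>();
        index.add(new Entry("k1", "a", "f", 0L));
        index.add(new Entry("k1", "g", "m", 100L));
        index.add(new Entry("k2", "a", "c", 200L));
        System.out.println(getPosition(index, "k1", "h")); // prints 100
    }
}
```

Note that the lookup takes both the key and the start cell name, which is the getPosition() signature change mentioned above: the key alone no longer identifies a single index entry.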

I do have to note that it's unclear to me what becomes of the key cache if we do this. Indeed, an index entry position will be defined by a key and the start cell name of the index block. But that has almost no cacheability: a request's start has little chance of coinciding with an index block start. I don't think that's a detail, and I'd suggest considering this before jumping into implementing this ticket.

                
> Full partition/cell index integration
> -------------------------------------
>
>                 Key: CASSANDRA-5021
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5021
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>             Fix For: 2.0
>
>
> CASSANDRA-2319 pulled the row's (partition's) index of cells into the -Index component, but it's still treated separately.  That is: on a read, we first bsearch the partition key samples, then we read the Index component to find the exact partition key, then we deserialize the cell samples and bsearch those.
> "deserialize the cell samples" grows linearly with partition size and can seriously impact query time as it grows past millions of cells to 10s and 100s of millions.
> If we merged the cell index with the partition's, we could do a single bsearch/read step that would scale with log(N).
