Posted to commits@cassandra.apache.org by "Alex Petrov (JIRA)" <ji...@apache.org> on 2016/06/26 15:15:33 UTC
[jira] [Commented] (CASSANDRA-11990) Address rows rather than partitions in SASI

    [ https://issues.apache.org/jira/browse/CASSANDRA-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350152#comment-15350152 ] 

Alex Petrov commented on CASSANDRA-11990:
-----------------------------------------

I've done a small investigation into what this would take and talked to several people about potential scenarios. First of all, I should note that it will be a rather big change, since it requires a new format for writing the SASI {{TokenTree}}. The steps that would need to be taken are: 

  * tl;dr version: we have to extend {{TokenTree}} to fit a row offset along with the partition key. More elaborate version: currently the SASI {{TokenTree}} is a highly optimised tree that encodes {{long token}}/{{short + int}} entries. Since the max offset size does not exceed 48 bits ({{short + int}}), several further optimisation steps are applied to improve read performance and reduce storage overhead. A short description of the current format can be found [here|https://gist.github.com/ifesdjeen/0436faf9a66b401ace0ad0947d256317]. Since we'll have to hold two offsets (partition offset and row offset; the partition offset is required to read the PK, static rows etc.), in the first step, for the proof of concept, we'll reduce the number of distinct cases to the two simplest ones (single and multiple offsets). The rest of the possible optimisation combinations can be added afterwards (the most obvious being the case where both items fit into a single long, and possibly more distinctions if they're flexibly skippable/addressable). 
  * we need to extend {{TokenTree}} to support other partitioners. Whether or not that is done in the scope of this ticket, we'll have to make sure we're not making it harder to extend it this way.
  * there might be a need to store an order-preserving hash of the clustering keys for queries where a row is split across multiple SSTables, although I have to gather more information on that one, as we might be able to resolve rows after reading them from the SSTables. 
  * we'll need to find migration/upgrade paths from the current format. This may involve either re-indexing and failing queries while the upgrade is in progress, or supporting two format versions at read time, so that reads can be served from the old format while indexes are rebuilt. 
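
To make the offset arithmetic in the first step concrete, here is a minimal sketch. It is not the actual SASI on-disk layout: the function names, the choice of which half holds the {{short}}, and the 32/32-bit split for the "both items fit into a single long" case are all assumptions made for illustration.

```python
from typing import Optional, Tuple

# Hypothetical illustration only; none of these names exist in Cassandra.
OFFSET_BITS = 48                      # short (16 bits) + int (32 bits)
MAX_OFFSET = (1 << OFFSET_BITS) - 1
HALF_LONG_BITS = 32                   # assumed split for the packed case
MAX_HALF = (1 << HALF_LONG_BITS) - 1


def split_offset(offset: int) -> Tuple[int, int]:
    """Split a 48-bit offset into an assumed (short, int) pair:
    the upper 16 bits and the lower 32 bits."""
    if not 0 <= offset <= MAX_OFFSET:
        raise ValueError("offset does not fit in 48 bits")
    return offset >> 32, offset & 0xFFFFFFFF


def join_offset(short_part: int, int_part: int) -> int:
    """Inverse of split_offset."""
    return (short_part << 32) | int_part


def pack_pair(partition_offset: int, row_offset: int) -> Optional[int]:
    """Pack both offsets into one 64-bit value when each fits in 32 bits
    (assumed split). Return None when a wider encoding would be needed."""
    if 0 <= partition_offset <= MAX_HALF and 0 <= row_offset <= MAX_HALF:
        return (partition_offset << HALF_LONG_BITS) | row_offset
    return None


def unpack_pair(packed: int) -> Tuple[int, int]:
    """Inverse of pack_pair: returns (partition_offset, row_offset)."""
    return packed >> HALF_LONG_BITS, packed & MAX_HALF
```

When {{pack_pair}} returns {{None}}, a writer would fall back to storing the two offsets separately, which corresponds to the wider, unoptimised case that the proof of concept would start from.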

cc [~xedin] [~beobal] [~jrwest] 

> Address rows rather than partitions in SASI
> -------------------------------------------
>
>                 Key: CASSANDRA-11990
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11990
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL
>            Reporter: Alex Petrov
>            Assignee: Alex Petrov
>
> Currently, a lookup in the SASI index returns the key position of the partition. After the partition lookup, the rows are iterated and the operators are applied in order to filter out the ones that do not match.
> bq. TokenTree which accepts variable size keys (such would enable different partitioners, collections support, primary key indexing etc.), 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)