Posted to commits@cassandra.apache.org by "Alex Petrov (JIRA)" <ji...@apache.org> on 2016/12/06 19:18:58 UTC
[jira] [Comment Edited] (CASSANDRA-12910) SASI: calculatePrimary() always returns null
[ https://issues.apache.org/jira/browse/CASSANDRA-12910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15726419#comment-15726419 ]
Alex Petrov edited comment on CASSANDRA-12910 at 12/6/16 7:18 PM:
------------------------------------------------------------------
I can see no correlation between filled columns in rows and this patch.
Let's say there are two sstables:
{code}
sstable 1:
| a | b | c |
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | 3 | 3 |

sstable 2:
| a | b | c |
| 4 | 4 | 4 |
| 5 | 5 | 2 |
{code}
The table has {{PRIMARY KEY a}}, and the query is {{SELECT * FROM tbl WHERE b = 5 AND c = 2}}. Matches for the column {{b}} are only in the second sstable. Matches for the column {{c}} are in both the first and the second sstable. Since this is an {{AND}} query, we can conclude that querying the second sstable alone is enough to obtain all necessary results. So we pick the index on the column {{b}} as primary and, instead of using indexes over two sstables, use the indexes for only one sstable, as specified [here|https://github.com/ifesdjeen/cassandra/blob/8a64718d8447029584e24b3a5b75cde70e835dd7/src/java/org/apache/cassandra/index/sasi/plan/QueryController.java#L208-L212].
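To make the selection above concrete, here is a minimal sketch of the idea. It is not the actual {{QueryController}} code. It assumes a simplified model where each queried column maps to the set of sstable ids in which its index has a match; the class {{PrimaryIndexSketch}} and the methods {{pickPrimary}} and {{restrictToPrimary}} are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified model of primary index selection for an AND query.
// Sstables are represented by integer ids (an assumption made for this sketch).
class PrimaryIndexSketch {

    // Returns the column whose index matches in the fewest sstables,
    // or null when no columns are given.
    static String pickPrimary(Map<String, Set<Integer>> sstablesByColumn) {
        String primary = null;
        for (Map.Entry<String, Set<Integer>> e : sstablesByColumn.entrySet()) {
            if (primary == null
                || e.getValue().size() < sstablesByColumn.get(primary).size()) {
                primary = e.getKey();
            }
        }
        return primary;
    }

    // For an AND query, a row can only match if the primary column matches,
    // so every column is restricted to the sstables of the primary column.
    static Map<String, Set<Integer>> restrictToPrimary(Map<String, Set<Integer>> sstablesByColumn) {
        Map<String, Set<Integer>> restricted = new LinkedHashMap<>();
        String primary = pickPrimary(sstablesByColumn);
        if (primary == null) {
            return restricted;
        }
        Set<Integer> primarySstables = sstablesByColumn.get(primary);
        for (Map.Entry<String, Set<Integer>> e : sstablesByColumn.entrySet()) {
            Set<Integer> kept = new TreeSet<>(e.getValue());
            kept.retainAll(primarySstables);
            restricted.put(e.getKey(), kept);
        }
        return restricted;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> hits = new LinkedHashMap<>();
        hits.put("b", new TreeSet<>(Arrays.asList(2)));    // b = 5 matches only in sstable 2
        hits.put("c", new TreeSet<>(Arrays.asList(1, 2))); // c = 2 matches in sstables 1 and 2
        System.out.println(pickPrimary(hits));        // prints "b"
        System.out.println(restrictToPrimary(hits));  // prints "{b=[2], c=[2]}"
    }
}
```

With the two sstables from the example, the sketch picks {{b}} as primary and drops sstable 1 from the lookup for {{c}}, so only the second sstable is consulted.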
> SASI: calculatePrimary() always returns null
> --------------------------------------------
>
> Key: CASSANDRA-12910
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12910
> Project: Cassandra
> Issue Type: Bug
> Components: sasi
> Reporter: Corentin Chary
> Assignee: Corentin Chary
> Priority: Minor
> Fix For: 3.x
>
> Attachments: 0002-sasi-fix-calculatePrimary.patch
>
>
> While investigating performance issues with SASI (https://github.com/criteo/biggraphite/issues/174 if you want to know more) I ended up finding calculatePrimary() in QueryController.java, which apparently should return the "primary index".
> It lacks documentation, and I'm unsure what the "primary index" should be, but apparently this function never returns one because primaryIndexes.size() is always 0.
> https://github.com/apache/cassandra/blob/81f6c784ce967fadb6ed7f58de1328e713eaf53c/src/java/org/apache/cassandra/index/sasi/plan/QueryController.java#L237
> I'm unsure if the proper fix is checking if the collection is empty or reversing the operator (selecting the index with higher cardinality versus the one with lower cardinality).