Posted to commits@cassandra.apache.org by "Andrés de la Peña (JIRA)" <ji...@apache.org> on 2017/04/26 15:27:04 UTC

[jira] [Comment Edited] (CASSANDRA-8272) 2ndary indexes can return stale data

    [ https://issues.apache.org/jira/browse/CASSANDRA-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981476#comment-15981476 ] 

Andrés de la Peña edited comment on CASSANDRA-8272 at 4/26/17 3:26 PM:
-----------------------------------------------------------------------

If we send a {{RangeTombstoneMarker}} each time we find a deleted index entry, the coordinator will be able to discard the false positives returned by the stale node. The problem is that read repair will send the tombstones back to the nodes, corrupting not only the index but also the indexed table. Possible solutions would be to disable read repair for index queries, or to send a new type of tombstone that read repair would ignore.

As an alternative solution, the index could also return the rows pointed to by the deleted index entries, without any information about the staleness of the index entries, and use {{Index.postProcessorFor(ReadCommand)}} to discard those rows that don't satisfy the index expression after reconciliation. This would solve the consistency problem without any changes in read repair, or in the coordinator in general. The downside is that we would have to read from the base table, and possibly send, all the rows pointed to by deleted index entries satisfying the expression since the last gc.
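
As a rough illustration of this alternative, here is a toy Python model of the scenario from the ticket description. It is only a sketch and not the patch: the names {{Replica}}, {{index_read}}, {{reconcile}}, {{post_process}} and {{query}} are hypothetical, timestamps are plain integers, and reconciliation is simplified to "highest timestamp wins".

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    # Base table: key -> (value, write timestamp).
    rows: dict = field(default_factory=dict)
    # Index: (indexed value, key) -> True if the entry is live, False if deleted.
    index: dict = field(default_factory=dict)

    def write(self, k, v, ts):
        # Simplification: writes are assumed to arrive in timestamp order.
        old = self.rows.get(k)
        if old is not None and old[0] != v:
            self.index[(old[0], k)] = False  # mark the old index entry as deleted
        self.rows[k] = (v, ts)
        self.index[(v, k)] = True

    def index_read(self, v, include_deleted):
        # Return the current base rows pointed to by matching index entries.
        # With include_deleted=True, rows behind deleted entries are returned
        # too, with no hint about whether the entry was stale.
        hits = {}
        for (value, k), live in self.index.items():
            if value == v and (live or include_deleted):
                hits[k] = self.rows[k]
        return hits


def reconcile(responses):
    # Coordinator-side merge: keep the newest version of each row.
    merged = {}
    for resp in responses:
        for k, (v, ts) in resp.items():
            if k not in merged or ts > merged[k][1]:
                merged[k] = (v, ts)
    return merged


def post_process(merged, v):
    # Stand-in for the index post-processor: drop rows that no longer match.
    return {k: row for k, row in merged.items() if row[0] == v}


def query(replicas, v, fixed):
    merged = reconcile(r.index_read(v, include_deleted=fixed) for r in replicas)
    return post_process(merged, v) if fixed else merged


# Replicas A and C both hold (0, 'foo'); only A has applied the update to 'bar'.
a, c = Replica(), Replica()
for r in (a, c):
    r.write(0, "foo", ts=1)
a.write(0, "bar", ts=2)

stale_result = query([a, c], "foo", fixed=False)
fixed_result = query([a, c], "foo", fixed=True)
```

In this model the current behaviour returns the stale row {{(0, 'foo')}} because A contributes nothing to the merge, whereas with the alternative A contributes {{(0, 'bar')}}, which wins reconciliation and is then dropped by the post-processing step. The extra cost mentioned above shows up as A having to read and send a row that it already knows does not match.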

I'm working on this last approach here:

||[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...adelapena:8272-3.0]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-8272-3.0-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-8272-3.0-dtest/]|

The patch is still incomplete; I'm posting it just to illustrate the approach. I have not yet added dtests for the scenario described by this ticket, although I've tried it manually, and existing dtests pass.

{{PartitionRangeReadCommand}} overrides {{ReadCommand.executeInternal(ReadOrderGroup)}} to use the index post-processor, which is now required to let the index clean the stale entries.


was (Author: adelapena):
If we send a {{RangeTombstoneMarker}} each time we find a deleted index entry, the coordinator will be able to discard the false positives returned by the stale node. The problem is that read repair will send the tombstones back to the nodes, corrupting not only the index but also the indexed table. Possible solutions would be to disable read repair for index queries, or to send a new type of tombstone that read repair would ignore.

As an alternative solution, the index could also return the rows pointed to by the deleted index entries, without any information about the staleness of the index entries, and use {{Index.postProcessorFor(ReadCommand)}} to discard those rows that don't satisfy the index expression after reconciliation. This would solve the consistency problem without any changes in read repair, or in the coordinator in general. The downside is that we would have to read from the base table, and possibly send, all the rows pointed to by deleted index entries satisfying the expression since the last gc.

I'm working on this last approach here:

||[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...adelapena:8272-3.0]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-8272-3.0-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-8272-3.0-dtest/]|

The patch is still incomplete; I'm posting it just to illustrate the approach. I have not yet added dtests for the scenario described by this ticket, although I've tried it manually, and existing dtests pass. There are many unit tests failing because {{CQLTester}} relies on {{QueryProcessor.executeInternal}}, which doesn't use {{Index.postProcessorFor(ReadCommand)}}, and I guess it should.

> 2ndary indexes can return stale data
> ------------------------------------
>
>                 Key: CASSANDRA-8272
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8272
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Andrés de la Peña
>             Fix For: 2.1.x
>
>
> When replicas return 2ndary index results, it's possible for a single replica to return a stale result and that result will be sent back to the user, potentially failing the CL contract.
> For instance, consider 3 replicas A, B and C, and the following situation:
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v text);
> CREATE INDEX ON test(v);
> INSERT INTO test(k, v) VALUES (0, 'foo');
> {noformat}
> with every replica up to date. Now, suppose that the following queries are done at {{QUORUM}}:
> {noformat}
> UPDATE test SET v = 'bar' WHERE k = 0;
> SELECT * FROM test WHERE v = 'foo';
> {noformat}
> then, if A and B acknowledge the insert but C responds to the read before having applied the insert, the now stale result will be returned (since C will return it and A or B will return nothing).
> A potential solution would be that when we read a tombstone in the index (and provided we make the index inherit the gcGrace of its parent CF), instead of skipping that tombstone, we'd insert in the result a corresponding range tombstone.


