Posted to commits@cassandra.apache.org by "Alex Petrov (Jira)" <ji...@apache.org> on 2021/02/05 10:12:01 UTC

[jira] [Comment Edited] (CASSANDRA-16307) GROUP BY queries with paging can return deleted data

    [ https://issues.apache.org/jira/browse/CASSANDRA-16307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279067#comment-17279067 ] 

Alex Petrov edited comment on CASSANDRA-16307 at 2/5/21, 10:11 AM:
-------------------------------------------------------------------

[~maedhroz][~adelapena] I fixed the test right away; I just didn't expect the patch to get attention so quickly, and was focusing on fixing the real group by/paging/SRP bug. The test failure in the compact storage upgrade test is just a consequence of the test not complying with the {{Iterator}} interface: it does not call {{hasNext}} before calling {{next}}.
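
As an illustration of why skipping {{hasNext}} can fail, here is a minimal sketch of a lazily paged iterator. {{PagedIteratorSketch}}, {{PagedIterator}} and {{drain}} are hypothetical names made up for this example; this is not the in-jvm dtest implementation, only an assumed model in which the next page is loaded inside {{hasNext}}.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class PagedIteratorSketch {
    /** Hypothetical iterator that only loads the next page inside hasNext(). */
    static final class PagedIterator implements Iterator<Integer> {
        private final Deque<List<Integer>> pages;
        private Iterator<Integer> current = Collections.emptyIterator();

        PagedIterator(List<List<Integer>> pages) {
            this.pages = new ArrayDeque<>(pages);
        }

        @Override
        public boolean hasNext() {
            // Advance to the next non-empty page, if the current one is exhausted.
            while (!current.hasNext() && !pages.isEmpty())
                current = pages.poll().iterator();
            return current.hasNext();
        }

        @Override
        public Integer next() {
            // Assumption of this sketch: next() never loads a page by itself,
            // so it fails unless hasNext() has prepared one.
            if (!current.hasNext())
                throw new NoSuchElementException("no page loaded; call hasNext() first");
            return current.next();
        }
    }

    /** Consumes the iterator the compliant way: hasNext() before every next(). */
    static List<Integer> drain(Iterator<Integer> it) {
        List<Integer> out = new ArrayList<>();
        while (it.hasNext())
            out.add(it.next());
        return out;
    }
}
```

With this model, {{drain}} returns every element across pages, while calling {{next}} on a fresh iterator without {{hasNext}} throws.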

That said, I'm pretty close to a fix for the actual issue; I'm just running some more Harry tests to verify it. The issue is caused by the fact that, for group by, we increase the data limit counters only after the next row has been seen, which is why SRP (short read protection) thinks there's no need to try to fetch more contents.
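
To make the "counted only after the next row" behaviour concrete, here is a toy sketch. {{GroupCounterSketch}}, {{DeferredGroupCounter}} and {{countAfterRows}} are hypothetical names; this is not Cassandra's data limits code, and it does not model short read protection itself. It only shows that a counter of this shape still reports zero after it has consumed every row of a single group.

```java
import java.util.List;

public class GroupCounterSketch {
    /** Toy counter: a group is counted only when a row from a different group arrives. */
    static final class DeferredGroupCounter {
        private Integer currentKey = null; // key of the group still in flight
        private int counted = 0;

        void onRow(int groupKey) {
            // The previous group is only known to be complete once the key changes.
            if (currentKey != null && currentKey != groupKey)
                counted++;
            currentKey = groupKey;
        }

        /** Called at end of input: counts the group still in flight, if any. */
        void onClose() {
            if (currentKey != null) {
                counted++;
                currentKey = null;
            }
        }

        int counted() {
            return counted;
        }
    }

    /** Feeds the keys to a fresh counter and optionally signals end of input. */
    static int countAfterRows(List<Integer> groupKeys, boolean close) {
        DeferredGroupCounter counter = new DeferredGroupCounter();
        for (int key : groupKeys)
            counter.onRow(key);
        if (close)
            counter.onClose();
        return counter.counted();
    }
}
```

In this model, anything that inspects the counter after the rows of the first group but before the next row or the end of input sees a count one lower than the number of groups actually read.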

UPD: I'll move the in-jvm dtest paging problem to a separate issue, and will put [~maedhroz] and [~adelapena] as reviewers there. Thank you!


> GROUP BY queries with paging can return deleted data
> ----------------------------------------------------
>
>                 Key: CASSANDRA-16307
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16307
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination
>            Reporter: Andres de la Peña
>            Assignee: Alex Petrov
>            Priority: Normal
>             Fix For: 3.11.x, 4.0-beta
>
>
> {{GROUP BY}} queries using paging and CL>ONE/LOCAL_ONE can return deleted data. This dtest reproduces the problem:
> {code:java}
> try (Cluster cluster = init(Cluster.create(2)))
> {
>     cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (pk int, ck int, PRIMARY KEY (pk, ck))"));
>     ICoordinator coordinator = cluster.coordinator(1);
>     // write two partitions to both nodes
>     coordinator.execute(withKeyspace("INSERT INTO %s.t (pk, ck) VALUES (0, 0)"), ConsistencyLevel.ALL);
>     coordinator.execute(withKeyspace("INSERT INTO %s.t (pk, ck) VALUES (1, 1)"), ConsistencyLevel.ALL);
>
>     // delete each row locally on a different node, so the replicas diverge
>     cluster.get(1).executeInternal(withKeyspace("DELETE FROM %s.t WHERE pk=0 AND ck=0"));
>     cluster.get(2).executeInternal(withKeyspace("DELETE FROM %s.t WHERE pk=1 AND ck=1"));
>
>     // page size 1 at CL=ALL: after reconciliation no rows should be returned
>     String query = withKeyspace("SELECT * FROM %s.t GROUP BY pk");
>     Iterator<Object[]> rows = coordinator.executeWithPaging(query, ConsistencyLevel.ALL, 1);
>     assertRows(Iterators.toArray(rows, Object[].class));
> }
> {code}
> Using a 2-node cluster and RF=2, the test inserts two partitions on both nodes. Then it locally deletes each row on a different node, so each node sees a different partition as live, but reconciliation should produce no live partitions. However, a {{GROUP BY}} query using a page size of 1 wrongly returns one of the rows.
> This was detected during CASSANDRA-16180, and it is probably related to CASSANDRA-15459, which solved a similar problem for group-by queries with a limit instead of paging.
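
The expected outcome in the description above can be sketched with a toy reconciliation model. {{ReconcileSketch}}, {{Version}}, {{node1}} and {{node2}} are hypothetical names, and the timestamps are made up; the only assumption carried over from the report is that each node holds one live row and one newer local deletion, and that the newest write wins per partition.

```java
import java.util.Map;
import java.util.TreeMap;

public class ReconcileSketch {
    /** Toy replica state for one row: write timestamp plus a tombstone flag. */
    static final class Version {
        final long timestamp;
        final boolean deleted;

        Version(long timestamp, boolean deleted) {
            this.timestamp = timestamp;
            this.deleted = deleted;
        }
    }

    /** Merges two replica views; the version with the newest timestamp wins per key. */
    static Map<Integer, Version> reconcile(Map<Integer, Version> a, Map<Integer, Version> b) {
        Map<Integer, Version> merged = new TreeMap<>(a);
        for (Map.Entry<Integer, Version> e : b.entrySet())
            merged.merge(e.getKey(), e.getValue(), (x, y) -> y.timestamp > x.timestamp ? y : x);
        return merged;
    }

    /** Number of rows in a view that are not tombstones. */
    static long liveCount(Map<Integer, Version> view) {
        return view.values().stream().filter(v -> !v.deleted).count();
    }

    /** Node 1: pk=0 deleted locally (newer), pk=1 still live. */
    static Map<Integer, Version> node1() {
        Map<Integer, Version> view = new TreeMap<>();
        view.put(0, new Version(2, true));
        view.put(1, new Version(1, false));
        return view;
    }

    /** Node 2: pk=0 still live, pk=1 deleted locally (newer). */
    static Map<Integer, Version> node2() {
        Map<Integer, Version> view = new TreeMap<>();
        view.put(0, new Version(1, false));
        view.put(1, new Version(2, true));
        return view;
    }
}
```

Each node on its own reports one live row, while the reconciled view reports none, which is what a read at {{ConsistencyLevel.ALL}} is expected to return.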


