Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2015/01/05 16:19:35 UTC

[jira] [Commented] (CASSANDRA-8490) DISTINCT queries with LIMITs or paging are incorrect when partitions are deleted

    [ https://issues.apache.org/jira/browse/CASSANDRA-8490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264665#comment-14264665 ] 

Sylvain Lebresne commented on CASSANDRA-8490:
---------------------------------------------

It is indeed annoying.

bq. Never count tombstoned partitions towards the limit, and trim excess partitions on the coordinator

As you said, this could theoretically backfire for Thrift, and I'd rather not take the risk unless we really have no better option.

bq. Leave DISTINCT ... LIMIT queries broken, but partially fix the paging situation by only considering the query exhausted when the count of all rows in the fetch page

That's easy to do, and it almost surely won't have any noticeable drawbacks. But it only solves half of the problem ...
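
To make the paging half concrete, here is a minimal sketch of the two exhaustion checks. It assumes the quoted option means "count every row in the fetched page, live or tombstoned"; the class and method names are hypothetical and are not taken from {{AbstractQueryPager}}. The numbers mirror the log in the ticket (page size 100, 99 live rows, one deleted partition).

```java
// Hypothetical sketch, not Cassandra code: compares the exhaustion check
// seen in the ticket's log with the partial fix discussed above.
final class PagerExhaustionSketch {

    // Behaviour seen in the log: the pager is considered exhausted as soon
    // as a page holds fewer live rows than the page size.
    static boolean exhaustedByLiveCount(int liveRows, int pageSize) {
        return liveRows < pageSize;
    }

    // Assumed partial fix: compare the page size against all rows in the
    // page, so a tombstoned partition no longer ends paging early.
    static boolean exhaustedByTotalCount(int liveRows, int tombstonedRows, int pageSize) {
        return liveRows + tombstonedRows < pageSize;
    }

    public static void main(String[] args) {
        int pageSize = 100;
        int liveRows = 99;       // "Fetched 99 live rows"
        int tombstonedRows = 1;  // the deleted partition

        System.out.println(exhaustedByLiveCount(liveRows, pageSize));                   // true
        System.out.println(exhaustedByTotalCount(liveRows, tombstonedRows, pageSize));  // false
    }
}
```

This only helps the pager decide whether to fetch another page; a DISTINCT ... LIMIT query would still come back short.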

bq. Add a new, optional flag to the range command serialization format (keeping in mind the new countCQL3Rows flag in 2.1) or do something like use a special compositesToGroup() value of -2 to signal that tombstoned partitions should not count towards the limit

I think using -2 for {{compositesToGroup}} might actually have my preference. It's clearly hacky, but it fixes the problem for both limit and paging and can be applied to 2.0 and 2.1 without breaking backward compatibility as far as I can tell. And while it's true that going from a 2.0 with this fix to a 2.1 without this fix would "break" those queries, that's really not specific to this issue. Besides, upgrading to a 2.1 version that is older than your last upgrade of 2.0 is asking for trouble imo.
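
A rough sketch of the sentinel idea, purely for illustration: the counter below is hypothetical and is not the actual range command or column counter code. It assumes -2 is the special {{compositesToGroup}} value and that any other value keeps the current behaviour, where a tombstoned partition still counts towards the limit.

```java
// Hypothetical sketch, not Cassandra code: shows how a sentinel value could
// switch off counting of tombstoned partitions towards the limit.
final class DistinctLimitSketch {

    // Assumed sentinel, matching the "-2" proposed in the ticket.
    static final int SKIP_TOMBSTONED = -2;

    // Walks partitions in order and returns how many live ones are returned
    // before the limit is reached. live[i] is false for a deleted partition.
    static int livePartitionsReturned(boolean[] live, int limit, int compositesToGroup) {
        boolean countTombstoned = compositesToGroup != SKIP_TOMBSTONED;
        int counted = 0;
        int liveReturned = 0;
        for (boolean isLive : live) {
            if (counted >= limit) {
                break;
            }
            if (isLive) {
                liveReturned++;
                counted++;
            } else if (countTombstoned) {
                counted++; // current behaviour: the tombstone uses up a slot
            }
        }
        return liveReturned;
    }

    public static void main(String[] args) {
        boolean[] live = new boolean[150];
        java.util.Arrays.fill(live, true);
        live[5] = false; // one deleted partition, as in the ticket

        // -1 stands in for "any non-sentinel value" (an assumption).
        System.out.println(livePartitionsReturned(live, 100, -1));              // 99
        System.out.println(livePartitionsReturned(live, 100, SKIP_TOMBSTONED)); // 100
    }
}
```

Because the sentinel travels in a field that already exists, nothing new has to be added to the serialization format, which is what makes it applicable to both 2.0 and 2.1.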

> DISTINCT queries with LIMITs or paging are incorrect when partitions are deleted
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8490
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8490
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Driver version: 2.1.3.
> Cassandra version: 2.0.11/2.1.2.
>            Reporter: Frank Limstrand
>            Assignee: Tyler Hobbs
>             Fix For: 2.0.12, 2.1.3
>
>
> Using paging demo code from https://github.com/PatrickCallaghan/datastax-paging-demo
> The code creates and populates a table with 1000 entries and pages through them with setFetchSize set to 100. If we then delete one entry with 'cqlsh':
> {noformat}
> cqlsh:datastax_paging_demo> delete from datastax_paging_demo.products  where productId = 'P142'; (The specified productid is number 6 in the resultset.)
> {noformat}
> and run the same query ("Select * from") again we get:
> {noformat}
> [com.datastax.paging.Main.main()] INFO  com.datastax.paging.Main - Paging demo took 0 secs. Total Products : 999
> {noformat}
> which is what we would expect.
> If we then change the "select" statement in dao/ProductDao.java (line 70) from "Select * from " to "Select DISTINCT productid from " we get this result:
> {noformat}
> [com.datastax.paging.Main.main()] INFO  com.datastax.paging.Main - Paging demo took 0 secs. Total Products : 99
> {noformat}
> So it looks like the tombstone stops the paging behaviour. Is this a bug?
> {noformat}
> DEBUG [Native-Transport-Requests:788] 2014-12-16 10:09:13,431 Message.java (line 319) Received: QUERY Select DISTINCT productid from datastax_paging_demo.products, v=2
> DEBUG [Native-Transport-Requests:788] 2014-12-16 10:09:13,434 AbstractQueryPager.java (line 98) Fetched 99 live rows
> DEBUG [Native-Transport-Requests:788] 2014-12-16 10:09:13,434 AbstractQueryPager.java (line 115) Got result (99) smaller than page size (100), considering pager exhausted
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)