Posted to commits@cassandra.apache.org by "Mircea Lemnaru (JIRA)" <ji...@apache.org> on 2016/03/08 10:12:40 UTC

[jira] [Commented] (CASSANDRA-8940) Inconsistent select count and select distinct

    [ https://issues.apache.org/jira/browse/CASSANDRA-8940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184697#comment-15184697 ] 

Mircea Lemnaru commented on CASSANDRA-8940:
-------------------------------------------

[~frensjan] [~blerer] I have a question about this issue, if I may; I didn't know where else to ask it. We are currently implementing a solution that uses Cassandra as its database, and we chose to go with Cassandra 3.3.

Unfortunately, while running some tests we found that this issue is reproducible in 3.3, and as far as I can see there are no plans to port the fix to 3.3+. Am I assuming correctly? Will the fix be ported to 3.3 in the future?

We plan to go into production in 3-5 months' time. Did we make a poor decision in going with Cassandra 3.3? Should we go with 2.2 instead?

Thank you for all your help.
Btw, great job on this one.

Thanks 
Mircea 

> Inconsistent select count and select distinct
> ---------------------------------------------
>
>                 Key: CASSANDRA-8940
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8940
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>         Environment: 2.1.2
>            Reporter: Frens Jan Rumph
>            Assignee: Benjamin Lerer
>             Fix For: 2.0.16, 2.1.6
>
>         Attachments: 7b74fb00-e935-11e4-b10c-317579db7eb4.csv, 8940.txt, 8d5899d0-e935-11e4-847b-2d06da75a6cd.csv, Vagrantfile, install_cassandra.sh, setup_hosts.sh
>
>
> When performing {{select count( * ) from ...}} I expect the results to be consistent over multiple query executions if the table at hand is not written to / deleted from in the meantime. However, in my set-up they are not. The counts returned vary considerably (by several percent). The same holds for {{select distinct partition-key-columns from ...}}.
> I have a table in a keyspace with replication_factor = 1 which is something like:
> {code}
> CREATE TABLE tbl (
>     id frozen<id_type>,
>     bucket bigint,
>     offset int,
>     value double,
>     PRIMARY KEY ((id, bucket), offset)
> )
> {code}
> The frozen udt is:
> {code}
> CREATE TYPE id_type (
>     tags map<text, text>
> );
> {code}
> The table contains around 35k rows (I'm not trying to be funny here ...). The consistency level for the queries was ONE.
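> A minimal sketch of the check described above, as it might be run in cqlsh. This is illustrative only and not taken from the original report: it assumes the table is named {{tbl}} as in the schema above, and that each statement is simply re-run by hand and the results compared.
> {code}
> -- cqlsh command: use the consistency level mentioned in the report
> CONSISTENCY ONE;
> -- Run several times with no writes or deletes in between;
> -- the counts are expected to be identical on every run
> SELECT COUNT(*) FROM tbl;
> -- Same check for the partition key columns (id, bucket)
> SELECT DISTINCT id, bucket FROM tbl;
> {code}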



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)