
Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2014/02/14 14:20:20 UTC

[jira] [Commented] (CASSANDRA-6706) Duplicate rows returned when in clause has repeated values

    [ https://issues.apache.org/jira/browse/CASSANDRA-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901413#comment-13901413 ] 

Sylvain Lebresne commented on CASSANDRA-6706:
---------------------------------------------

That is kind of the intended behavior. Is it the best behavior? I don't know, though I'm not sure it matters much in practice, to be honest. But when there is an IN, we do order the resulting rows following the order of the values in the IN (unless an explicit ordering takes precedence, of course), which suggests we treat the IN values as a list rather than a set. From that perspective, it's probably not entirely crazy to return duplicate results in that case. In particular, if you use a prepared marker for an IN, the server expects a list, not a set, for the values (and changing that now would really break users). It's easy enough to avoid the duplication client side if you don't want duplicates.

Don't get me wrong, I'm not saying that not returning duplicates in that case would be inferior, but rather that I don't see a big problem with the current behavior, so I'd rather not introduce a breaking change, even a small one, for no good reason.
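
For what it's worth, the client-side workaround mentioned above could look roughly like the sketch below. It is only an illustration: the helper name dedupe_in_values is made up here and is not part of any driver. It assumes the IN values are hashable, and it keeps the first occurrence of each value so that the list order (which determines the result order) is preserved.

```python
from typing import Hashable, Iterable, List, TypeVar

T = TypeVar("T", bound=Hashable)


def dedupe_in_values(values: Iterable[T]) -> List[T]:
    """Drop repeated values, keeping the first occurrence of each.

    dict preserves insertion order, so the relative order of the
    surviving values is unchanged. (Hypothetical helper, not a driver API.)
    """
    return list(dict.fromkeys(values))


# The result is still a list, which is what a prepared IN marker expects.
keys = dedupe_in_values(["A", "A", "B", "A"])
print(keys)  # ['A', 'B']
```

The deduplicated list can then be bound to the IN marker in place of the original values, so each matching row comes back once.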

> Duplicate rows returned when in clause has repeated values
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-6706
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6706
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Found on 
> [cqlsh 4.1.0 | Cassandra 2.0.3-SNAPSHOT | CQL spec 3.0.0 | Thrift protocol 19.38.0]
>            Reporter: Gavin Casey
>
> If a value is repeated within an IN clause, then duplicate rows are returned for the repeats:
> cqlsh> create table t1(c1 text primary key);
> cqlsh> insert into t1(c1) values ('A');
> cqlsh> select * from t1;
>  c1
> ----
>   A
> cqlsh> select * from t1 where c1 = 'A';
>  c1
> ----
>   A
> cqlsh> select * from t1 where c1 in( 'A');
>  c1
> ----
>   A
> cqlsh:dslog> select * from t1 where c1 in( 'A','A');
>  c1
> ----
>   A
>   A


