Posted to commits@cassandra.apache.org by "Sylvain Lebresne (Jira)" <ji...@apache.org> on 2020/08/21 16:37:00 UTC
[jira] [Commented] (CASSANDRA-16069) Loss of functionality around null clustering when dropping compact storage

    [ https://issues.apache.org/jira/browse/CASSANDRA-16069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181991#comment-17181991 ] 

Sylvain Lebresne commented on CASSANDRA-16069:
----------------------------------------------

I'll note that it's not obvious to me what the best course of action is here. Here are the options I see, none of which is great imo:
# we extend support for {{null}} in clustering columns to all tables. But the reason we didn't support this in the first place is that we felt it might be more confusing than helpful. After all, this isn't a thing in SQL. Of course, we can revisit that opinion, but I think we should be very careful with that kind of additive semantic change (once it's there, it's there forever). And for this ticket, we'd have to make the change in 3.0/3.11, which, well, feels scary to me.
# we make a special case for tables "that used to be compact" and support {{null}} clustering only for those. But technically, we have no way to detect those tables as of now: {{DROP COMPACT STORAGE}} does not leave any trace. Even if we added such a trace, which was already suggested as one of the options for CASSANDRA-15897, that trace would (mostly) not be user visible, so that would probably become a pretty confusing rule.
# we do nothing (outside of documentation). Which sounds preposterous at face value, but to play devil's advocate for a minute: this behavior is pretty specific in the first place, and I don't think we document it anywhere. So it's not improbable that only a very tiny fraction of users rely on this. There has to be a point where, _if_ we believe the other options are bad for C* in general, it becomes better to say to a handful of users "Sorry, you will have to either find a way to migrate out of this behavior or stay on 3.0/3.11".


> Loss of functionality around null clustering when dropping compact storage
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16069
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16069
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/CQL
>            Reporter: Sylvain Lebresne
>            Priority: Normal
>
> For backward compatibility reasons[1], it is allowed to insert rows where some of the clustering columns are {{null}} for compact tables. That support is a tad limited/inconsistent[2] but essentially you can do:
> {noformat}
> cqlsh:ks> CREATE TABLE t (k int, c1 int, c2 int, v int, PRIMARY KEY (k, c1, c2)) WITH COMPACT STORAGE;
> cqlsh:ks> INSERT INTO t(k, c1, v) VALUES (1, 1, 1);
> cqlsh:ks> SELECT * FROM t;
>  k | c1 | c2   | v
> ---+----+------+---
>  1 |  1 | null | 1
> (1 rows)
> cqlsh:ks> UPDATE t SET v = 2 WHERE k = 1 AND c1 = 1;
> cqlsh:ks> SELECT * FROM t;
>  k | c1 | c2   | v
> ---+----+------+---
>  1 |  1 | null | 2
> (1 rows)
> {noformat}
> This is not allowed on non-compact tables however:
> {noformat}
> cqlsh:ks> CREATE TABLE t2 (k int, c1 int, c2 int, v int, PRIMARY KEY (k, c1, c2));
> cqlsh:ks> INSERT INTO t2(k, c1, v) VALUES (1, 1, 1);
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Some clustering keys are missing: c2"
> cqlsh:ks> UPDATE t2 SET v = 2 WHERE k = 1 AND c1 = 1;
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Some clustering keys are missing: c2"
> {noformat}
> Which means that a user with a compact table that relies on this will not be able to use {{DROP COMPACT STORAGE}}.
> Which is a problem for the 4.0 upgrade story. Problem to which we need an answer.
>  
> ----
> [1]: the underlying {{CompositeType}} used by such tables allows providing only a prefix of components, so thrift users could have used such functionality. We thus had to support it in CQL, or those users wouldn't have been able to upgrade to CQL easily.
> [2]: building on the example above, the value for {{c2}} is essentially {{null}}, yet none of the following is currently allowed:
> {noformat}
> cqlsh:ks> INSERT INTO t(k, c1, c2, v) VALUES (1, 1, null, 1);
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid null value in condition for column c2"
> cqlsh:ks> UPDATE t SET v = 2 WHERE k = 1 AND c1 = 1 AND c2 = null;
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid null value in condition for column c2"
> cqlsh:ks> SELECT * FROM t WHERE k = 1 AND c1 = 1 AND c2 = null;
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid null value in condition for column c2"
> {noformat}
> Not only is that unintuitive/inconsistent, but the {{SELECT}} one means there is no way to select only the row. You can skip specifying {{c2}} in the {{SELECT}}, but this becomes a slice selection essentially, as shown below:
> {noformat}
> cqlsh:ks> INSERT INTO ct(k, c1, c2, v) VALUES (1, 1, 1, 1);
> cqlsh:ks> SELECT * FROM ct WHERE k = 1 AND c1 = 1;
>  k | c1 | c2   | v
> ---+----+------+---
>  1 |  1 | null | 1
>  1 |  1 |    1 | 1
> (2 rows)
> {noformat}
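
To make the slice behaviour in the last example concrete, here is a toy Python model. It is only a sketch and not how Cassandra stores or queries data: it assumes a clustering can be represented as a tuple, with a missing trailing component ({{c2}} being {{null}}) modelled as a shorter tuple, i.e. a prefix. The name {{select_by_prefix}} is hypothetical, and the first insert into {{ct}} (the one that omits {{c2}}) is assumed, since the transcript above does not show it.

```python
from typing import Dict, List, Tuple

# Toy model only (an assumption, not Cassandra internals): a clustering is a
# tuple of ints, and a missing trailing component is a shorter tuple.
Clustering = Tuple[int, ...]


def select_by_prefix(
    partition: Dict[Clustering, int], prefix: Clustering
) -> List[Tuple[Clustering, int]]:
    """Return every row whose clustering starts with `prefix`, in sorted order."""
    return [
        (clustering, value)
        for clustering, value in sorted(partition.items())
        if clustering[: len(prefix)] == prefix
    ]


# One partition (k = 1) of the hypothetical table `ct`.
partition: Dict[Clustering, int] = {}
partition[(1,)] = 1    # assumed: INSERT INTO ct(k, c1, v) VALUES (1, 1, 1)
partition[(1, 1)] = 1  # INSERT INTO ct(k, c1, c2, v) VALUES (1, 1, 1, 1)

# WHERE k = 1 AND c1 = 1: restricting only c1 matches both rows.
rows = select_by_prefix(partition, (1,))

# WHERE k = 1 AND c1 = 1 AND c2 = 1: a full clustering matches a single row.
exact = select_by_prefix(partition, (1, 1))
```

In this model, the only restriction that reaches the {{(1,)}} row is the prefix {{(1,)}} itself, and that prefix also matches {{(1, 1)}}. That mirrors the problem described above: without a way to write {{c2 = null}}, the row with the missing component cannot be selected on its own.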


