You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Alex Petrov (Jira)" <ji...@apache.org> on 2020/05/14 09:31:00 UTC

[jira] [Updated] (CASSANDRA-15811) Improve DROP COMPACT STORAGE

     [ https://issues.apache.org/jira/browse/CASSANDRA-15811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Petrov updated CASSANDRA-15811:
------------------------------------
    Description: 
DROP COMPACT STORAGE was introduced in CASSANDRA-10857 as one of the steps to deprecate Thrift. However, current semantics of dropping compact storage flags from tables reveal several columns that are usually empty (colum1 and value in non-dense case, value for dense columns, and a column with an empty name for super column families). Showing these columns  can confuse application developers, especially ones that have never used thrift and/or made writes that assumed presence of those fields, and used compact storage in 3.x because is has “compact” in the name.

There’s not much we can do in a super column family case, especially considering there’s no way to create a supercolumn family using CQL, but we can improve dense and non-dense cases. We can scans stables and make sure there are no signs of thrift writes in them, and if all sstables conform to this rule, we can not only drop the flag, but also drop columns that are supposed to be hidden.

Since for fixing CASSANDRA-15778, and to allow EmptyType column to actually have data[*] we had to remove empty type validation, properly handling compact storage starts making more sense, but we’ll solve it through not having columns, hence not caring about values instead, or keeping values _and_ data, not requiring validation in this case. EmptyType field will have to be handled differently though.

[*] as it is possible to end up with sstables upgraded from 2.x or written in 3.x before CASSANDRA-15373, which means not every 2.x upgraded or 3.x cluster is guaranteed to have empty values in this column, and this behaviour, even if undesired, might be used by people. 

Open question is: CASSANDRA-15373 adds validation to EmptyType that disallows any non-empty value to be written to it, but we already allow creating table via CQL, and still write data into it with thrift. It seems to have been unintended, but it might have become a feature people rely on. If we simply back port 15373 to 2.2 and 2.1, we’ll change and will break behaviour. Given no-one complained in 3.0 and 3.11, this assumption is unlikely though. 

  was:[TBD]


> Improve DROP COMPACT STORAGE
> ----------------------------
>
>                 Key: CASSANDRA-15811
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15811
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Alex Petrov
>            Priority: Normal
>
> DROP COMPACT STORAGE was introduced in CASSANDRA-10857 as one of the steps to deprecate Thrift. However, current semantics of dropping compact storage flags from tables reveal several columns that are usually empty (colum1 and value in non-dense case, value for dense columns, and a column with an empty name for super column families). Showing these columns  can confuse application developers, especially ones that have never used thrift and/or made writes that assumed presence of those fields, and used compact storage in 3.x because is has “compact” in the name.
> There’s not much we can do in a super column family case, especially considering there’s no way to create a supercolumn family using CQL, but we can improve dense and non-dense cases. We can scans stables and make sure there are no signs of thrift writes in them, and if all sstables conform to this rule, we can not only drop the flag, but also drop columns that are supposed to be hidden.
> Since for fixing CASSANDRA-15778, and to allow EmptyType column to actually have data[*] we had to remove empty type validation, properly handling compact storage starts making more sense, but we’ll solve it through not having columns, hence not caring about values instead, or keeping values _and_ data, not requiring validation in this case. EmptyType field will have to be handled differently though.
> [*] as it is possible to end up with sstables upgraded from 2.x or written in 3.x before CASSANDRA-15373, which means not every 2.x upgraded or 3.x cluster is guaranteed to have empty values in this column, and this behaviour, even if undesired, might be used by people. 
> Open question is: CASSANDRA-15373 adds validation to EmptyType that disallows any non-empty value to be written to it, but we already allow creating table via CQL, and still write data into it with thrift. It seems to have been unintended, but it might have become a feature people rely on. If we simply back port 15373 to 2.2 and 2.1, we’ll change and will break behaviour. Given no-one complained in 3.0 and 3.11, this assumption is unlikely though. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org