You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2016/09/14 10:42:21 UTC

[jira] [Commented] (CASSANDRA-12060) Establish consistent distinction between non-existing partition and NULL value for LWTs on static columns

    [ https://issues.apache.org/jira/browse/CASSANDRA-12060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15490098#comment-15490098 ] 

Sylvain Lebresne commented on CASSANDRA-12060:
----------------------------------------------

I'll have to apologize here because while reviewing this I'm realizing that I've been confused and have pushed for the wrong thing.

My confusion had to do with the fact that CQL doesn't behave as I though it did. Let me explain. This ticket and CASSANDRA-9842 are about what differences CAS queries with conditions on static columns make between existing and non-existing partitions. For that, my guideline was to make it consistent with how CAS queries with conditions on regular columns differenciate between existing and non-existing rows, and that where I was wrong.

Let's consider the following example:
{noformat}
CREATE TABLE test (k int, t int, v int, PRIMARY KEY (k, t));
SELECT v FROM test WHERE k = 0 AND t = 0;
>  v
> ------
>
> (0 rows)
INSERT INTO test(k, t) VALUES (0, 0);
SELECT v FROM test WHERE k = 0 AND t = 0;
>  v
> ------
>  null
>
> (1 rows)
UPDATE test SET v = 42 WHERE k = 0 AND t = 1 IF v = 1;  // Note we try updating another row
>  [applied]
> -----------
>      False
UPDATE test SET v = 42 WHERE k = 0 AND t = 0 IF v = 1; // The row we've inserted before, but v is null
>  [applied] | v
> -----------+------
>      False | null
{noformat}
Both in SELECT and in CAS queries result set, we make a difference between a row existing or not existing. For some reason, I though this difference somewhat carried on to whether a {{IF v = null}} would apply (or any {{IF v != <something>}} condition). I was sure that it was applied if the row existed but had no value for {{v}} but did *not* applied if the row didn't existed at all. In other words, what I _though_ we had was:
{noformat}
UPDATE test SET v = 42 WHERE k = 0 AND t = 1 IF v = null; // Trying to updated a non existing row
>  [applied]
> -----------
>      False
UPDATE test SET v = 42 WHERE k = 0 AND t = 0 IF v = null; // The row exist and v is null
>  [applied]
> -----------
>       True
{noformat}
and I advocated we did the same for static columns and existing/non-existing partitions. But that's *not* how it works and the first query above does apply. That is, even though we always make a difference in result sets between existing row (but with null value) and non-existing row, we don't make one when evaluating if a {{IF v = null}} condition applies.

So anyway, what semantic would be better in theory doesn't matter much. The way {{IF v = null}} currently works for normal column is that it makes no difference between the row existing (and having no value for v) or not existing and it's probably too late to change that, so we should make that work consistently for static column and hence you should just have ignored me on this ticket since that's how it works in 3.0 after CASSANDRA-9842.

With that all said, we still should do what this ticket initiall sets to do: we should make that existing versus not existing partition difference for static columns in CAS result sets because that's how it works for rows/normal columns and we should be consistent.

Doing that last part requires most of the linked patch, though I have the following remarks on said patch:
* I'm not sure I understand the reason for the change in {{CQL3CasRequest.columnsToRead()}}. In particular, the comment talks about having to query a row, but that method is about which columns we bother fetching, not which row we query.
* I'm not fond of using {{null}} for empty partitions since we can just test with {{isEmpty()}} directly. Granted, the current code is confusing as {{appliesTo()}} test for {{null}} even though it's neither passed currently and that's really oversight from CASSANDRA-8099. I think we should stick to {{isEmpty()}} but remove the useless code.
* Regarding querying the first row of a partition when we only have static conditions, I think the code would be easier to follow if we separated static conditions in {{CQL3CasRequest}}. In fact, once we do that, we can also use names queries for "normal" conditions (instead of doing slices of one row, the former being potentially more optimized), which is not all that related to this patch tbh, but is kind of an oversight of the code so we might as well fix it while we're at it.

Anyway, as I felt bad about the back and forth on this, I took the liberty to made the changes related to what's above:
| [12060-3.0-v2|https://github.com/pcmanus/cassandra/commits/12060-3.0-v2] | [utests|http://cassci.datastax.com/job/pcmanus-12060-3.0-v2-testall] | [dtests|http://cassci.datastax.com/job/pcmanus-12060-3.0-v2-dtest] |
| [12060-trunk-v2|https://github.com/pcmanus/cassandra/commits/12060-trunk-v2] | [utests|http://cassci.datastax.com/job/pcmanus-12060-trunk-v2-testall] | [dtests|http://cassci.datastax.com/job/pcmanus-12060-trunk-v2-dtest] |


> Establish consistent distinction between non-existing partition and NULL value for LWTs on static columns
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12060
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12060
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Alex Petrov
>            Assignee: Alex Petrov
>
> When executing following CQL commands: 
> {code}
> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '1' };
> USE test;
> CREATE TABLE testtable (a int, b int, s1 int static, s2 int static, v int, PRIMARY KEY (a, b));
> INSERT INTO testtable (a,b,s1,s2,v) VALUES (2,2,2,null,2);
> DELETE s1 FROM testtable WHERE a = 2 IF s2 IN (10,20,30);
> {code}
> The output is different between {{2.x}} and {{3.x}}:
> 2.x:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 2 IF s2 = 5;
>  [applied] | s2
> -----------+------
>      False | null
> {code}
> 3.x:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 2 IF s2 = 5;
>  [applied]
> -----------
>      False
> {code}
> {{2.x}} would although return same result if executed on a partition that does not exist at all:
> {code}
> cqlsh:test> DELETE s1 FROM testtable WHERE a = 5 IF s2 = 5;
>  [applied]
> -----------
>      False
> {code}
> It _might_ be related to static column LWTs, as I could not reproduce same behaviour with non-static column LWTs. The most recent change was [CASSANDRA-10532], which enabled LWT operations on static columns with partition keys only. -Another possible relation is [CASSANDRA-9842], which removed distinction between {{null}} column and non-existing row.- (striked through since same happens on pre-[CASSANDRA-9842] code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)