Posted to commits@cassandra.apache.org by "Benjamin Lerer (Jira)" <ji...@apache.org> on 2021/05/17 15:23:00 UTC

[jira] [Created] (CASSANDRA-16671) Cassandra can return no row when the row columns have been deleted.

Benjamin Lerer created CASSANDRA-16671:
------------------------------------------

             Summary: Cassandra can return no row when the row columns have been deleted.
                 Key: CASSANDRA-16671
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16671
             Project: Cassandra
          Issue Type: Bug
          Components: Legacy/Local Write-Read Paths
            Reporter: Benjamin Lerer
            Assignee: Benjamin Lerer


CQL semantics dictate that a (CQL) row exists as long as it has at least one non-null column (including the PK columns).

To determine if a row has some *non-null primary key*, Cassandra relies on the row primary key liveness. 

For example:

{code}
CREATE TABLE test (pk int, ck int, v int, PRIMARY KEY(pk, ck));
INSERT INTO test(pk, ck, v) VALUES (1, 1, 1);
DELETE v FROM test WHERE pk = 1 AND ck = 1;
SELECT v FROM test;
{code}
will return
{code}
v
---
null 
{code}

{{UPDATE}} statements do not set the row primary key liveness. Consequently, if the user had used an {{UPDATE}} statement instead of an {{INSERT}}, the {{SELECT}} query would *not have returned any rows*.
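
The difference between the two statements can be illustrated with a toy model. The Java sketch below is not Cassandra code: {{ToyRow}} and its methods are hypothetical names, and the model assumes only what is described above, namely that {{INSERT}} sets the primary key liveness and {{UPDATE}} does not.

```java
import java.util.HashMap;
import java.util.Map;

public class PrimaryKeyLivenessSketch
{
    /** Hypothetical, simplified stand-in for a CQL row; not Cassandra's internal row class. */
    static final class ToyRow
    {
        private boolean primaryKeyLive = false;
        private final Map<String, Integer> liveColumns = new HashMap<>();

        /** INSERT writes the column and marks the primary key as live. */
        ToyRow insert(String column, int value)
        {
            primaryKeyLive = true;
            liveColumns.put(column, value);
            return this;
        }

        /** UPDATE writes the column but leaves the primary key liveness untouched. */
        ToyRow update(String column, int value)
        {
            liveColumns.put(column, value);
            return this;
        }

        /** Deleting a single column removes its value only. */
        ToyRow deleteColumn(String column)
        {
            liveColumns.remove(column);
            return this;
        }

        /** The row exists if its primary key is live or at least one column is non-null. */
        boolean exists()
        {
            return primaryKeyLive || !liveColumns.isEmpty();
        }
    }

    public static void main(String[] args)
    {
        // INSERT then DELETE v: the row survives with v = null.
        System.out.println(new ToyRow().insert("v", 1).deleteColumn("v").exists());
        // UPDATE then DELETE v: nothing keeps the row alive.
        System.out.println(new ToyRow().update("v", 1).deleteColumn("v").exists());
    }
}
```

In this model the first call prints {{true}} and the second prints {{false}}, mirroring the {{INSERT}} and {{UPDATE}} behaviours described above.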

CASSANDRA-16226 introduced a regression by stopping early in the timestamp-ordered logic if an {{UPDATE}} statement covering all the columns was found in an SSTable. Because the row returned in that case has no primary key liveness, the expected row will not be returned if another node also returns a column deletion.
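
The effect of the early stop can be sketched with a simplified reconciliation model. This is an illustration under stated assumptions, not Cassandra's actual merge logic: {{Response}}, {{merge}} and {{rowExists}} are hypothetical names, the row has a single column {{v}}, and the timestamps match those used in the test further down.

```java
public class EarlyStopSketch
{
    /** Marker meaning "no timestamp known" in this toy model. */
    static final long NO_TIMESTAMP = Long.MIN_VALUE;

    /** Hypothetical per-node response for one row with a single column v. */
    static final class Response
    {
        final long livenessTimestamp; // NO_TIMESTAMP when no primary key liveness was read
        final Integer value;          // null when v is deleted or absent
        final long valueTimestamp;    // NO_TIMESTAMP when nothing was read for v

        Response(long livenessTimestamp, Integer value, long valueTimestamp)
        {
            this.livenessTimestamp = livenessTimestamp;
            this.value = value;
            this.valueTimestamp = valueTimestamp;
        }
    }

    /** Reconciles two responses: the newest liveness and the newest data for v win. */
    static Response merge(Response a, Response b)
    {
        long liveness = Math.max(a.livenessTimestamp, b.livenessTimestamp);
        Response newest = a.valueTimestamp >= b.valueTimestamp ? a : b;
        return new Response(liveness, newest.value, newest.valueTimestamp);
    }

    /** The row exists if it has a primary key liveness or a non-null column. */
    static boolean rowExists(Response r)
    {
        return r.livenessTimestamp != NO_TIMESTAMP || r.value != null;
    }

    public static void main(String[] args)
    {
        // Node 1 reading both SSTables: liveness from the INSERT (1000), v = 2 from the UPDATE (2000).
        Response fullRead = new Response(1000L, 2, 2000L);
        // Node 1 stopping at the SSTable holding the UPDATE: the INSERT's liveness is never read.
        Response earlyStop = new Response(NO_TIMESTAMP, 2, 2000L);
        // Node 2 only holds the deletion of v (3000).
        Response deletion = new Response(NO_TIMESTAMP, null, 3000L);

        System.out.println(rowExists(merge(fullRead, deletion)));  // row kept, v = null
        System.out.println(rowExists(merge(earlyStop, deletion))); // row lost
    }
}
```

In this model the full read keeps the row alive through its primary key liveness, while the early-stop read leaves nothing to keep the row once the deletion shadows {{v}}.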

The problem can be reproduced with the following test:
{code}
    @Test
    public void testSelectWithUpdatedColumnOnOneNodeAndColumnDeletionOnTheOther() throws Throwable
    {
        try (Cluster cluster = init(builder().withNodes(2).start()))
        {
            cluster.schemaChange(withKeyspace("CREATE TABLE %s.tbl (pk int, ck text, v int, PRIMARY KEY (pk, ck))"));
            cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.tbl (pk, ck, v) VALUES (1, '1', 1) USING TIMESTAMP 1000"));
            cluster.get(1).flush(KEYSPACE);
            cluster.get(1).executeInternal(withKeyspace("UPDATE %s.tbl USING TIMESTAMP 2000 SET v = 2 WHERE pk = 1 AND ck = '1'"));
            cluster.get(1).flush(KEYSPACE);

            cluster.get(2).executeInternal(withKeyspace("DELETE v FROM %s.tbl USING TIMESTAMP 3000 WHERE pk=1 AND ck='1'"));
            cluster.get(2).flush(KEYSPACE);

            assertRows(cluster.coordinator(2).execute(withKeyspace("SELECT * FROM %s.tbl WHERE pk=1 AND ck='1'"), ConsistencyLevel.ALL),
                       row(1, "1", null));
            assertRows(cluster.coordinator(2).execute(withKeyspace("SELECT v FROM %s.tbl WHERE pk=1 AND ck='1'"), ConsistencyLevel.ALL),
                       row((Integer) null));

        }
    }
{code}

 cc: [~maedhroz], [~ifesdjeen]




