You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@cassandra.apache.org by GitBox <gi...@apache.org> on 2020/11/17 00:33:29 UTC

[GitHub] [cassandra] maedhroz opened a new pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

maedhroz opened a new pull request #823:
URL: https://github.com/apache/cassandra/pull/823


   Check primary key liveness information only if it exists, and fall back to checking cell contents, which makes skipping possible for COMPACT STORAGE tables after and upgrade to 3.0+


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r525275385



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -133,4 +134,80 @@ public void testSinglePartitionQuery() throws Throwable
         assertInvalidMessage("ORDER BY is only supported when the partition key is restricted by an EQ or an IN",
                              "SELECT * FROM %s WHERE token(pk) = token(1) ORDER BY C DESC");
     }
+
+    @Test
+    public void testCompactTableSkipping() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1') USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 50, '2') USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 100, '3') USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 2, '4') USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 51, '5') USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 101, '6') USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+    }
+
+    @Test
+    public void testCompactTableSkippingPkOnly() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 1) USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 50) USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 100) USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 2) USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 101) USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+    }
+
+    @Test
+    public void testCompactTableCellDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+        
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("DELETE v FROM %s WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2);

Review comment:
       Even after dropping compact storage, there is simply no primary key liveness info, and `transform.Filter` will purge the row via `BTreeRow#purge()`. More commentary on this in the Jira.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r524940178



##########
File path: src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java
##########
@@ -1042,16 +1042,26 @@ private boolean canRemoveRow(Row row, Columns requestedColumns, long sstableTime
         // We can remove a row if it has data that is more recent that the next sstable to consider for the data that the query
         // cares about. And the data we care about is 1) the row timestamp (since every query cares if the row exists or not)
         // and 2) the requested columns.
-        if (row.primaryKeyLivenessInfo().isEmpty() || row.primaryKeyLivenessInfo().timestamp() <= sstableTimestamp)
+        
+        // Note that compact tables will always have an empty primary key liveness info. 
+        if (!row.primaryKeyLivenessInfo().isEmpty() && row.primaryKeyLivenessInfo().timestamp() <= sstableTimestamp) 
             return false;
-
+        
+        boolean hasLiveCell = false;
+        
         for (ColumnDefinition column : requestedColumns)
         {
             Cell cell = row.getCell(column);
-            if (cell == null || cell.timestamp() <= sstableTimestamp)
+            
+            if (cell == null || cell.timestamp() <= sstableTimestamp) 
                 return false;
+            
+            if (!cell.isTombstone()) 
+                hasLiveCell = true;
         }
-        return true;
+
+        // If we've gotten here w/ a compact table or at least one non-tombstone cell, it's safe to remove the row.
+        return hasLiveCell || !metadata().isCQLTable();

Review comment:
       After dropping compact storage, we may have to look at more SSTable than logically necessary, but for tables that never used compact storage, we have to look at older SSTables for primary key liveness info if all we have here are tombstones. Not looking at the older ones would mean we propagate empty primary key liveness info into the response creation, which would cause us to return zero rows instead of rows w/ null cell values.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] michaelsembwever commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
michaelsembwever commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r544997081



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -133,4 +134,80 @@ public void testSinglePartitionQuery() throws Throwable
         assertInvalidMessage("ORDER BY is only supported when the partition key is restricted by an EQ or an IN",
                              "SELECT * FROM %s WHERE token(pk) = token(1) ORDER BY C DESC");
     }
+
+    @Test
+    public void testCompactTableSkipping() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1') USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 50, '2') USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 100, '3') USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 2, '4') USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 51, '5') USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 101, '6') USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+    }
+
+    @Test
+    public void testCompactTableSkippingPkOnly() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 1) USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 50) USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 100) USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 2) USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 101) USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+    }
+
+    @Test
+    public void testCompactTableCellDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+        
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("DELETE v FROM %s WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2);

Review comment:
       How come this changed to only need to read one sstable in trunk? ref https://github.com/apache/cassandra/pull/854/files#diff-db622d4d6661216853fe50af2292286c2296643a4d50d486dee6b39b540eafd5R643




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r524940178



##########
File path: src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java
##########
@@ -1042,16 +1042,26 @@ private boolean canRemoveRow(Row row, Columns requestedColumns, long sstableTime
         // We can remove a row if it has data that is more recent that the next sstable to consider for the data that the query
         // cares about. And the data we care about is 1) the row timestamp (since every query cares if the row exists or not)
         // and 2) the requested columns.
-        if (row.primaryKeyLivenessInfo().isEmpty() || row.primaryKeyLivenessInfo().timestamp() <= sstableTimestamp)
+        
+        // Note that compact tables will always have an empty primary key liveness info. 
+        if (!row.primaryKeyLivenessInfo().isEmpty() && row.primaryKeyLivenessInfo().timestamp() <= sstableTimestamp) 
             return false;
-
+        
+        boolean hasLiveCell = false;
+        
         for (ColumnDefinition column : requestedColumns)
         {
             Cell cell = row.getCell(column);
-            if (cell == null || cell.timestamp() <= sstableTimestamp)
+            
+            if (cell == null || cell.timestamp() <= sstableTimestamp) 
                 return false;
+            
+            if (!cell.isTombstone()) 
+                hasLiveCell = true;
         }
-        return true;
+
+        // If we've gotten here w/ a compact table or at least one non-tombstone cell, it's safe to remove the row.
+        return hasLiveCell || !metadata().isCQLTable();

Review comment:
       After dropping compact storage, we may have to look at more SSTable than logically necessary, but for tables that never used compact storage, we have to look at older SSTables for primary key liveness info if all we have here are tombstones. Not looking at the older ones would mean we propagate empty primary key liveness info into the response creation, which would cause us to return zero rows instead of rows w/ null cell values representing tombstones.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r524808189



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -133,4 +134,46 @@ public void testSinglePartitionQuery() throws Throwable
         assertInvalidMessage("ORDER BY is only supported when the partition key is restricted by an EQ or an IN",
                              "SELECT * FROM %s WHERE token(pk) = token(1) ORDER BY C DESC");
     }
+
+    @Test
+    public void testCompactTableSkipping() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1') USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 50, '2') USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 100, '3') USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 2, '4') USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 51, '5') USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 101, '6') USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+    }
+
+    @Test
+    public void testCompactTableSkippingPkOnly() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 1) USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 50) USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 100) USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 2) USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 101) USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+    }

Review comment:
       TODO: More tests around the expected number of SSTables hit for deletion and update cases for both compact tables.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] michaelsembwever commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
michaelsembwever commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r544990034



##########
File path: test/distributed/org/apache/cassandra/distributed/upgrade/CompactStorage2to3UpgradeTest.java
##########
@@ -29,7 +29,10 @@
 import org.apache.cassandra.distributed.api.IMessageFilters;
 import org.apache.cassandra.distributed.api.NodeToolResult;
 import org.apache.cassandra.distributed.shared.Versions;
-import static org.apache.cassandra.distributed.shared.AssertUtils.*;
+
+import static org.apache.cassandra.distributed.shared.AssertUtils.assertRows;
+import static org.apache.cassandra.distributed.shared.AssertUtils.row;
+import static org.junit.Assert.assertEquals;

Review comment:
       nit: can we remove this, it's outside of patch's scope.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r524809272



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -108,10 +109,10 @@ public void testSinglePartitionQuery() throws Throwable
         executeAndCheck("SELECT * FROM %s WHERE pk = 2 AND c > 20 ORDER BY c DESC", 2,
                         row(2, 40, "42"));
 
-        // Test with only 2 of the 3 SSTables being merged and a Name filter
+        // Test with only 1 of the 3 SSTables being merged and a Name filter
         // This test checks the SinglePartitionReadCommand::queryMemtableAndSSTablesInTimestampOrder which is only
         // used for ClusteringIndexNamesFilter when there are no multi-cell columns
-        executeAndCheck("SELECT * FROM %s WHERE pk = 2 AND c = 10", 2,
+        executeAndCheck("SELECT * FROM %s WHERE pk = 2 AND c = 10", 1,

Review comment:
       It doesn't look like there was previously any logical reason to hit more than the latest SSTable here. The update completely specifies the row.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] michaelsembwever commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
michaelsembwever commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r544991255



##########
File path: test/distributed/org/apache/cassandra/distributed/upgrade/CompactStorage2to3UpgradeTest.java
##########
@@ -314,7 +317,6 @@ public void testDropCompactWithClusteringAndValueColumnWithDeletesAndWrites() th
                 }).run();
     }
 
-

Review comment:
       nit: can we remove this, it's outside of patch's scope.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r545267098



##########
File path: src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java
##########
@@ -1042,16 +1042,26 @@ private boolean canRemoveRow(Row row, Columns requestedColumns, long sstableTime
         // We can remove a row if it has data that is more recent that the next sstable to consider for the data that the query
         // cares about. And the data we care about is 1) the row timestamp (since every query cares if the row exists or not)
         // and 2) the requested columns.
-        if (row.primaryKeyLivenessInfo().isEmpty() || row.primaryKeyLivenessInfo().timestamp() <= sstableTimestamp)
+        
+        // Note that compact tables will always have an empty primary key liveness info. 
+        if (!row.primaryKeyLivenessInfo().isEmpty() && row.primaryKeyLivenessInfo().timestamp() <= sstableTimestamp) 
             return false;
-
+        
+        boolean hasLiveCell = false;
+        
         for (ColumnDefinition column : requestedColumns)
         {
             Cell cell = row.getCell(column);
-            if (cell == null || cell.timestamp() <= sstableTimestamp)
+            
+            if (cell == null || cell.timestamp() <= sstableTimestamp) 
                 return false;
+            
+            if (!cell.isTombstone()) 
+                hasLiveCell = true;
         }
-        return true;
+
+        // If we've gotten here w/ a compact table or at least one non-tombstone cell, it's safe to remove the row.
+        return hasLiveCell || !metadata().isCQLTable();

Review comment:
       @michaelsembwever The easiest way to see this is to change this line to `return true;` and then run `operations.UpgradeTest` and `operations.DeleteTest`. You'll see failures around the issue above. Switching to `return hasLiveCell;` will eliminate that particular problem. Then run `SSTablesIteratedTest`, and you'll see one failure around `testCompactTableCellDeletion` hitting too many SSTables before DROP CS blows away the `isDense` flag. Adding `|| !metadata().isCQLTable()` resolves that. (i.e. When the table is compact, returning no rows rather than is perfectly acceptable. See the [comment below](https://github.com/apache/cassandra/pull/823/files?file-filters%5B%5D=.java#r525275385).)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] michaelsembwever commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
michaelsembwever commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r545035546



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -133,4 +134,108 @@ public void testSinglePartitionQuery() throws Throwable
         assertInvalidMessage("ORDER BY is only supported when the partition key is restricted by an EQ or an IN",
                              "SELECT * FROM %s WHERE token(pk) = token(1) ORDER BY C DESC");
     }
+
+    @Test
+    public void testNonCompactTableRowDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck))");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("DELETE FROM %s WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2);
+    }
+
+    @Test
+    public void testNonCompactTableRangeDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (a int, b int, c int, d int, PRIMARY KEY (a, b, c))");
+        
+        execute("INSERT INTO %s (a, b, c, d) VALUES (?, ?, ?, ?)", 1, 1, 1, 1);
+        flush();
+        
+        execute("DELETE FROM %s WHERE a=? AND b=?", 1, 1);
+        flush();
+        
+        executeAndCheck("SELECT * FROM %s WHERE a=1 AND b=1 AND c=1", 2);
+    }
+
+    @Test
+    public void testCompactTableSkipping() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1') USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 50, '2') USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 100, '3') USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 2, '4') USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 51, '5') USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 101, '6') USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+    }
+
+    @Test
+    public void testCompactTableSkippingPkOnly() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 1) USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 50) USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 100) USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 2) USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 101) USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+    }
+
+    @Test
+    public void testCompactTableCellDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+        
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("DELETE v FROM %s WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2);
+    }
+
+    @Test
+    public void testCompactTableCellUpdate() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("UPDATE %s SET v = '2' WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1, row(1, 1, "2"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1, row(1, 1, "2"));
+    }

Review comment:
       what about the additional tests? do any of these offer any value (i haven't spent much time optimising)?
   ```
   
   
       @Test
       public void testCompactTableDeleteOverlappingSSTables() throws Throwable
       {
           createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
   
           execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 1000002");
           flush();
           execute("DELETE FROM %s WHERE pk = 1 AND ck = 51");
           flush();
   
           execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 1000001");
           execute("INSERT INTO %s (pk, ck) VALUES (2, 51)");
           flush();
   
           executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 2);
   
           execute("ALTER TABLE %s DROP COMPACT STORAGE");
           executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 2);
       }
   
       @Test
       public void testCompactTableRowDeletion() throws Throwable
       {
           createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
   
           execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
           flush();
   
           execute("DELETE FROM %s WHERE pk = 1 AND ck = 1");
           flush();
   
           executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
   
           execute("ALTER TABLE %s DROP COMPACT STORAGE");
           executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
       }
   
       @Test
       public void testCompactTableRangeDeletion() throws Throwable
       {
           createTable("CREATE TABLE %s (a int, b int, c int, d int, PRIMARY KEY (a, b, c)) WITH COMPACT STORAGE");
   
           execute("INSERT INTO %s (a, b, c, d) VALUES (?, ?, ?, ?)", 1, 1, 1, 1);
           execute("INSERT INTO %s (a, b, c, d) VALUES (?, ?, ?, ?)", 1, 1, 2, 1);
           execute("INSERT INTO %s (a, b, c, d) VALUES (?, ?, ?, ?)", 1, 2, 1, 1);
           flush();
   
           execute("DELETE FROM %s WHERE a=? AND b=?", 1, 1);
           flush();
   
           executeAndCheck("SELECT * FROM %s WHERE a=1 AND b=1 AND c=1", 2);
   
           execute("ALTER TABLE %s DROP COMPACT STORAGE");
           executeAndCheck("SELECT * FROM %s WHERE a=1 AND b=1 AND c=1", 2);
       }
   
   
       @Test
       public void testCompactTableRangeOverRowDeletion() throws Throwable
       {
           createTable("CREATE TABLE %s (a int, b int, c int, d int, PRIMARY KEY (a, b, c)) WITH COMPACT STORAGE");
   
           execute("INSERT INTO %s (a, b, c, d) VALUES (?, ?, ?, ?)", 1, 1, 1, 1);
           execute("INSERT INTO %s (a, b, c, d) VALUES (?, ?, ?, ?)", 1, 1, 2, 1);
           execute("INSERT INTO %s (a, b, c, d) VALUES (?, ?, ?, ?)", 1, 2, 1, 1);
           flush();
   
           execute("DELETE FROM %s WHERE a=? AND b=? AND c=?", 1, 1, 1);
           flush();
   
           execute("DELETE FROM %s WHERE a=? AND b=?", 1, 1);
           flush();
   
           executeAndCheck("SELECT * FROM %s WHERE a=1 AND b=1 AND c=1", 3);
   
           execute("ALTER TABLE %s DROP COMPACT STORAGE");
           executeAndCheck("SELECT * FROM %s WHERE a=1 AND b=1 AND c=1", 3);
       }
   
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] michaelsembwever commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
michaelsembwever commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r545049849



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -133,4 +134,80 @@ public void testSinglePartitionQuery() throws Throwable
         assertInvalidMessage("ORDER BY is only supported when the partition key is restricted by an EQ or an IN",
                              "SELECT * FROM %s WHERE token(pk) = token(1) ORDER BY C DESC");
     }
+
+    @Test
+    public void testCompactTableSkipping() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1') USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 50, '2') USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 100, '3') USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 2, '4') USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 51, '5') USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 101, '6') USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+    }
+
+    @Test
+    public void testCompactTableSkippingPkOnly() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 1) USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 50) USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 100) USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 2) USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 101) USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+    }
+
+    @Test
+    public void testCompactTableCellDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+        
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("DELETE v FROM %s WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2);

Review comment:
       And, from the ticket…
   
   > One drawback of this approach is that when a compact table with existing SSTables has compact storage dropped, those SSTables still do not contain primary key liveness info, and therefore may continue to return zero rows rather than rows w/ primary keys and null regular columns.
   
   If I've understood that correctly, compact tables before or after a drop compact, will not return rows w/ primary keys and null regular columns. But non-compact tables can.
   
   For example, if we add the following test
   ```
       @Test
       public void testNonCompactTableCellsDeletion() throws Throwable
       {
           createTable("CREATE TABLE %s (pk int, ck int, v1 text, v2 text, PRIMARY KEY (pk, ck))");
   
           execute("INSERT INTO %s (pk, ck, v1, v2) VALUES (1, 1, '1', '1')");
           flush();
   
           execute("DELETE v1 FROM %s WHERE pk = 1 AND ck = 1");
           execute("DELETE v2 FROM %s WHERE pk = 1 AND ck = 1");
           flush();
   
           executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2, row(1, 1, null, null));
       }
   ```
   
   I'm also curious as to how schema changes impact all of this on drop compact un-upgraded sstables.
   For example
   ```
       @Test
       public void testCompactTableCellDeletionAddColumn() throws Throwable
       {
           createTable("CREATE TABLE %s (pk int, ck int, v1 text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
   
           execute("INSERT INTO %s (pk, ck, v1) VALUES (1, 1, '1')");
           flush();
   
           execute("DELETE v1 FROM %s WHERE pk = 1 AND ck = 1");
           flush();
   
           executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
   
           execute("ALTER TABLE %s DROP COMPACT STORAGE");
           executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
   
           execute("ALTER TABLE %s ADD v2 text");
           execute("DELETE v2 FROM %s WHERE pk = 1 AND ck = 1");
           flush();
   
           executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 3);
       }
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r524808189



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -133,4 +134,46 @@ public void testSinglePartitionQuery() throws Throwable
         assertInvalidMessage("ORDER BY is only supported when the partition key is restricted by an EQ or an IN",
                              "SELECT * FROM %s WHERE token(pk) = token(1) ORDER BY C DESC");
     }
+
+    @Test
+    public void testCompactTableSkipping() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1') USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 50, '2') USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 100, '3') USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 2, '4') USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 51, '5') USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 101, '6') USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+    }
+
+    @Test
+    public void testCompactTableSkippingPkOnly() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 1) USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 50) USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 100) USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 2) USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 101) USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+    }

Review comment:
       TODO: More tests around the expected number of SSTables hit for deletion and update cases for both compact tables.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] michaelsembwever commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
michaelsembwever commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r545050395



##########
File path: src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java
##########
@@ -1042,16 +1042,26 @@ private boolean canRemoveRow(Row row, Columns requestedColumns, long sstableTime
         // We can remove a row if it has data that is more recent that the next sstable to consider for the data that the query
         // cares about. And the data we care about is 1) the row timestamp (since every query cares if the row exists or not)
         // and 2) the requested columns.
-        if (row.primaryKeyLivenessInfo().isEmpty() || row.primaryKeyLivenessInfo().timestamp() <= sstableTimestamp)
+        
+        // Note that compact tables will always have an empty primary key liveness info. 
+        if (!row.primaryKeyLivenessInfo().isEmpty() && row.primaryKeyLivenessInfo().timestamp() <= sstableTimestamp) 
             return false;
-
+        
+        boolean hasLiveCell = false;
+        
         for (ColumnDefinition column : requestedColumns)
         {
             Cell cell = row.getCell(column);
-            if (cell == null || cell.timestamp() <= sstableTimestamp)
+            
+            if (cell == null || cell.timestamp() <= sstableTimestamp) 
                 return false;
+            
+            if (!cell.isTombstone()) 
+                hasLiveCell = true;
         }
-        return true;
+
+        // If we've gotten here w/ a compact table or at least one non-tombstone cell, it's safe to remove the row.
+        return hasLiveCell || !metadata().isCQLTable();

Review comment:
       I'm struggling to read which of the unit tests capture that^. Do any of them?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] smiklosovic closed pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
smiklosovic closed pull request #823:
URL: https://github.com/apache/cassandra/pull/823


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r545267098



##########
File path: src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java
##########
@@ -1042,16 +1042,26 @@ private boolean canRemoveRow(Row row, Columns requestedColumns, long sstableTime
         // We can remove a row if it has data that is more recent that the next sstable to consider for the data that the query
         // cares about. And the data we care about is 1) the row timestamp (since every query cares if the row exists or not)
         // and 2) the requested columns.
-        if (row.primaryKeyLivenessInfo().isEmpty() || row.primaryKeyLivenessInfo().timestamp() <= sstableTimestamp)
+        
+        // Note that compact tables will always have an empty primary key liveness info. 
+        if (!row.primaryKeyLivenessInfo().isEmpty() && row.primaryKeyLivenessInfo().timestamp() <= sstableTimestamp) 
             return false;
-
+        
+        boolean hasLiveCell = false;
+        
         for (ColumnDefinition column : requestedColumns)
         {
             Cell cell = row.getCell(column);
-            if (cell == null || cell.timestamp() <= sstableTimestamp)
+            
+            if (cell == null || cell.timestamp() <= sstableTimestamp) 
                 return false;
+            
+            if (!cell.isTombstone()) 
+                hasLiveCell = true;
         }
-        return true;
+
+        // If we've gotten here w/ a compact table or at least one non-tombstone cell, it's safe to remove the row.
+        return hasLiveCell || !metadata().isCQLTable();

Review comment:
       @michaelsembwever The easiest way to see this is to change this line to `return true;` and then run `operations.UpgradeTest` and `operations.DeleteTest`. You'll see failures around the issue above. Switching to `return hasLiveCell;` will eliminate that particular problem. Then run `SSTablesIteratedTest`, and you'll see one failure around `testCompactTableCellDeletion` hitting too many SSTables before DROP CS blows away the `isDense` flag. Adding `|| !metadata().isCQLTable()` resolves that. (i.e. When the table is compact, returning no rows rather than is perfectly acceptable. See the [comment below](https://github.com/apache/cassandra/pull/823/files?file-filters%5B%5D=.java#r525275385).) Once we DROP CS, we no longer can take advantage of that hint.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r524810301



##########
File path: src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java
##########
@@ -1042,15 +1042,28 @@ private boolean canRemoveRow(Row row, Columns requestedColumns, long sstableTime
         // We can remove a row if it has data that is more recent that the next sstable to consider for the data that the query
         // cares about. And the data we care about is 1) the row timestamp (since every query cares if the row exists or not)
         // and 2) the requested columns.
-        if (row.primaryKeyLivenessInfo().isEmpty() || row.primaryKeyLivenessInfo().timestamp() <= sstableTimestamp)
+        
+        boolean isLivenessInfoEmpty = row.primaryKeyLivenessInfo().isEmpty();
+
+        // Compact tables have no concept of primary key liveness info, so this will never be true for them: 
+        if (!isLivenessInfoEmpty && row.primaryKeyLivenessInfo().timestamp() <= sstableTimestamp)
+        {
             return false;
+        }
+        
+        boolean isCQL = metadata().isCQLTable();
 
         for (ColumnDefinition column : requestedColumns)
         {
             Cell cell = row.getCell(column);
-            if (cell == null || cell.timestamp() <= sstableTimestamp)
+            
+            // Non-compact tables with tombstones may need to look at further SSTables to find a live primary key:
+            if (cell == null || cell.timestamp() <= sstableTimestamp || (isCQL && isLivenessInfoEmpty && cell.isTombstone()))

Review comment:
       TODO: This may not be optimal for UPDATE. (i.e. We might still be able to skip the SSTable as long as there is at least some non-tombstone cell.)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r525275385



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -133,4 +134,80 @@ public void testSinglePartitionQuery() throws Throwable
         assertInvalidMessage("ORDER BY is only supported when the partition key is restricted by an EQ or an IN",
                              "SELECT * FROM %s WHERE token(pk) = token(1) ORDER BY C DESC");
     }
+
+    @Test
+    public void testCompactTableSkipping() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1') USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 50, '2') USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 100, '3') USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 2, '4') USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 51, '5') USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 101, '6') USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+    }
+
+    @Test
+    public void testCompactTableSkippingPkOnly() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 1) USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 50) USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 100) USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 2) USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 101) USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+    }
+
+    @Test
+    public void testCompactTableCellDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+        
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("DELETE v FROM %s WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2);

Review comment:
       Even after dropping compact storage, there is simply no primary key liveness info, and `transform.Filter` will purge the row via `BTreeRow#purge()`. More commentary on this in the Jira. (Also should probably inline a comment here...)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on pull request #823:
URL: https://github.com/apache/cassandra/pull/823#issuecomment-747623339


   @michaelsembwever I've added your new tests (and one more of my own) in a follow-up commit.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r525275385



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -133,4 +134,80 @@ public void testSinglePartitionQuery() throws Throwable
         assertInvalidMessage("ORDER BY is only supported when the partition key is restricted by an EQ or an IN",
                              "SELECT * FROM %s WHERE token(pk) = token(1) ORDER BY C DESC");
     }
+
+    @Test
+    public void testCompactTableSkipping() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1') USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 50, '2') USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 100, '3') USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 2, '4') USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 51, '5') USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 101, '6') USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+    }
+
+    @Test
+    public void testCompactTableSkippingPkOnly() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 1) USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 50) USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 100) USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 2) USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 101) USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+    }
+
+    @Test
+    public void testCompactTableCellDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+        
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("DELETE v FROM %s WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2);

Review comment:
       Even after dropping compact storage, there is simply no primary key liveness info, and `transform.Filter` will purge the row via `BTreeRow#purge()`. More commentary on this in the Jira. (Also should probably inline a comment here before committing...)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] michaelsembwever commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
michaelsembwever commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r545049849



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -133,4 +134,80 @@ public void testSinglePartitionQuery() throws Throwable
         assertInvalidMessage("ORDER BY is only supported when the partition key is restricted by an EQ or an IN",
                              "SELECT * FROM %s WHERE token(pk) = token(1) ORDER BY C DESC");
     }
+
+    @Test
+    public void testCompactTableSkipping() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1') USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 50, '2') USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 100, '3') USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 2, '4') USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 51, '5') USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 101, '6') USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+    }
+
+    @Test
+    public void testCompactTableSkippingPkOnly() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 1) USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 50) USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 100) USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 2) USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 101) USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+    }
+
+    @Test
+    public void testCompactTableCellDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+        
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("DELETE v FROM %s WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2);

Review comment:
       And, from the ticket…
   
   > One drawback of this approach is that when a compact table with existing SSTables has compact storage dropped, those SSTables still do not contain primary key liveness info, and therefore may continue to return zero rows rather than rows w/ primary keys and null regular columns.
   
   If I've understood that correctly, compact tables before or after a drop compact, will not return rows w/ primary keys and null regular columns. But non-compact tables can.
   
   For example, if we add the following test
   ```
       @Test
       public void testNonCompactTableCellsDeletion() throws Throwable
       {
           createTable("CREATE TABLE %s (pk int, ck int, v1 text, v2 text, PRIMARY KEY (pk, ck))");
   
           execute("INSERT INTO %s (pk, ck, v1, v2) VALUES (1, 1, '1', '1')");
           flush();
   
           execute("DELETE v1 FROM %s WHERE pk = 1 AND ck = 1");
           execute("DELETE v2 FROM %s WHERE pk = 1 AND ck = 1");
           flush();
   
           executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2, row(1, 1, null, null));
       }
   ```
   
   I'm also curious as to how schema changes impact all of this on drop compact un-upgraded sstables.
   For example
   ```
   
   
       @Test
       public void testCompactTableCellDeletionAddColumn() throws Throwable
       {
           createTable("CREATE TABLE %s (pk int, ck int, v1 text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
   
           execute("INSERT INTO %s (pk, ck, v1) VALUES (1, 1, '1')");
           flush();
   
           execute("DELETE v1 FROM %s WHERE pk = 1 AND ck = 1");
           flush();
   
           executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
   
           execute("ALTER TABLE %s DROP COMPACT STORAGE");
           executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
   
           execute("ALTER TABLE %s ADD v2 text");
           execute("DELETE v2 FROM %s WHERE pk = 1 AND ck = 1");
           flush();
   
           executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 3);
       }
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r545312688



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -133,4 +134,108 @@ public void testSinglePartitionQuery() throws Throwable
         assertInvalidMessage("ORDER BY is only supported when the partition key is restricted by an EQ or an IN",
                              "SELECT * FROM %s WHERE token(pk) = token(1) ORDER BY C DESC");
     }
+
+    @Test
+    public void testNonCompactTableRowDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck))");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("DELETE FROM %s WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2);
+    }
+
+    @Test
+    public void testNonCompactTableRangeDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (a int, b int, c int, d int, PRIMARY KEY (a, b, c))");
+        
+        execute("INSERT INTO %s (a, b, c, d) VALUES (?, ?, ?, ?)", 1, 1, 1, 1);
+        flush();
+        
+        execute("DELETE FROM %s WHERE a=? AND b=?", 1, 1);
+        flush();
+        
+        executeAndCheck("SELECT * FROM %s WHERE a=1 AND b=1 AND c=1", 2);
+    }
+
+    @Test
+    public void testCompactTableSkipping() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1') USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 50, '2') USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 100, '3') USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 2, '4') USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 51, '5') USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 101, '6') USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+    }
+
+    @Test
+    public void testCompactTableSkippingPkOnly() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 1) USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 50) USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 100) USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 2) USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 101) USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+    }
+
+    @Test
+    public void testCompactTableCellDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+        
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("DELETE v FROM %s WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2);
+    }
+
+    @Test
+    public void testCompactTableCellUpdate() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("UPDATE %s SET v = '2' WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1, row(1, 1, "2"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1, row(1, 1, "2"));
+    }

Review comment:
       `testCompactTableDeleteOverlappingSSTables` is a more complex version of `testCompactTableCellDeletion`, and it's especially interesting because of the totally useless third SSTable that we have to look at because of the timestamp of the write on the row we're not actually querying. (It also forces us to hit all 3 SSTables after dropping compact storage, rather than just two.) I'll throw an inline comment around that.
   
   `testCompactTableRowDeletion` is nice for completeness, although it also needs to bump the number of expected SSTables hit post-DROP CS.
   
   `testCompactTableRangeDeletion` is also good for completeness, and illustrates that even compact tables have never been able to short-circuit when the delete isn't manifest as a cell tombstone.
   
   `testCompactTableRangeOverRowDeletion` helps illustrate that an older row deletion will be subsumed by the newer range delete and won't factor into SSTable skipping decisions. I've added a complimentary test, `testCompactTableRowOverRangeDeletion` that switches the temporal order of the range and row deletes and only has to hit one SSTable rather than three because of this patch.
   
   Thanks for outlining these cases!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r545235500



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -133,4 +134,80 @@ public void testSinglePartitionQuery() throws Throwable
         assertInvalidMessage("ORDER BY is only supported when the partition key is restricted by an EQ or an IN",
                              "SELECT * FROM %s WHERE token(pk) = token(1) ORDER BY C DESC");
     }
+
+    @Test
+    public void testCompactTableSkipping() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1') USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 50, '2') USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 100, '3') USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 2, '4') USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 51, '5') USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 101, '6') USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+    }
+
+    @Test
+    public void testCompactTableSkippingPkOnly() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 1) USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 50) USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 100) USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 2) USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 101) USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+    }
+
+    @Test
+    public void testCompactTableCellDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+        
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("DELETE v FROM %s WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2);

Review comment:
       > How come this changed to only need to read one sstable in trunk?
   
   @michaelsembwever Ah, I should have looked at that one more closely. On trunk, if you comment out the first `executeAndCheck()` execution, the test behaves in line with 3.0 and 3.11. It looks like what's happening is that the prepared statement cache isn't invalidated when DROP COMPACT STORAGE executes on the table and switches the table metadata type. I'm following up w/ @ifesdjeen on that. In the meantime, I'll focus on addressing the other issues in the 3.0 and 3.11 versions of the patch here...




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r545270679



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -133,4 +134,80 @@ public void testSinglePartitionQuery() throws Throwable
         assertInvalidMessage("ORDER BY is only supported when the partition key is restricted by an EQ or an IN",
                              "SELECT * FROM %s WHERE token(pk) = token(1) ORDER BY C DESC");
     }
+
+    @Test
+    public void testCompactTableSkipping() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1') USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 50, '2') USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 100, '3') USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 2, '4') USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 51, '5') USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 101, '6') USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+    }
+
+    @Test
+    public void testCompactTableSkippingPkOnly() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 1) USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 50) USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 100) USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 2) USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 101) USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+    }
+
+    @Test
+    public void testCompactTableCellDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+        
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("DELETE v FROM %s WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2);

Review comment:
       > If I've understood that correctly, compact tables before or after a drop compact, will not return rows w/ primary keys and null regular columns. But non-compact tables can.
   
   > For example, if we add the following test
   
   Right, and I've added that test. There are also examples of this behavior in `operations.DeleteTest` and `operations.UpdateTest`, which you can see failing if you walk through the [steps above](https://github.com/apache/cassandra/pull/823/files?file-filters%5B%5D=.java#r545267098).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org


[GitHub] [cassandra] maedhroz commented on a change in pull request #823: CASSANDRA-16226 COMPACT STORAGE SSTables created before 3.0 are not correctly skipped by timestamp due to missing primary key liveness info

Posted by GitBox <gi...@apache.org>.
maedhroz commented on a change in pull request #823:
URL: https://github.com/apache/cassandra/pull/823#discussion_r545235500



##########
File path: test/unit/org/apache/cassandra/cql3/validation/miscellaneous/SSTablesIteratedTest.java
##########
@@ -133,4 +134,80 @@ public void testSinglePartitionQuery() throws Throwable
         assertInvalidMessage("ORDER BY is only supported when the partition key is restricted by an EQ or an IN",
                              "SELECT * FROM %s WHERE token(pk) = token(1) ORDER BY C DESC");
     }
+
+    @Test
+    public void testCompactTableSkipping() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1') USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 50, '2') USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 100, '3') USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 2, '4') USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 51, '5') USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 101, '6') USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51, "5"));
+    }
+
+    @Test
+    public void testCompactTableSkippingPkOnly() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 1) USING TIMESTAMP 1000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 50) USING TIMESTAMP 1000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 100) USING TIMESTAMP 1000002");
+        flush();
+
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 2) USING TIMESTAMP 2000000");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 51) USING TIMESTAMP 2000001");
+        execute("INSERT INTO %s (pk, ck) VALUES (1, 101) USING TIMESTAMP 2000002");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 51", 1, row(1, 51));
+    }
+
+    @Test
+    public void testCompactTableCellDeletion() throws Throwable
+    {
+        createTable("CREATE TABLE %s (pk int, ck int, v text, PRIMARY KEY (pk, ck)) WITH COMPACT STORAGE");
+        
+        execute("INSERT INTO %s (pk, ck, v) VALUES (1, 1, '1')");
+        flush();
+
+        execute("DELETE v FROM %s WHERE pk = 1 AND ck = 1");
+        flush();
+
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 1);
+
+        execute("ALTER TABLE %s DROP COMPACT STORAGE");
+        executeAndCheck("SELECT * FROM %s WHERE pk = 1 AND ck = 1", 2);

Review comment:
       > How come this changed to only need to read one sstable in trunk?
   
   @michaelsembwever Ah, I should have looked at that one more closely. On trunk, if you comment out the first `executeAndCheck()` execution, the test behaves in line with 3.0 and 3.11. It looks like what's happening is that the prepared statement cache isn't invalidated when DROP COMPACT STORAGE executes on the table and switches the table metadata type. (The `isDense` flag is updated in-place on 3.0 and 3.11.) I'm following up w/ @ifesdjeen on that. In the meantime, I'll focus on addressing the other issues in the 3.0 and 3.11 versions of the patch here...




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org