You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@cassandra.apache.org by GitBox <gi...@apache.org> on 2020/05/15 16:32:02 UTC

[GitHub] [cassandra-dtest] adelapena commented on a change in pull request #69: CASSANDRA-13606 Added some log checks on idxs management

adelapena commented on a change in pull request #69:
URL: https://github.com/apache/cassandra-dtest/pull/69#discussion_r425230720



##########
File path: secondary_indexes_test.py
##########
@@ -345,6 +345,8 @@ def test_failing_manual_rebuild_index(self):
         # Verify that the index is marked as built and it can answer queries
         assert_one(session, """SELECT table_name, index_name FROM system."IndexInfo" WHERE table_name='k'""", ['k', 'idx'])
         assert_one(session, "SELECT * FROM k.t WHERE v = 1", [0, 1])
+        assert 1 == len(node.grep_log('became queryable'))

Review comment:
       Just mentioning that we have a `assert_length_equal` method in `tools.assertions` providing some pretty printing. I don't have any preference for that one over plain `assert`, so happy to keep it this way.

##########
File path: secondary_indexes_test.py
##########
@@ -371,6 +374,8 @@ def test_failing_manual_rebuild_index(self):
         assert before_files != after_files
         assert_one(session, """SELECT table_name, index_name FROM system."IndexInfo" WHERE table_name='k'""", ['k', 'idx'])
         assert_one(session, "SELECT * FROM k.t WHERE v = 1", [0, 1])
+        assert 2 == len(node.grep_log('became queryable'))

Review comment:
       We can use log marks to just read the fragment of the log file we are interested in:
   ```python
   mark = node.mark_log()
   # play with index
   assert 1 == len(node.grep_log('became queryable', from_mark=mark))
   ```

##########
File path: secondary_indexes_test.py
##########
@@ -345,6 +345,8 @@ def test_failing_manual_rebuild_index(self):
         # Verify that the index is marked as built and it can answer queries
         assert_one(session, """SELECT table_name, index_name FROM system."IndexInfo" WHERE table_name='k'""", ['k', 'idx'])
         assert_one(session, "SELECT * FROM k.t WHERE v = 1", [0, 1])
+        assert 1 == len(node.grep_log('became queryable'))

Review comment:
       It makes me think that perhaps we could extend a bit the log messages with some context info, something like `became queryable after successful build`, `became writable after successful build` and `became not-writable because of failed build`. I guess that would make the logs a bit easier to understand. Not a big deal, though.

##########
File path: secondary_indexes_test.py
##########
@@ -357,7 +359,8 @@ def test_failing_manual_rebuild_index(self):
         assert before_files == after_files
         assert_none(session, """SELECT * FROM system."IndexInfo" WHERE table_name='k'""")
         assert_one(session, "SELECT * FROM k.t WHERE v = 1", [0, 1])
-
+        assert 1 == len(node.grep_log('became not-writable'))

Review comment:
       Since we are changing regular indexes availability for writes, I think we could add a write right after this, and verify that this is not visible for the index:
   ```suggestion
           assert 1 == len(node.grep_log('became not-writable'))
           
           # Insert a row, that row shouldn't be visible by the index because it's not writable
           session.execute("INSERT INTO k.t(k, v) VALUES (3, 1)")
           assert_one(session, "SELECT * FROM k.t WHERE v = 1", [0, 1])
   ```
   It's worth noting that a failed rebuild will let the index queryable but frozen in a point in time. I'm not sure about whether this is desirable or not, and I'm starting to think that perhaps it's better to just preserve the original availabilty-oriented behaviour. In any case, if we wanted to preserve that behaviour, it would be `LoadType.WRITE` after a failure in the initial build, and `LoadType.READ` for failed rebuild/recovery. We don't have that distinction between initial build and rebuilds, but it would be easy to implement adding an argument to `Index#getSupportedLoadTypeOnFailure`:
   ```java
   /**
    * Returns the type of operations supported by the index in case its building has failed and it's needing recovery.
    * 
    * @param isInitialBuild {@code true} if the failure is for the initial build task on index creation, {@code false}
    * if the failure is for a rebuild or recovery.
    */
   default LoadType getSupportedLoadTypeOnFailure(boolean isInitialBuild)
   {
       return isInitialBuild ? LoadType.WRITE : LoadType.ALL;
   }
   ```
   What do you think?

##########
File path: secondary_indexes_test.py
##########
@@ -394,6 +400,7 @@ def test_failing_manual_rebuild_index(self):
         assert before_files != after_files
         assert_one(session, """SELECT table_name, index_name FROM system."IndexInfo" WHERE table_name='k'""", ['k', 'idx'])
         assert_one(session, "SELECT * FROM k.t WHERE v = 1", [0, 1])

Review comment:
       If we have done writes while the index wasn't writable, as I suggested above, those writes should be visible here, after the index recovery:
   ```suggestion
           assert_all(session, "SELECT * FROM k.t WHERE v = 1", [[0, 1], [4, 1], [3, 1]])
   ```
   Unfortunately, the row inserted after the node restart ([4, 1]) will remain not-visible, leaving the index inconsistent with its base table even after a manual recovery. This is because that row was written before the index writable, and it isn't on the sstables yet. I think that, in case of recovery, we need to flush the base table right after setting the index as writable, and before we start indexing the sstables.
   
   This is a problem that apparently we don't see after a failed index creation because index creation is a schema change that produces a flush. However, I wonder if there is a chance to lose data between that schema flush and the registering of the index as writable. If so, that would be independent of the changes introduced by this ticket, and it might be addresses in a separate ticket if we want to limit the scope.

##########
File path: secondary_indexes_test.py
##########
@@ -383,6 +388,7 @@ def test_failing_manual_rebuild_index(self):
         assert before_files == after_files
         assert_none(session, """SELECT * FROM system."IndexInfo" WHERE table_name='k'""")
         assert_one(session, "SELECT * FROM k.t WHERE v = 1", [0, 1])
+        assert 2 == len(node.grep_log('became not-writable'))

Review comment:
       As before, it would be useful to verify that it has indeed become not-writable:
   ```suggestion
           assert 2 == len(node.grep_log('became not-writable'))
           
           # Insert a row, that row shouldn't be visible by the index because it's not writable
           session.execute("INSERT INTO k.t(k, v) VALUES (4, 1)")
           assert_one(session, "SELECT * FROM k.t WHERE v = 1", [0, 1])
   ```
   This write should be visible as soon as we successfully rebuild the index, as we do a few lines later.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org