Posted to commits@cassandra.apache.org by "ZhaoYang (Jira)" <ji...@apache.org> on 2020/06/08 18:07:00 UTC

[jira] [Updated] (CASSANDRA-15861) Mutating sstable STATS metadata may race with entire-sstable-streaming(ZCS) causing checksum validation failure

     [ https://issues.apache.org/jira/browse/CASSANDRA-15861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ZhaoYang updated CASSANDRA-15861:
---------------------------------
    Summary: Mutating sstable STATS metadata may race with entire-sstable-streaming(ZCS) causing checksum validation failure  (was: Muting sstable STATS metadata may race with entire-sstable-streaming(ZCS) causing checksum validation failure)

> Mutating sstable STATS metadata may race with entire-sstable-streaming(ZCS) causing checksum validation failure
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15861
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair, Consistency/Streaming
>            Reporter: ZhaoYang
>            Assignee: ZhaoYang
>            Priority: Normal
>             Fix For: 4.0-beta
>
>
> Flaky dtest: [test_dead_sync_initiator - repair_tests.repair_test.TestRepair|https://ci-cassandra.apache.org/view/all/job/Cassandra-devbranch-dtest/143/testReport/junit/dtest.repair_tests.repair_test/TestRepair/test_dead_sync_initiator/]
> The above test runs "nodetool repair" on node1 and kills node2 during the repair. At the end, node3 reports a checksum validation failure on an sstable transferred from node1.
> {code:java|title=what happened}
> 1. When repair starts on node1, it performs anti-compaction, which sets the sstable's repairedAt to 0 and its pending repair id to the session id.
> 2. Then node1 creates {{ComponentManifest}} which contains file lengths to be transferred to node3.
> 3. Before node1 actually sends the files to node3, node2 is killed and node1 starts broadcasting a repair-failure-message to all participants in {{CoordinatorSession#fail}}.
> 4. Node1 receives its own repair-failure-message and fails its local repair sessions at {{LocalSessions#failSession}} which triggers async background compaction.
> 5. Node1's background compaction then mutates the sstable's repairedAt to 0 and its pending repair id to null via {{PendingRepairManager#getNextRepairFinishedTask}}, as there is no longer an in-progress repair.
> 6. Node1 then sends the sstable to node3, but the sstable's STATS component size now differs from the original size recorded in the manifest.
> 7. At the end, node3 reports checksum validation failure when it tries to mutate sstable level and "isTransient" attribute in {{CassandraEntireSSTableStreamReader#read}}.
> {code}
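The ordering in steps 2, 5 and 6 can be reproduced in isolation. The following is only a minimal sketch, not Cassandra code: {{MutableStatsFile}}, the text payloads and the {{Statistics.db}} map key are hypothetical stand-ins for the on-disk STATS component and the manifest entry, chosen so that the rewrite changes the component's length.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class StatsRaceSketch {
    // Hypothetical stand-in for a STATS component that can be rewritten in place.
    static final class MutableStatsFile {
        private byte[] content;

        MutableStatsFile(String initial) {
            content = initial.getBytes(StandardCharsets.UTF_8);
        }

        void rewrite(String updated) {
            content = updated.getBytes(StandardCharsets.UTF_8);
        }

        long length() {
            return content.length;
        }
    }

    // Returns true when the length recorded in the manifest still matches
    // the component's length at the moment it is sent.
    static boolean lengthStillMatches(boolean mutateBeforeSend) {
        MutableStatsFile stats = new MutableStatsFile("repairedAt=0;pendingRepair=session-id");

        // Step 2: the manifest records the component length up front.
        Map<String, Long> manifest = new LinkedHashMap<>();
        manifest.put("Statistics.db", stats.length());

        // Step 5: a background task rewrites the metadata before the send happens.
        if (mutateBeforeSend) {
            stats.rewrite("repairedAt=0;pendingRepair=null");
        }

        // Step 6: the sender compares what it promised with what it now has.
        return manifest.get("Statistics.db") == stats.length();
    }

    public static void main(String[] args) {
        System.out.println("no mutation, lengths match: " + lengthStillMatches(false));
        System.out.println("mutation before send, lengths match: " + lengthStillMatches(true));
    }
}
```

In this sketch the mismatch is detected on the sending side only for illustration; in the report above the failure surfaces on the receiver (node3) as a checksum validation failure.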
> I believe a similar race may happen with leveled compaction, which may directly mutate an sstable's level if the sstable doesn't overlap with sstables at the next level. (Note: this isn't a problem in legacy streaming, as the STATS file length didn't matter there.)
> Ideally, sstable STATS metadata would be immutable, just like other sstable components, so that we don't have to worry about this special case. For now, I suggest using a {{StatsMetadata}} snapshot when initializing {{CassandraOutgoingFile}} instead of relying on the mutable on-disk STATS file.
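The snapshot idea can be illustrated with a small sketch. This is not the actual {{CassandraOutgoingFile}} or {{StatsMetadata}} API: {{OutgoingStatsSnapshot}} and its methods are hypothetical names, and a plain byte array stands in for the serialized metadata. The point is only that the length placed in the manifest and the bytes later sent both come from one immutable copy taken at initialization.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class StatsSnapshotSketch {
    // Hypothetical holder: copies the metadata bytes once, when the outgoing file is set up.
    static final class OutgoingStatsSnapshot {
        private final byte[] snapshot;

        OutgoingStatsSnapshot(byte[] currentStats) {
            // Defensive copy, so later changes to the source do not leak in.
            this.snapshot = Arrays.copyOf(currentStats, currentStats.length);
        }

        // Length to record in the manifest.
        long manifestLength() {
            return snapshot.length;
        }

        // Bytes to put on the wire; always consistent with manifestLength().
        byte[] bytesToSend() {
            return Arrays.copyOf(snapshot, snapshot.length);
        }
    }

    // Returns true when the sent bytes still match the manifest after the source is mutated.
    static boolean snapshotSurvivesMutation() {
        byte[] onDisk = "pendingRepair=session-id".getBytes(StandardCharsets.UTF_8);
        byte[] expected = onDisk.clone();

        OutgoingStatsSnapshot outgoing = new OutgoingStatsSnapshot(onDisk);

        // Simulate a concurrent metadata mutation after the snapshot was taken.
        Arrays.fill(onDisk, (byte) 0);

        byte[] sent = outgoing.bytesToSend();
        return sent.length == outgoing.manifestLength() && Arrays.equals(sent, expected);
    }

    public static void main(String[] args) {
        System.out.println("snapshot still matches manifest: " + snapshotSurvivesMutation());
    }
}
```

The trade-off in this sketch is that the receiver gets the metadata as it was when streaming was planned, not as it is at send time, which is the behavior the proposal above asks for.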
