
Posted to commits@cassandra.apache.org by "Marcus Eriksson (Jira)" <ji...@apache.org> on 2022/12/15 09:34:00 UTC

[jira] [Updated] (CASSANDRA-18119) Handle sstable metadata stats file getting a new mtime after compaction has finished

     [ https://issues.apache.org/jira/browse/CASSANDRA-18119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-18119:
----------------------------------------
     Bug Category: Parent values: Degradation(12984); Level 1 values: Other Exception(12998)
       Complexity: Normal
      Component/s: Local/Compaction
                   Local/Startup and Shutdown
    Discovered By: Adhoc Test
    Fix Version/s: 3.11.x
                   4.0.x
                   4.1.x
        Reviewers: Jon Meredith, Josh McKenzie
         Severity: Normal
         Assignee: Marcus Eriksson
           Status: Open  (was: Triage Needed)

> Handle sstable metadata stats file getting a new mtime after compaction has finished
> ------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18119
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18119
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Compaction, Local/Startup and Shutdown
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>            Priority: Normal
>             Fix For: 3.11.x, 4.0.x, 4.1.x
>
>
> Due to a race between compaction finishing and compaction strategies being reloaded, there is a chance that we try to add both the new sstable and the old, already compacted sstable to the compaction strategy. In the LCS case this can cause the old sstable to be sent to L0 to avoid overlap, which changes the mtime of the stats metadata file. If the node is then shut down before the old sstable is actually deleted from disk, startup fails with the following exception:
> {code}
> .../mockcf1-392b3ff07c5a11ed8c662f5760cb10b3/nb_txn_compaction_3983c030-7c5a-11ed-8c66-2f5760cb10b3.log
> [junit-timeout]         REMOVE:[.../data/TransactionLogsTest/mockcf1-392b3ff07c5a11ed8c662f5760cb10b3/nb-0-big-,1671096247000,5][4003386800]
> [junit-timeout]                 ***Unexpected files detected for sstable [nb-0-big-]: last update time [Thu Dec 15 10:24:09 CET 2022] (1671096249000) should have been [Thu Dec 15 10:24:07 CET 2022] (1671096247000)
> [junit-timeout]         ADD:[.../data/TransactionLogsTest/mockcf1-392b3ff07c5a11ed8c662f5760cb10b3/nb-2-big-,0,5][319189529]
> {code}
> A workaround for this (until we properly fix the way compaction strategies get notified about sstable changes) is to ignore the timestamp of the STATS component when cleaning up compaction leftovers on startup. 
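
The workaround described above can be illustrated with a minimal sketch. This is not the actual Cassandra patch: the class and method names below are hypothetical, and the sketch assumes that the STATS component is the file whose name ends in "Statistics.db" and that the transaction log stores the newest mtime across an sstable's component files (as the REMOVE record in the log above suggests). The idea is simply to leave the stats file out when recomputing that mtime at startup, so a late rewrite of the stats file no longer causes a mismatch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Cassandra.
final class StatsMtimeSketch {
    // Assumption: the stats metadata component file name ends with this suffix.
    static final String STATS_SUFFIX = "Statistics.db";

    // Returns the newest last-modified time (in millis) across the given
    // component files, skipping the stats component. Returns 0 if no
    // non-stats component is present.
    static long latestMtimeIgnoringStats(List<Path> components) throws IOException {
        long latest = 0L;
        for (Path component : components) {
            if (component.getFileName().toString().endsWith(STATS_SUFFIX)) {
                continue; // the stats file may be rewritten after compaction finished
            }
            latest = Math.max(latest, Files.getLastModifiedTime(component).toMillis());
        }
        return latest;
    }

    // True when the mtime recorded in the log still matches the files on disk,
    // with the stats component left out of the comparison.
    static boolean matchesRecorded(List<Path> components, long recordedMillis) throws IOException {
        return latestMtimeIgnoringStats(components) == recordedMillis;
    }
}
```

With the values from the log above, a check like this would compare the recorded 1671096247000 against the data, index and other components only, so the stats file's later mtime of 1671096249000 would not be flagged as an unexpected change.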


