Posted to user@cassandra.apache.org by Gil Ganz <gi...@gmail.com> on 2023/04/04 07:14:57 UTC

When are sstables that were compacted deleted?

Hey
I have a 4.0.7 cluster in which I see weird behavior.
I expect that once compaction finishes, the old sstables that were part of
the compaction set will be deleted, but it appears they are deleted much
later, thus causing space issues.

For example, this is what I have in the log: only one compaction
finished, which took 3 sstables as input, yet I can see many sstables
were deleted afterwards:


INFO  [CompactionExecutor:17344] 2023-04-04 04:39:07,345
CompactionTask.java:241 - Compacted (489ddc80-d282-11ed-b4fb-d9d812cda140)
3 sstables to
[/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5685-big,]
to level=0.  140.672GiB to 134.554GiB (~95% of original) in 13,894,920ms.
Read Throughput = 10.367MiB/s, Write Throughput = 9.916MiB/s, Row
Throughput = ~-80,450/s.  1,425,146,188 total partitions merged to
1,378,482,110.  Partition merge counts were {1:1339149089, 2:32001964,
3:7331057, }
INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:08,151 SSTable.java:111 -
Deleting sstable:
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5546-big
INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:09,399 SSTable.java:111 -
Deleting sstable:
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5424-big
INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:10,563 SSTable.java:111 -
Deleting sstable:
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5677-big
INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:12,143 SSTable.java:111 -
Deleting sstable:
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-3360-big
INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:12,754 SSTable.java:111 -
Deleting sstable:
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5604-big
INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:13,328 SSTable.java:111 -
Deleting sstable:
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5572-big
INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:13,340 SSTable.java:111 -
Deleting sstable:
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5684-big
INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:13,363 SSTable.java:111 -
Deleting sstable:
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5630-big
INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:14,492 SSTable.java:111 -
Deleting sstable:
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-4991-big
INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:16,338 SSTable.java:111 -
Deleting sstable:
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-2537-big
INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:18,287 SSTable.java:111 -
Deleting sstable:
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-4171-big
INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:21,467 SSTable.java:111 -
Deleting sstable:
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-1767-big


Am I missing something about how/when sstables are deleted? Is there
something I can do to control when that delete happens?

gil

Re: When are sstables that were compacted deleted?

Posted by Bowen Song via user <us...@cassandra.apache.org>.
It may be useful to attach the output of the nodetool tpstats,
nodetool compactionstats and nodetool netstats commands.

If any SSTable involved in a transaction is being compacted, repaired or
streamed, etc., the transaction clean-up will be delayed. This is the
expected behaviour.
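
For example, something like the following gives a quick snapshot of the
node state (standard nodetool commands; -H just makes the
compactionstats sizes human-readable):

nodetool tpstats
nodetool compactionstats -H
nodetool netstats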


On 05/04/2023 04:50, Gil Ganz wrote:
> In this case the removal process had already finished hours before, so 
> nothing is streaming anymore (but looking at the list of the 
> transaction logs left behind, it could be that some of the transactions 
> finished while the decommission was still running).
> We don't have secondary indexes, and no repairs were running.
> I will try to reproduce it; are there any debug flags you think would 
> help next time?

Re: When are sstables that were compacted deleted?

Posted by Gil Ganz <gi...@gmail.com>.
In this case the removal process had already finished hours before, so
nothing is streaming anymore (but looking at the list of the transaction
logs left behind, it could be that some of the transactions finished while
the decommission was still running).
We don't have secondary indexes, and no repairs were running.
I will try to reproduce it; are there any debug flags you think would help
next time?

On Wed, Apr 5, 2023 at 1:03 AM Jeff Jirsa <jj...@gmail.com> wrote:

> You will DEFINITELY not remove sstables obsoleted by compaction if they
> are being streamed out to neighbors. It would also not surprise me if
> something holding a background reference to one of the sstables in the
> oldest/older compaction transaction logs caused the whole process to
> block waiting on the tidier to clean that up.
>
> Things that may hold references:
> - Validation compactions (repair)
> - Index build/rebuild
> - Streaming (repair, bootstrap, move, decom)
>
> If you have repairs running, you can try pausing/cancelling them and/or
> stopping validation/index_build compactions.

Re: When are sstables that were compacted deleted?

Posted by Jeff Jirsa <jj...@gmail.com>.
You will DEFINITELY not remove sstables obsoleted by compaction if they are
being streamed out to neighbors. It would also not surprise me if something
holding a background reference to one of the sstables in the oldest/older
compaction transaction logs caused the whole process to block waiting on the
tidier to clean that up.

Things that may hold references:
- Validation compactions (repair)
- Index build/rebuild
- Streaming (repair, bootstrap, move, decom)

If you have repairs running, you can try pausing/cancelling them and/or
stopping validation/index_build compactions.
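
For reference, a rough sketch with stock nodetool (nodetool stop takes a
compaction type, so something like this should stop in-flight validation
and index_build compactions; check nodetool help stop on your version):

nodetool stop VALIDATION
nodetool stop INDEX_BUILD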



On Tue, Apr 4, 2023 at 2:29 PM Gil Ganz <gi...@gmail.com> wrote:

> If it was just one instance I would just bounce it, but this happens when
> we join/remove nodes, and we have a lot of servers with this issue
> (whereas before the join/remove we are at ~50% disk usage).
> We found ourselves fighting with compaction to make sure we don't run out
> of space.
> Will open a ticket, thanks.

Re: When are sstables that were compacted deleted?

Posted by Gil Ganz <gi...@gmail.com>.
If it was just one instance I would just bounce it, but this happens when we
join/remove nodes, and we have a lot of servers with this issue (whereas
before the join/remove we are at ~50% disk usage).
We found ourselves fighting with compaction to make sure we don't run out
of space.
Will open a ticket, thanks.


On Wed, Apr 5, 2023 at 12:10 AM Jeff Jirsa <jj...@gmail.com> wrote:

> If you restart the node, it'll process/purge those compaction logs on
> startup, but you want them to purge/process now.
>
> I genuinely don't know when the tidier runs, but it's likely that you're
> too busy compacting to purge (though I don't know what exactly "too
> busy" means).
>
> Since you're close to 95% disk full, bounce one instance at a time to
> recover the space, but we probably need a JIRA to understand exactly what's
> blocking the tidier from running.
>
>
>
> On Tue, Apr 4, 2023 at 1:55 PM Gil Ganz <gi...@gmail.com> wrote:
>
>> More information - from another node in the cluster
>>
>> I can see many txn files although I only have two compactions running.
>> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# ls -l *txn*
>> -rw-r--r-- 1 cassandra cassandra 613 Apr  4 05:26
>> nb_txn_compaction_09e3aa40-d2a7-11ed-b76b-3b279f6334bc.log
>> -rw-r--r-- 1 cassandra cassandra 461 Apr  4 10:17
>> nb_txn_compaction_11433360-d265-11ed-b76b-3b279f6334bc.log
>> -rw-r--r-- 1 cassandra cassandra 463 Apr  4 09:48
>> nb_txn_compaction_593e5320-d265-11ed-b76b-3b279f6334bc.log
>> -rw-r--r-- 1 cassandra cassandra 614 Apr  3 22:47
>> nb_txn_compaction_701d62d0-d264-11ed-b76b-3b279f6334bc.log
>> -rw-r--r-- 1 cassandra cassandra 136 Apr  3 22:27
>> nb_txn_compaction_bb770b50-d26e-11ed-b76b-3b279f6334bc.log
>> -rw-r--r-- 1 cassandra cassandra 463 Apr  4 09:23
>> nb_txn_compaction_ce51bfe0-d264-11ed-b76b-3b279f6334bc.log
>> -rw-r--r-- 1 cassandra cassandra 134 Apr  4 10:31
>> nb_txn_compaction_d17c7380-d2d3-11ed-b76b-3b279f6334bc.log
>> -rw-r--r-- 1 cassandra cassandra 464 Apr  4 09:24
>> nb_txn_compaction_ed7fc650-d264-11ed-b76b-3b279f6334bc.log
>> -rw-r--r-- 1 cassandra cassandra 613 Apr  3 22:54
>> nb_txn_compaction_f456f3b0-d271-11ed-b76b-3b279f6334bc.log
>>
>> Let's take, for example, the one from "Apr  4 09:24".
>> I can see the matching log message in system.log:
>>
>> INFO  [CompactionExecutor:142085] 2023-04-04 09:24:29,892
>> CompactionTask.java:241 - Compacted (ed7fc650-d264-11ed-b76b-3b279f6334bc)
>> 2 sstables to
>> [/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big,]
>> to level=0.  362.987GiB to 336.323GiB (~92% of original) in 43,625,742ms.
>> Read Throughput = 8.520MiB/s, Write Throughput = 7.894MiB/s, Row Throughput
>> = ~-11,482/s.  3,755,353,838 total partitions merged to 3,479,484,261.
>> Partition merge counts were {1:3203614684, 2:275869577, }
>>
>>
>> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# cat
>> nb_txn_compaction_ed7fc650-d264-11ed-b76b-3b279f6334bc.log
>>
>> ADD:[/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-,0,8][2633027732]
>>
>> REMOVE:[/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-,1680543675071,8][4190572643]
>>
>> REMOVE:[/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12109-big-,1680554352704,8][3101929253]
>> COMMIT:[,0,0][2613697770]
>>
>> I would expect sstable 10334 to be gone, since the compaction finished an
>> hour ago, but the files are still there.
>>
>>
>> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# ls -l
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-*
>> -rw-r--r-- 1 cassandra cassandra    315582480 Mar 24 16:46
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-CompressionInfo.db
>> -rw-r--r-- 1 cassandra cassandra 361124597166 Mar 24 16:46
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Data.db
>> -rw-r--r-- 1 cassandra cassandra           10 Mar 24 16:46
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Digest.crc32
>> -rw-r--r-- 1 cassandra cassandra   4316334488 Mar 24 16:46
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Filter.db
>> -rw-r--r-- 1 cassandra cassandra 283651305837 Mar 24 16:46
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Index.db
>> -rw-r--r-- 1 cassandra cassandra        11934 Mar 24 16:46
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Statistics.db
>> -rw-r--r-- 1 cassandra cassandra    763353245 Apr  3 17:41
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Summary.db
>> -rw-r--r-- 1 cassandra cassandra           92 Mar 24 16:46
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-TOC.txt
>> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# ls -l
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-*
>> -rw-r--r-- 1 cassandra cassandra    315582480 Apr  4 09:24
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-CompressionInfo.db
>> -rw-r--r-- 1 cassandra cassandra 361124597166 Apr  4 09:24
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Data.db
>> -rw-r--r-- 1 cassandra cassandra           10 Apr  4 09:24
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Digest.crc32
>> -rw-r--r-- 1 cassandra cassandra   4316334488 Apr  4 09:24
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Filter.db
>> -rw-r--r-- 1 cassandra cassandra 283651305837 Apr  4 09:24
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Index.db
>> -rw-r--r-- 1 cassandra cassandra        11934 Apr  4 09:24
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Statistics.db
>> -rw-r--r-- 1 cassandra cassandra    709470839 Apr  4 09:41
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Summary.db
>> -rw-r--r-- 1 cassandra cassandra           92 Apr  4 09:24
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-TOC.txt
>>
>>
>> The df and du commands show the following:
>>
>> [user@server808 disk1]# df -h
>> /dev/mapper/data_vg-cass_lv      7.0T  6.6T  411G  95%
>> /var/lib/cassandra/data/disk1
>> [user@server808 disk1]# du . -h|grep T
>> 6.6T    ./kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a
>> 6.6T    ./kt_ks
>> 6.6T    .
>>
>>
>> If I run sstableutil to clean things up, the df command still shows the
>> space as occupied, but du now shows the reduction, and the files appear
>> in the lsof output:
>>
>> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# sstableutil
>> -c -v kt_ks new_table
>> Warning: Maximum heap size rounded up to 512 MB
>> WARN  10:52:18,106 memtable_cleanup_threshold has been deprecated and
>> should be removed from cassandra.yaml
>> Cleaning up...
>> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# df -h
>> /dev/mapper/data_vg-cass_lv      7.0T  6.6T  432G  94%
>> /var/lib/cassandra/data/disk1
>> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# ls -l *txn*
>> ls: cannot access *txn*: No such file or directory
>> [user@server808 disk1]# du . -h|grep T
>> 3.8T    ./kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a
>> 3.8T    ./kt_ks
>> 3.8T    .
>>
>>
>> The lsof | grep deleted output contains many rows like this:
>>
>> LocalPool 38608 40310 cassandra  879r      REG              253,3
>> 283651305837  4297002780
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Index.db
>> (deleted)
>> LocalPool 38608 40310 cassandra 1061r      REG              253,3
>> 361124597166  4297002778
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Data.db
>> (deleted)
>> OptionalT 38608 40352 cassandra  879r      REG              253,3
>> 283651305837  4297002780
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Index.db
>> (deleted)
>> OptionalT 38608 40352 cassandra 1061r      REG              253,3
>> 361124597166  4297002778
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Data.db
>> (deleted)
>> logback-4 38608 40439 cassandra 1061r      REG              253,3
>> 361124597166  4297002778
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Data.db
>> (deleted)
>>
>>
>> I will add that this cluster contains a single table, using TWCS.
>>
>> gil
>>
>>
>> On Tue, Apr 4, 2023 at 10:14 AM Gil Ganz <gi...@gmail.com> wrote:
>>
>>> Hey
>>> I have a 4.0.7 cluster in which I see weird behavior.
>>> I expect that once compaction finishes, the old sstables that were part
>>> of the compaction set will be deleted, but it appears they are deleted much
>>> later, thus causing space issues.
>>>
>>> For example this is what I have in the log, only one compaction
>>> finished, which took as input 3 sstables, but I can see many sstables were
>>> deleted afterwards:
>>>
>>>
>>> INFO  [CompactionExecutor:17344] 2023-04-04 04:39:07,345
>>> CompactionTask.java:241 - Compacted (489ddc80-d282-11ed-b4fb-d9d812cda140)
>>> 3 sstables to
>>> [/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5685-big,]
>>> to level=0.  140.672GiB to 134.554GiB (~95% of original) in 13,894,920ms.
>>> Read Throughput = 10.367MiB/s, Write Throughput = 9.916MiB/s, Row
>>> Throughput = ~-80,450/s.  1,425,146,188 total partitions merged to
>>> 1,378,482,110.  Partition merge counts were {1:1339149089, 2:32001964,
>>> 3:7331057, }
>>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:08,151 SSTable.java:111 -
>>> Deleting sstable:
>>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5546-big
>>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:09,399 SSTable.java:111 -
>>> Deleting sstable:
>>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5424-big
>>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:10,563 SSTable.java:111 -
>>> Deleting sstable:
>>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5677-big
>>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:12,143 SSTable.java:111 -
>>> Deleting sstable:
>>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-3360-big
>>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:12,754 SSTable.java:111 -
>>> Deleting sstable:
>>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5604-big
>>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:13,328 SSTable.java:111 -
>>> Deleting sstable:
>>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5572-big
>>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:13,340 SSTable.java:111 -
>>> Deleting sstable:
>>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5684-big
>>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:13,363 SSTable.java:111 -
>>> Deleting sstable:
>>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5630-big
>>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:14,492 SSTable.java:111 -
>>> Deleting sstable:
>>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-4991-big
>>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:16,338 SSTable.java:111 -
>>> Deleting sstable:
>>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-2537-big
>>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:18,287 SSTable.java:111 -
>>> Deleting sstable:
>>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-4171-big
>>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:21,467 SSTable.java:111 -
>>> Deleting sstable:
>>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-1767-big
>>>
>>>
>>> Am I missing something about how/when sstables are deleted? Is there
>>> something I can do to control when that delete happens?
>>>
>>> gil
>>>
>>

Re: When are sstables that were compacted deleted?

Posted by Jeff Jirsa <jj...@gmail.com>.
If you restart the node, it'll process/purge those compaction logs on
startup, but presumably you want them processed/purged now.

I genuinely don't know when the tidier runs, but it's likely that the node
is too busy compacting to purge (though I don't know exactly what "too
busy" means).

Since you're close to 95% disk full, bounce one instance at a time to
recover the space, but we probably need a JIRA to understand exactly what's
blocking the tidier from running.
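
For the bounce itself, something along these lines per node should be safe
(a rough sketch, assuming a systemd-managed install; adjust the service
name to your environment):

nodetool drain                    # flush memtables and stop accepting traffic
sudo systemctl restart cassandra  # or however the service is managed here
nodetool status                   # wait for the node to show UN before moving on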



On Tue, Apr 4, 2023 at 1:55 PM Gil Ganz <gi...@gmail.com> wrote:

> More information - from another node in the cluster
>
> I can see many txn files, although only two compactions are running.
> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# ls -l *txn*
> -rw-r--r-- 1 cassandra cassandra 613 Apr  4 05:26
> nb_txn_compaction_09e3aa40-d2a7-11ed-b76b-3b279f6334bc.log
> -rw-r--r-- 1 cassandra cassandra 461 Apr  4 10:17
> nb_txn_compaction_11433360-d265-11ed-b76b-3b279f6334bc.log
> -rw-r--r-- 1 cassandra cassandra 463 Apr  4 09:48
> nb_txn_compaction_593e5320-d265-11ed-b76b-3b279f6334bc.log
> -rw-r--r-- 1 cassandra cassandra 614 Apr  3 22:47
> nb_txn_compaction_701d62d0-d264-11ed-b76b-3b279f6334bc.log
> -rw-r--r-- 1 cassandra cassandra 136 Apr  3 22:27
> nb_txn_compaction_bb770b50-d26e-11ed-b76b-3b279f6334bc.log
> -rw-r--r-- 1 cassandra cassandra 463 Apr  4 09:23
> nb_txn_compaction_ce51bfe0-d264-11ed-b76b-3b279f6334bc.log
> -rw-r--r-- 1 cassandra cassandra 134 Apr  4 10:31
> nb_txn_compaction_d17c7380-d2d3-11ed-b76b-3b279f6334bc.log
> -rw-r--r-- 1 cassandra cassandra 464 Apr  4 09:24
> nb_txn_compaction_ed7fc650-d264-11ed-b76b-3b279f6334bc.log
> -rw-r--r-- 1 cassandra cassandra 613 Apr  3 22:54
> nb_txn_compaction_f456f3b0-d271-11ed-b76b-3b279f6334bc.log
>
> Let's take, for example, the one from "Apr  4 09:24".
> I can see the matching log message in system.log:
>
> INFO  [CompactionExecutor:142085] 2023-04-04 09:24:29,892
> CompactionTask.java:241 - Compacted (ed7fc650-d264-11ed-b76b-3b279f6334bc)
> 2 sstables to
> [/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big,]
> to level=0.  362.987GiB to 336.323GiB (~92% of original) in 43,625,742ms.
> Read Throughput = 8.520MiB/s, Write Throughput = 7.894MiB/s, Row Throughput
> = ~-11,482/s.  3,755,353,838 total partitions merged to 3,479,484,261.
> Partition merge counts were {1:3203614684, 2:275869577, }
>
>
> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# cat
> nb_txn_compaction_ed7fc650-d264-11ed-b76b-3b279f6334bc.log
>
> ADD:[/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-,0,8][2633027732]
>
> REMOVE:[/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-,1680543675071,8][4190572643]
>
> REMOVE:[/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12109-big-,1680554352704,8][3101929253]
> COMMIT:[,0,0][2613697770]
>
> I would expect sstable 10334 to be gone, since the compaction finished an
> hour ago, but its files are still there.
>
>
> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# ls -l
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-*
> -rw-r--r-- 1 cassandra cassandra    315582480 Mar 24 16:46
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-CompressionInfo.db
> -rw-r--r-- 1 cassandra cassandra 361124597166 Mar 24 16:46
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Data.db
> -rw-r--r-- 1 cassandra cassandra           10 Mar 24 16:46
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Digest.crc32
> -rw-r--r-- 1 cassandra cassandra   4316334488 Mar 24 16:46
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Filter.db
> -rw-r--r-- 1 cassandra cassandra 283651305837 Mar 24 16:46
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Index.db
> -rw-r--r-- 1 cassandra cassandra        11934 Mar 24 16:46
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Statistics.db
> -rw-r--r-- 1 cassandra cassandra    763353245 Apr  3 17:41
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Summary.db
> -rw-r--r-- 1 cassandra cassandra           92 Mar 24 16:46
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-TOC.txt
> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# ls -l
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-*
> -rw-r--r-- 1 cassandra cassandra    315582480 Apr  4 09:24
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-CompressionInfo.db
> -rw-r--r-- 1 cassandra cassandra 361124597166 Apr  4 09:24
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Data.db
> -rw-r--r-- 1 cassandra cassandra           10 Apr  4 09:24
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Digest.crc32
> -rw-r--r-- 1 cassandra cassandra   4316334488 Apr  4 09:24
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Filter.db
> -rw-r--r-- 1 cassandra cassandra 283651305837 Apr  4 09:24
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Index.db
> -rw-r--r-- 1 cassandra cassandra        11934 Apr  4 09:24
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Statistics.db
> -rw-r--r-- 1 cassandra cassandra    709470839 Apr  4 09:41
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Summary.db
> -rw-r--r-- 1 cassandra cassandra           92 Apr  4 09:24
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-TOC.txt
>
>
> The df and du commands show the following:
>
> [user@server808 disk1]# df -h
> /dev/mapper/data_vg-cass_lv      7.0T  6.6T  411G  95%
> /var/lib/cassandra/data/disk1
> [user@server808 disk1]# du . -h|grep T
> 6.6T    ./kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a
> 6.6T    ./kt_ks
> 6.6T    .
>
>
> If I run sstableutil to clean things up, df still shows the space as
> occupied, but du now shows the reduction, and the deleted files appear in
> the lsof output:
>
> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# sstableutil
> -c -v kt_ks new_table
> Warning: Maximum heap size rounded up to 512 MB
> WARN  10:52:18,106 memtable_cleanup_threshold has been deprecated and
> should be removed from cassandra.yaml
> Cleaning up...
> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# df -h
> /dev/mapper/data_vg-cass_lv      7.0T  6.6T  432G  94%
> /var/lib/cassandra/data/disk1
> [user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# ls -l *txn*
> ls: cannot access *txn*: No such file or directory
> [user@server808 disk1]# du . -h|grep T
> 3.8T    ./kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a
> 3.8T    ./kt_ks
> 3.8T    .
>
>
> The lsof | grep deleted output contains many rows like this:
>
> LocalPool 38608 40310 cassandra  879r      REG              253,3
> 283651305837  4297002780
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Index.db
> (deleted)
> LocalPool 38608 40310 cassandra 1061r      REG              253,3
> 361124597166  4297002778
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Data.db
> (deleted)
> OptionalT 38608 40352 cassandra  879r      REG              253,3
> 283651305837  4297002780
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Index.db
> (deleted)
> OptionalT 38608 40352 cassandra 1061r      REG              253,3
> 361124597166  4297002778
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Data.db
> (deleted)
> logback-4 38608 40439 cassandra 1061r      REG              253,3
> 361124597166  4297002778
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Data.db
> (deleted)
>
>
> I will add that this cluster contains a single table, using TWCS.
>
> gil
>
>
> On Tue, Apr 4, 2023 at 10:14 AM Gil Ganz <gi...@gmail.com> wrote:
>
>> Hey
>> I have a 4.0.7 cluster in which I see weird behavior.
>> I expect that once compaction finishes, the old sstables that were part
>> of the compaction set will be deleted, but it appears they are deleted much
>> later, thus causing space issues.
>>
>> For example this is what I have in the log, only one compaction finished,
>> which took as input 3 sstables, but I can see many sstables were deleted
>> afterwards:
>>
>>
>> INFO  [CompactionExecutor:17344] 2023-04-04 04:39:07,345
>> CompactionTask.java:241 - Compacted (489ddc80-d282-11ed-b4fb-d9d812cda140)
>> 3 sstables to
>> [/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5685-big,]
>> to level=0.  140.672GiB to 134.554GiB (~95% of original) in 13,894,920ms.
>> Read Throughput = 10.367MiB/s, Write Throughput = 9.916MiB/s, Row
>> Throughput = ~-80,450/s.  1,425,146,188 total partitions merged to
>> 1,378,482,110.  Partition merge counts were {1:1339149089, 2:32001964,
>> 3:7331057, }
>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:08,151 SSTable.java:111 -
>> Deleting sstable:
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5546-big
>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:09,399 SSTable.java:111 -
>> Deleting sstable:
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5424-big
>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:10,563 SSTable.java:111 -
>> Deleting sstable:
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5677-big
>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:12,143 SSTable.java:111 -
>> Deleting sstable:
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-3360-big
>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:12,754 SSTable.java:111 -
>> Deleting sstable:
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5604-big
>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:13,328 SSTable.java:111 -
>> Deleting sstable:
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5572-big
>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:13,340 SSTable.java:111 -
>> Deleting sstable:
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5684-big
>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:13,363 SSTable.java:111 -
>> Deleting sstable:
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5630-big
>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:14,492 SSTable.java:111 -
>> Deleting sstable:
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-4991-big
>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:16,338 SSTable.java:111 -
>> Deleting sstable:
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-2537-big
>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:18,287 SSTable.java:111 -
>> Deleting sstable:
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-4171-big
>> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:21,467 SSTable.java:111 -
>> Deleting sstable:
>> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-1767-big
>>
>>
>> Am I missing something about how/when sstables are deleted? Is there
>> something I can do to control when that delete happens?
>>
>> gil
>>
>

Re: When are sstables that were compacted deleted?

Posted by Gil Ganz <gi...@gmail.com>.
More information - from another node in the cluster

I can see many txn files, although only two compactions are running.
[user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# ls -l *txn*
-rw-r--r-- 1 cassandra cassandra 613 Apr  4 05:26
nb_txn_compaction_09e3aa40-d2a7-11ed-b76b-3b279f6334bc.log
-rw-r--r-- 1 cassandra cassandra 461 Apr  4 10:17
nb_txn_compaction_11433360-d265-11ed-b76b-3b279f6334bc.log
-rw-r--r-- 1 cassandra cassandra 463 Apr  4 09:48
nb_txn_compaction_593e5320-d265-11ed-b76b-3b279f6334bc.log
-rw-r--r-- 1 cassandra cassandra 614 Apr  3 22:47
nb_txn_compaction_701d62d0-d264-11ed-b76b-3b279f6334bc.log
-rw-r--r-- 1 cassandra cassandra 136 Apr  3 22:27
nb_txn_compaction_bb770b50-d26e-11ed-b76b-3b279f6334bc.log
-rw-r--r-- 1 cassandra cassandra 463 Apr  4 09:23
nb_txn_compaction_ce51bfe0-d264-11ed-b76b-3b279f6334bc.log
-rw-r--r-- 1 cassandra cassandra 134 Apr  4 10:31
nb_txn_compaction_d17c7380-d2d3-11ed-b76b-3b279f6334bc.log
-rw-r--r-- 1 cassandra cassandra 464 Apr  4 09:24
nb_txn_compaction_ed7fc650-d264-11ed-b76b-3b279f6334bc.log
-rw-r--r-- 1 cassandra cassandra 613 Apr  3 22:54
nb_txn_compaction_f456f3b0-d271-11ed-b76b-3b279f6334bc.log
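
As a quick sanity check (assuming nodetool is available on this host), the
leftover logs can be counted against what the node says is actually in
flight:

ls nb_txn_compaction_*.log | wc -l   # leftover transaction logs in this table dir
nodetool compactionstats             # compactions actually running right now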

Let's take, for example, the one from "Apr  4 09:24".
I can see the matching log message in system.log:

INFO  [CompactionExecutor:142085] 2023-04-04 09:24:29,892
CompactionTask.java:241 - Compacted (ed7fc650-d264-11ed-b76b-3b279f6334bc)
2 sstables to
[/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big,]
to level=0.  362.987GiB to 336.323GiB (~92% of original) in 43,625,742ms.
Read Throughput = 8.520MiB/s, Write Throughput = 7.894MiB/s, Row Throughput
= ~-11,482/s.  3,755,353,838 total partitions merged to 3,479,484,261.
Partition merge counts were {1:3203614684, 2:275869577, }


[user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# cat
nb_txn_compaction_ed7fc650-d264-11ed-b76b-3b279f6334bc.log
ADD:[/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-,0,8][2633027732]
REMOVE:[/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-,1680543675071,8][4190572643]
REMOVE:[/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12109-big-,1680554352704,8][3101929253]
COMMIT:[,0,0][2613697770]

I would expect sstable 10334 to be gone, since the compaction finished an
hour ago, but its files are still there.
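
A rough shell sketch to cross-check which leftover txn logs still
reference sstable files on disk (this assumes the log format is exactly as
shown above):

for log in nb_txn_compaction_*.log; do
  # strip "REMOVE:[" and everything from the first comma to get the file prefix
  grep '^REMOVE:' "$log" | sed 's/^REMOVE:\[//; s/,.*//' |
  while read -r prefix; do
    ls "${prefix}"* >/dev/null 2>&1 && echo "$log still references ${prefix}"
  done
done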


[user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# ls -l
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-*
-rw-r--r-- 1 cassandra cassandra    315582480 Mar 24 16:46
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra 361124597166 Mar 24 16:46
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Data.db
-rw-r--r-- 1 cassandra cassandra           10 Mar 24 16:46
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Digest.crc32
-rw-r--r-- 1 cassandra cassandra   4316334488 Mar 24 16:46
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Filter.db
-rw-r--r-- 1 cassandra cassandra 283651305837 Mar 24 16:46
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Index.db
-rw-r--r-- 1 cassandra cassandra        11934 Mar 24 16:46
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Statistics.db
-rw-r--r-- 1 cassandra cassandra    763353245 Apr  3 17:41
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Summary.db
-rw-r--r-- 1 cassandra cassandra           92 Mar 24 16:46
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-TOC.txt
[user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# ls -l
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-*
-rw-r--r-- 1 cassandra cassandra    315582480 Apr  4 09:24
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra 361124597166 Apr  4 09:24
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Data.db
-rw-r--r-- 1 cassandra cassandra           10 Apr  4 09:24
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Digest.crc32
-rw-r--r-- 1 cassandra cassandra   4316334488 Apr  4 09:24
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Filter.db
-rw-r--r-- 1 cassandra cassandra 283651305837 Apr  4 09:24
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Index.db
-rw-r--r-- 1 cassandra cassandra        11934 Apr  4 09:24
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Statistics.db
-rw-r--r-- 1 cassandra cassandra    709470839 Apr  4 09:41
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-Summary.db
-rw-r--r-- 1 cassandra cassandra           92 Apr  4 09:24
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-12142-big-TOC.txt


The df and du commands show the following:

[user@server808 disk1]# df -h
/dev/mapper/data_vg-cass_lv      7.0T  6.6T  411G  95%
/var/lib/cassandra/data/disk1
[user@server808 disk1]# du . -h|grep T
6.6T    ./kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a
6.6T    ./kt_ks
6.6T    .


If I run sstableutil to clean things up, df still shows the space as
occupied, but du now shows the reduction, and the deleted files appear in
the lsof output:

[user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# sstableutil -c
-v kt_ks new_table
Warning: Maximum heap size rounded up to 512 MB
WARN  10:52:18,106 memtable_cleanup_threshold has been deprecated and
should be removed from cassandra.yaml
Cleaning up...
[user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# df -h
/dev/mapper/data_vg-cass_lv      7.0T  6.6T  432G  94%
/var/lib/cassandra/data/disk1
[user@server808 new_table-44263b406bf111ed8bd9df80ace3677a]# ls -l *txn*
ls: cannot access *txn*: No such file or directory
[user@server808 disk1]# du . -h|grep T
3.8T    ./kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a
3.8T    ./kt_ks
3.8T    .
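
For reference, sstableutil can also just list what it tracks without
deleting anything (flags as I understand them in 4.0, worth
double-checking against your version):

sstableutil -t tmp kt_ks new_table   # list only temporary/incomplete sstable files
sstableutil -o kt_ks new_table       # include the transaction log files in the listing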


The lsof | grep deleted output contains many rows like this:

LocalPool 38608 40310 cassandra  879r      REG              253,3
283651305837  4297002780
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Index.db
(deleted)
LocalPool 38608 40310 cassandra 1061r      REG              253,3
361124597166  4297002778
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Data.db
(deleted)
OptionalT 38608 40352 cassandra  879r      REG              253,3
283651305837  4297002780
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Index.db
(deleted)
OptionalT 38608 40352 cassandra 1061r      REG              253,3
361124597166  4297002778
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Data.db
(deleted)
logback-4 38608 40439 cassandra 1061r      REG              253,3
361124597166  4297002778
/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-10334-big-Data.db
(deleted)
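
To put a number on how much space those deleted-but-still-open files pin,
something like this gives a total (a sketch: the field numbers match the
lsof output above, which includes a thread column, and it keys by inode so
a file held open by many threads is only counted once):

lsof | awk '/\(deleted\)/ { size[$9] = $8 }
            END { for (i in size) s += size[i]; printf "%.1f GiB\n", s / 1024^3 }'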


I will add that this cluster contains a single table, using TWCS.

gil


On Tue, Apr 4, 2023 at 10:14 AM Gil Ganz <gi...@gmail.com> wrote:

> Hey
> I have a 4.0.7 cluster in which I see weird behavior.
> I expect that once compaction finishes, the old sstables that were part of
> the compaction set will be deleted, but it appears they are deleted much
> later, thus causing space issues.
>
> For example this is what I have in the log, only one compaction finished,
> which took as input 3 sstables, but I can see many sstables were deleted
> afterwards:
>
>
> INFO  [CompactionExecutor:17344] 2023-04-04 04:39:07,345
> CompactionTask.java:241 - Compacted (489ddc80-d282-11ed-b4fb-d9d812cda140)
> 3 sstables to
> [/var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5685-big,]
> to level=0.  140.672GiB to 134.554GiB (~95% of original) in 13,894,920ms.
> Read Throughput = 10.367MiB/s, Write Throughput = 9.916MiB/s, Row
> Throughput = ~-80,450/s.  1,425,146,188 total partitions merged to
> 1,378,482,110.  Partition merge counts were {1:1339149089, 2:32001964,
> 3:7331057, }
> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:08,151 SSTable.java:111 -
> Deleting sstable:
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5546-big
> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:09,399 SSTable.java:111 -
> Deleting sstable:
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5424-big
> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:10,563 SSTable.java:111 -
> Deleting sstable:
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5677-big
> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:12,143 SSTable.java:111 -
> Deleting sstable:
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-3360-big
> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:12,754 SSTable.java:111 -
> Deleting sstable:
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5604-big
> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:13,328 SSTable.java:111 -
> Deleting sstable:
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5572-big
> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:13,340 SSTable.java:111 -
> Deleting sstable:
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5684-big
> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:13,363 SSTable.java:111 -
> Deleting sstable:
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-5630-big
> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:14,492 SSTable.java:111 -
> Deleting sstable:
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-4991-big
> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:16,338 SSTable.java:111 -
> Deleting sstable:
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-2537-big
> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:18,287 SSTable.java:111 -
> Deleting sstable:
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-4171-big
> INFO  [NonPeriodicTasks:1] 2023-04-04 04:39:21,467 SSTable.java:111 -
> Deleting sstable:
> /var/lib/cassandra/data/disk1/kt_ks/new_table-44263b406bf111ed8bd9df80ace3677a/nb-1767-big
>
>
> Am I missing something about how/when sstables are deleted? Is there
> something I can do to control when that delete happens?
>
> gil
>