Posted to commits@cassandra.apache.org by "Paulo Motta (Jira)" <ji...@apache.org> on 2022/02/07 23:54:00 UTC

[jira] [Commented] (CASSANDRA-17267) Snapshot true size is miscalculated

    [ https://issues.apache.org/jira/browse/CASSANDRA-17267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488499#comment-17488499 ] 

Paulo Motta commented on CASSANDRA-17267:
-----------------------------------------

Curiously, this did not reproduce on 3.0, so I used the same approach of comparing the snapshot file names against the files present in the live set, which skips live sstables when calculating the snapshot true size.
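
To illustrate the name-comparison idea, here is a minimal sketch in Python rather than the actual Java patch. The function name, file names and sizes are all hypothetical; it only assumes that a snapshot file counts toward the true size when no file with the same name exists in the live set.

```python
from typing import Dict, Set


def snapshot_true_size(snapshot_files: Dict[str, int], live_file_names: Set[str]) -> int:
    """Sum the sizes of snapshot files whose names are absent from the live set.

    snapshot_files maps a file name to its size in bytes.
    """
    return sum(
        size
        for name, size in snapshot_files.items()
        if name not in live_file_names
    )


# Hypothetical file names and sizes, for illustration only.
snapshot = {"mc-1-big-Data.db": 4096, "mc-1-big-Index.db": 512}

# Right after the snapshot, the live set still holds the same files.
live_before_compaction = {"mc-1-big-Data.db", "mc-1-big-Index.db"}

# After compaction, the live set holds new files with different names.
live_after_compaction = {"mc-2-big-Data.db", "mc-2-big-Index.db"}

size_before = snapshot_true_size(snapshot, live_before_compaction)
size_after = snapshot_true_size(snapshot, live_after_compaction)
```

With these inputs the true size is zero while the original sstables are still live and equals the full size of the snapshotted files once compaction has replaced them, which is the same pattern as in the repro below.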

ccm repro after fix:
{noformat}
% ccm node1 nodetool -- snapshot -t test test_ks

Requested creating snapshot(s) for [test_ks] with snapshot name [test]
Snapshot directory: test

% ccm node1 nodetool tablestats test_ks.tbl | grep -i snapshot
    Space used by snapshots (total): 0

% ccm node1 nodetool listsnapshots

Snapshot Details:
Snapshot name Keyspace name Column family name True size Size on disk
test          test_ks       tbl                0 bytes   5.74 KB

Total TrueDiskSpaceUsed: 0 bytes

% ccm node1 nodetool compact test_ks tbl

% ccm node1 nodetool tablestats test_ks.tbl | grep -i snapshot
    Space used by snapshots (total): 5044

% ccm node1 nodetool listsnapshots

Snapshot Details:
Snapshot name Keyspace name Column family name True size Size on disk
test          test_ks       tbl                4.93 KB   5.74 KB

Total TrueDiskSpaceUsed: 4.93 KB
{noformat}

When decoupling snapshot size computation from {{ColumnFamilyStore}} on CASSANDRA-16843, I will switch to the new approach, which relies only on the directory structure to decide whether a snapshot file is present in the live set.
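
A rough sketch of what a directory-structure check could look like, in Python and not taken from any patch. It assumes the usual layout where a snapshot file lives at <table_dir>/snapshots/<tag>/<file> and its live counterpart, if any, at <table_dir>/<file>. The helper names and the demo paths are hypothetical, and the eventual CASSANDRA-16843 implementation may well differ.

```python
import tempfile
from pathlib import Path
from typing import Tuple


def is_in_live_set(snapshot_file: Path) -> bool:
    """Check for a same-named file in the table directory.

    Assumed layout: <table_dir>/snapshots/<tag>/<file>, so the table
    directory is three levels up from the snapshot file.
    """
    table_dir = snapshot_file.parents[2]
    return (table_dir / snapshot_file.name).exists()


def true_size(snapshot_dir: Path) -> int:
    """Sum the sizes of snapshot files that have no live counterpart."""
    return sum(
        f.stat().st_size
        for f in snapshot_dir.iterdir()
        if f.is_file() and not is_in_live_set(f)
    )


def demo() -> Tuple[int, int]:
    """Build a throwaway table directory and measure before/after sizes."""
    with tempfile.TemporaryDirectory() as tmp:
        table_dir = Path(tmp) / "test_ks" / "tbl"
        snapshot_dir = table_dir / "snapshots" / "test"
        snapshot_dir.mkdir(parents=True)

        live = table_dir / "mc-1-big-Data.db"  # hypothetical sstable name
        live.write_bytes(b"x" * 100)
        # Real snapshots are hard links; a plain copy is enough for this sketch.
        (snapshot_dir / live.name).write_bytes(b"x" * 100)

        before = true_size(snapshot_dir)
        live.unlink()  # stands in for compaction removing the live file
        after = true_size(snapshot_dir)
        return before, after
```

Calling demo() returns the true size before and after the live file is removed. The check needs nothing beyond the file system, so it does not depend on any in-memory view of the live sstables.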

While working on this, I noticed that secondary indexes are not included in the true size computation, so I created CASSANDRA-17357 to address this separately.

3.11+ patches and CI below:

|[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...pauloricardomg:CASSANDRA-17267-3.11]|[tests|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1414/]|
|[4.0|https://github.com/apache/cassandra/compare/cassandra-4.0...pauloricardomg:CASSANDRA-17267-4.0]|[tests|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1415/]|
|[trunk|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:CASSANDRA-17267-trunk]|[tests|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1416/]|

> Snapshot true size is miscalculated
> -----------------------------------
>
>                 Key: CASSANDRA-17267
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17267
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Snapshots
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>            Priority: Normal
>
> As far as I understand, the snapshot "size on disk" is the total size of the snapshot, while the "true size" is the (size_on_disk - size_of_live_sstables).
> I created a snapshot on a 3.11 node without traffic and I expected the "true size" to be 0KB since the original sstables were still present, but this didn't seem to be the case:
> {noformat}
> $ nodetool listsnapshots
> Snapshot Details:
> Snapshot name Keyspace name Column family name True size Size on disk
> test          ks1           tbl1               4.86 KiB  5.69 KiB
> Total TrueDiskSpaceUsed: 4.86 KiB
> {noformat}


