Posted to commits@cassandra.apache.org by "Stefan Miklosovic (Jira)" <ji...@apache.org> on 2021/09/03 10:50:00 UTC

[jira] [Commented] (CASSANDRA-16860) Add --older-than option to nodetool clearsnapshot

    [ https://issues.apache.org/jira/browse/CASSANDRA-16860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17409437#comment-17409437 ] 

Stefan Miklosovic commented on CASSANDRA-16860:
-----------------------------------------------

I think that when we add this to the clearsnapshot command, we actually need two flags. The first one would be

{source}

--older-than=1d

{source}

This means "remove all snapshots older than 1 day".

The second one would be

{source}

--older-than-timestamp=unixtimestamp

{source}

This would clear every snapshot older than the given timestamp.

The two flags serve different cases. If I want to remove all snapshots older than an hour, I do not want to compute a timestamp for that. On the other hand, if I know the exact point in time before which snapshots should be removed, I do not want to compute how long ago it was.

Internally, --older-than would translate to --older-than-timestamp by taking the current system time on the client and subtracting the period, so only a timestamp is sent to the server.
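
A minimal sketch of that client-side translation, under stated assumptions: the class and method names ({{OlderThan}}, {{parsePeriod}}, {{toCutoffMillis}}) are hypothetical, the accepted period format (an integer followed by s, m, h or d) is only a guess based on the "1d" example, and the timestamp is assumed to be epoch milliseconds. None of this is taken from an actual patch.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OlderThan {
    // Assumed period format: an integer followed by s, m, h or d (e.g. "1d").
    private static final Pattern PERIOD = Pattern.compile("(\\d+)([smhd])");

    // Parses a period such as "1d" into a Duration (hypothetical helper).
    static Duration parsePeriod(String value) {
        Matcher m = PERIOD.matcher(value);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported period: " + value);
        }
        long amount = Long.parseLong(m.group(1));
        switch (m.group(2)) {
            case "s": return Duration.ofSeconds(amount);
            case "m": return Duration.ofMinutes(amount);
            case "h": return Duration.ofHours(amount);
            default:  return Duration.ofDays(amount); // "d"
        }
    }

    // Translates --older-than into the value --older-than-timestamp would carry:
    // client time minus the period, as epoch milliseconds (assumed unit).
    static long toCutoffMillis(String olderThan, Instant now) {
        return now.minus(parsePeriod(olderThan)).toEpochMilli();
    }

    public static void main(String[] args) {
        // The client computes the cutoff once and sends only the timestamp on.
        System.out.println(toCutoffMillis("1d", Instant.now()));
    }
}
```

Passing the clock in as a parameter keeps the conversion testable; the server side would then only ever compare snapshot times against a single timestamp.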

> Add --older-than option to nodetool clearsnapshot
> -------------------------------------------------
>
>                 Key: CASSANDRA-16860
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16860
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Tool/nodetool
>            Reporter: Jack Casey
>            Assignee: Jack Casey
>            Priority: Normal
>             Fix For: 4.x
>
>
> h1. Summary
> Opening this issue in reference to [this WIP PR|https://github.com/apache/cassandra/pull/1148]:
> This functionality allows users of Cassandra to remove snapshots ad hoc, based on a TTL. This is to address the problem of snapshots accumulating. For example, an organization I work for aims to keep snapshots for 30 days; however, we don't have any easy way to clean them up after those 30 days are up.
> This is similar to the goals set in https://issues.apache.org/jira/browse/CASSANDRA-16451 but it would be available for Cassandra 3.x.
> h1. Functionality
> This adds a new command to NodeTool, called {{expiresnapshots}}, with the following options:
> NAME
>  nodetool expiresnapshots - Removes snapshots that are older than a TTL
>  in days
> SYNOPSIS
>  nodetool [(-h <host> | --host <host>)] [(-p <port> | --port <port>)]
>  [(-pw <password> | --password <password>)]
>  [(-pwf <passwordFilePath> | --password-file <passwordFilePath>)]
>  [(-u <username> | --username <username>)] expiresnapshots [--dry-run]
>  (-t <ttl> | --ttl <ttl>)
> OPTIONS
>  --dry-run
>  Run without actually clearing snapshots
> -h <host>, --host <host>
>  Node hostname or ip address
> -p <port>, --port <port>
>  Remote jmx agent port number
> -pw <password>, --password <password>
>  Remote jmx agent password
> -pwf <passwordFilePath>, --password-file <passwordFilePath>
>  Path to the JMX password file
> -t <ttl>, --ttl <ttl>
>  TTL (in days) to expire snapshots
> -u <username>, --username <username>
>  Remote jmx agent username
> The snapshot date is derived from the default snapshot name, which is a timestamp (epoch time in milliseconds). For this reason, snapshots whose names do not contain a timestamp in this format will not be cleared.
> h1. Example Use
> This Cassandra environment has a number of snapshots, a few recent and a few outdated:
> root@cassandra001:/cassandra# nodetool listsnapshots
>  Snapshot Details:
>  Snapshot name Keyspace name Column family name True size Size on disk
>  1529173922063 users_keyspace users 362.03 KiB 362.89 KiB
>  1629173909461 users_keyspace users 362.03 KiB 362.89 KiB
>  1629173922063 users_keyspace users 362.03 KiB 362.89 KiB
>  1599173922063 users_keyspace users 362.03 KiB 362.89 KiB
>  1629173916816 users_keyspace users 362.03 KiB 362.89 KiB
> Total TrueDiskSpaceUsed: 1.77 MiB
> To validate that the removal runs as expected, we can use the {{--dry-run}} option:
> root@cassandra001:/cassandra# nodetool expiresnapshots --ttl 30 --dry-run
>  Starting simulated cleanup of snapshots older than 30 days
>  Clearing (dry run): 1529173922063
>  Clearing (dry run): 1599173922063
>  Cleared (dry run): 2 snapshots
> Now that we are confident the correct snapshots will be removed, we can omit the {{--dry-run}} flag:
> root@cassandra001:/cassandra# nodetool expiresnapshots --ttl 30
>  Starting cleanup of snapshots older than 30 days
>  Clearing: 1529173922063
>  Clearing: 1599173922063
>  Cleared: 2 snapshots
> To confirm our changes are successful, we list the snapshots that still remain:
> root@cassandra001:/cassandra# nodetool listsnapshots
>  Snapshot Details:
>  Snapshot name Keyspace name Column family name True size Size on disk
>  1629173909461 users_keyspace users 362.03 KiB 362.89 KiB
>  1629173922063 users_keyspace users 362.03 KiB 362.89 KiB
>  1629173916816 users_keyspace users 362.03 KiB 362.89 KiB
> Total TrueDiskSpaceUsed: 1.06 MiB
> h1. Next Steps
> To be completed:
>  - Tests
>  - Documentation updates
> I am new to this repository, and am fuzzy on a few details even after reading the contribution guide 😅 Any advice on the following would be greatly appreciated!
>  - What branch would this type of change be merged into? Currently, I'm targeting {{apache:trunk}} by default
>  - Is there a test strategy/pattern for this type of change? I was not able to find any existing tests for similar {{nodetool}} commands
> Thanks! 😄
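
The name-based expiry described in the quoted issue can be sketched roughly as follows. This is an illustration only, not code from the PR: the class and method names ({{SnapshotExpiry}}, {{expired}}) are hypothetical, and it assumes a snapshot is eligible only when its whole name parses as epoch milliseconds, which is how the description reads.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SnapshotExpiry {
    // Returns the snapshot names whose embedded timestamp is older than the TTL.
    // Names that are not a plain epoch-millisecond number are skipped (assumption).
    static List<String> expired(List<String> names, int ttlDays, Instant now) {
        long cutoff = now.minus(Duration.ofDays(ttlDays)).toEpochMilli();
        List<String> result = new ArrayList<>();
        for (String name : names) {
            try {
                if (Long.parseLong(name) < cutoff) {
                    result.add(name);
                }
            } catch (NumberFormatException e) {
                // Not a timestamp-style name, so it is never cleared.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Snapshot names from the listsnapshots output above, plus one custom name.
        List<String> names = Arrays.asList(
                "1529173922063", "1629173909461", "1629173922063",
                "1599173922063", "1629173916816", "manual-backup");
        // A fixed "now" shortly after the newest snapshot keeps the result stable.
        Instant now = Instant.ofEpochMilli(1629200000000L);
        System.out.println(expired(names, 30, now));
        // prints [1529173922063, 1599173922063]
    }
}
```

A dry run would simply print this list instead of clearing the snapshots it names.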



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
