Posted to commits@cassandra.apache.org by "Stefania (JIRA)" <ji...@apache.org> on 2015/09/01 05:30:45 UTC
[jira] [Commented] (CASSANDRA-10222) Periodically attempt to delete failed snapshot deletions on Windows

    [ https://issues.apache.org/jira/browse/CASSANDRA-10222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724729#comment-14724729 ] 

Stefania commented on CASSANDRA-10222:
--------------------------------------

Code review:

* The {{SnapshotDeletingTask}} constructor assumes that the snapshot deletion has already failed, but {{Directories.clearSnapshot()}} doesn't attempt the deletion itself. It simply creates a {{SnapshotDeletingTask}} and calls {{run}}. Perhaps you changed the approach and forgot to adjust the constructor, or vice versa?

* The class name in the logger for {{SnapshotDeletingTask}} is still {{SSTableDeletingTask}}. 

Notes:

* {{SSTableDeletingTask}} has been removed in 3.0; its equivalent is {{TransactionLog.SSTableTidier}}. I've put more details on CASSANDRA-8271.

bq. we could be more aggressive and call SnapshotDeletingTask.rescheduleFailedTasks() after each successful SSTableDeletingTask.run() to capture sstable deletions that never trigger the failed path rather than relying on a GC to hit 

Perhaps we could schedule only the failed snapshot deletions where the snapshot parent directory is the same as the sstable directory?
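
A rough sketch of that filtering, to make the idea concrete. This is not the patch's code: the class {{FailedSnapshotRetrySketch}} and its methods are hypothetical names, and it assumes a snapshot lives at <table dir>/snapshots/<tag>, so the sstable directory is two levels above the snapshot directory.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical sketch only; not the actual SnapshotDeletingTask from the patch. */
public class FailedSnapshotRetrySketch
{
    // Snapshot directories whose deletion failed earlier and still await a retry.
    private final Queue<File> failedSnapshots = new ConcurrentLinkedQueue<>();

    public void recordFailure(File snapshotDir)
    {
        failedSnapshots.add(snapshotDir);
    }

    /**
     * Assumption: the layout is <table dir>/snapshots/<tag>, so the table
     * (sstable) directory is the grandparent of the snapshot directory.
     */
    static boolean belongsTo(File snapshotDir, File sstableDir)
    {
        File snapshotsDir = snapshotDir.getAbsoluteFile().getParentFile();
        File tableDir = snapshotsDir == null ? null : snapshotsDir.getParentFile();
        return tableDir != null && tableDir.equals(sstableDir.getAbsoluteFile());
    }

    /**
     * Removes and returns only the failed snapshots that sit under the given
     * sstable directory; failures for other directories stay queued.
     */
    public List<File> drainFor(File sstableDir)
    {
        List<File> toRetry = new ArrayList<>();
        for (File snapshot : failedSnapshots)
        {
            // remove() returns false if another thread already took this entry.
            if (belongsTo(snapshot, sstableDir) && failedSnapshots.remove(snapshot))
                toRetry.add(snapshot);
        }
        return toRetry;
    }

    public int pending()
    {
        return failedSnapshots.size();
    }
}
```

After a successful sstable deletion, the caller would pass that sstable's directory to {{drainFor()}} and resubmit only the returned snapshots, leaving the remaining failures for the periodic retry.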


> Periodically attempt to delete failed snapshot deletions on Windows
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-10222
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10222
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Joshua McKenzie
>            Assignee: Joshua McKenzie
>              Labels: Windows
>             Fix For: 2.2.2
>
>
> The changes in CASSANDRA-9658 leave us in a position where a node on Windows will have to be restarted to clear out snapshots that cannot be deleted at request time due to sstables still being mapped, thus preventing deletions of hard links. A simple periodic task to categorize failed snapshot deletions and retry them would help prevent node disk utilization from growing unbounded by snapshots as compaction will eventually make these snapshot files deletable.
> Given that hard links to files in NTFS don't take up any extra space on disk so long as the original file still exists, the only limitation for users from this approach will be the inability to 'move' a snapshot file to another drive share. They will be copyable, however, so it's a minor platform difference.
> This goes directly against the goals of CASSANDRA-8271 and will likely be built on top of that code. Until such time as we get buffered performance in-line with memory-mapped, this is an interim necessity for production roll-outs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)