Posted to commits@cassandra.apache.org by "Jordan West (JIRA)" <ji...@apache.org> on 2018/06/09 00:37:00 UTC

[jira] [Comment Edited] (CASSANDRA-14499) node-level disk quota

    [ https://issues.apache.org/jira/browse/CASSANDRA-14499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506746#comment-16506746 ] 

Jordan West edited comment on CASSANDRA-14499 at 6/9/18 12:36 AM:
------------------------------------------------------------------

The other reason the OS level wouldn't work is that we are trying to track *live* data, which the OS can't distinguish from data that is no longer live. EDIT: also to clarify, the goal here isn't to implement a perfect quota. There will be some room for error where the quota can be exceeded. The goal is to mark the node unhealthy when it reaches this level and to have enough headroom for compaction or other operations to get it back to a healthy state.

Regarding taking reads, [~jasobrown], [~krummas], and I discussed this offline. Since the node can only get more and more out of sync while not taking write traffic, and can't participate in (read) repair until the amount of storage used is below quota, we thought it better to disable both reads and writes. Less-blocking and speculative read repair make us more available in this case (as they should).

Disabling gossip is a quick route to disabling reads/writes. Is it the best approach to doing so? I'm not 100% sure. My concern is how the operator gets back to a healthy state once a quota is reached on a node. They have a few options: migrate data to a bigger node, let compaction catch up and delete data, raise the quota so it's no longer met, add node(s) to take storage responsibility away from the node, or forcefully delete data from the node. We need to ensure we don't prevent those operations from taking place. I've been discussing this with [~jasobrown] offline as well.


> node-level disk quota
> ---------------------
>
>                 Key: CASSANDRA-14499
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14499
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jordan West
>            Assignee: Jordan West
>            Priority: Major
>
> Operators should be able to specify, via YAML, the amount of usable disk space on a node as a percentage of the total available or as an absolute value. If both are specified, the absolute value should take precedence. This allows operators to reserve space available to the database for background tasks -- primarily compaction. When a node reaches its quota, gossip should be disabled to prevent it taking further writes (which would increase the amount of data stored), being involved in reads (which are likely to be more inconsistent over time), or participating in repair (which may increase the amount of space used on the machine). The node re-enables gossip when the amount of data it stores is below the quota.   
> The proposed option differs from {{min_free_space_per_drive_in_mb}}, which reserves some amount of space on each drive that is not usable by the database.  
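
The precedence rule in the description above (absolute value wins over percentage) and the "quota reached" check can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical and are not Cassandra APIs, the ticket does not name the YAML options, and treating "no quota configured" as "the whole disk is usable" is an assumption made here so the sketch is complete.

```java
import java.util.OptionalInt;
import java.util.OptionalLong;

// Hypothetical helper, not part of Cassandra. Illustrates the rule from the
// ticket description: an absolute quota takes precedence over a percentage.
final class DiskQuotaSketch {
    private DiskQuotaSketch() {}

    /**
     * Resolve the effective quota in bytes.
     *
     * @param totalBytes    total disk space available to the node
     * @param absoluteBytes quota given as an absolute value, if configured
     * @param percent       quota given as a whole percentage (0-100), if configured
     */
    static long resolveQuotaBytes(long totalBytes, OptionalLong absoluteBytes, OptionalInt percent) {
        if (absoluteBytes.isPresent()) {
            // If both are specified, the absolute value wins.
            return absoluteBytes.getAsLong();
        }
        if (percent.isPresent()) {
            int p = percent.getAsInt();
            if (p < 0 || p > 100) {
                throw new IllegalArgumentException("percent must be in [0, 100]: " + p);
            }
            // Integer arithmetic: floor(totalBytes * p / 100) without overflowing
            // and without floating-point rounding.
            return (totalBytes / 100) * p + (totalBytes % 100) * p / 100;
        }
        // Assumption: with no quota configured, all space is usable.
        return totalBytes;
    }

    /** The node is considered over quota once live data reaches the quota. */
    static boolean quotaReached(long liveBytes, long quotaBytes) {
        return liveBytes >= quotaBytes;
    }
}
```

Under this sketch, a node would disable gossip when `quotaReached` returns true and re-enable it once live data drops back below the resolved quota. Because live data is measured only periodically, the quota can be briefly exceeded, which matches the "not a perfect quota" caveat in the comment.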


