Posted to commits@cassandra.apache.org by "André Cruz (JIRA)" <ji...@apache.org> on 2013/01/10 19:50:12 UTC

[jira] [Updated] (CASSANDRA-5143) Safety valve on number of tombstones skipped on read path to prevent a full heap

     [ https://issues.apache.org/jira/browse/CASSANDRA-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

André Cruz updated CASSANDRA-5143:
----------------------------------

    Description: 
When doing a range query on a row with a lot of tombstones, the tombstones can quickly add up and use too much heap, even if we specify a column count of 2, because any number of tombstones can sit between those two live columns. The client can do nothing through the API to prevent this, since there is no limit that can be specified for the number of tombstones being collected.

I know that this looks like the "I'm using a row as a queue and building up a ton of tombstones" anti-pattern, but Cassandra should still be able to take better care of itself so as to prevent a DoS. I can imagine a lot of use cases that let users create and delete columns on a row.

I propose a simple safety valve that can act like this: "The client has asked me for X nodes, I've already collected X^Y nodes and still have not found X live nodes, I should just give up and return an exception". The Y would be the configurable parameter. Time taken per query or memory used could also be factors to take into consideration.
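
As an illustration only (this is not Cassandra code), the valve described above could be sketched in Python roughly as follows. The names Cell, read_slice and TombstoneLimitExceeded, the exponent parameter standing in for Y, and the reading of "nodes" as columns scanned within the row are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Iterable, List


class TombstoneLimitExceeded(Exception):
    """Hypothetical error: too many cells scanned without enough live ones."""


@dataclass(frozen=True)
class Cell:
    name: str
    live: bool  # False means the cell is a tombstone (a deletion marker)


def read_slice(cells: Iterable[Cell], count: int, exponent: float = 2.0) -> List[Cell]:
    """Return up to `count` live cells, giving up after count**exponent scanned cells."""
    budget = count ** exponent  # the "X^Y" bound from the proposal (assumed meaning)
    live: List[Cell] = []
    scanned = 0
    for cell in cells:
        if len(live) >= count:
            break  # the client's column count is satisfied
        scanned += 1
        if scanned > budget:
            # Safety valve: stop instead of holding ever more tombstones in memory.
            raise TombstoneLimitExceeded(
                f"scanned more than {budget:g} cells, found {len(live)} of {count} live"
            )
        if cell.live:
            live.append(cell)
    return live


if __name__ == "__main__":
    # Two live columns separated by ten tombstones, requested with a column count of 2.
    row = [Cell("a", True)] + [Cell(f"t{i}", False) for i in range(10)] + [Cell("z", True)]
    try:
        read_slice(row, count=2, exponent=2.0)
    except TombstoneLimitExceeded as exc:
        print("read aborted:", exc)
```

One consequence of a purely exponential bound is that a column count of 1 gives a budget of 1 whatever Y is, so an actual implementation would probably want a minimum budget or a different formula.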

  was:
When doing a range query on a row with a lot of tombstones, the tombstones can quickly add up and use too much heap, even if we specify a column count of 2, because any number of tombstones can sit between those two live columns. The client can do nothing through the API to prevent this, since there is no limit that can be specified for the number of tombstones being collected.

I know that this looks like the "I'm using a row as a queue and building up a ton of tombstones" anti-pattern, but Cassandra should still be able to take better care of itself so as to prevent a DoS. I can imagine a lot of use cases that let users create and delete columns on a row.

I propose a simple safety valve that can act like this: "The client has asked me for X nodes, I've already collected X^Y nodes and still have not found X live nodes, I should just give up". The Y would be the configurable parameter. Time taken per query or memory used could also be factors to take into consideration.

    
> Safety valve on number of tombstones skipped on read path to prevent a full heap
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5143
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5143
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.1.5
>         Environment: Debian Linux, 3 node cluster with RF 3, 8GB heap on 32GB machines
>            Reporter: André Cruz
>
> When doing a range query on a row with a lot of tombstones, the tombstones can quickly add up and use too much heap, even if we specify a column count of 2, because any number of tombstones can sit between those two live columns. The client can do nothing through the API to prevent this, since there is no limit that can be specified for the number of tombstones being collected.
> I know that this looks like the "I'm using a row as a queue and building up a ton of tombstones" anti-pattern, but Cassandra should still be able to take better care of itself so as to prevent a DoS. I can imagine a lot of use cases that let users create and delete columns on a row.
> I propose a simple safety valve that can act like this: "The client has asked me for X nodes, I've already collected X^Y nodes and still have not found X live nodes, I should just give up and return an exception". The Y would be the configurable parameter. Time taken per query or memory used could also be factors to take into consideration.
