Posted to commits@cassandra.apache.org by "Pau Rodriguez (JIRA)" <ji...@apache.org> on 2011/06/20 11:44:47 UTC

[jira] [Created] (CASSANDRA-2795) Autodelete empty rows

Autodelete empty rows
---------------------

                 Key: CASSANDRA-2795
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2795
             Project: Cassandra
          Issue Type: Improvement
          Components: Core, Tools
    Affects Versions: 0.8.0
            Reporter: Pau Rodriguez


In a system where every column expires via TTL, the rows persist even though they are empty.
It would be useful to delete such rows as well, once their last column has expired.

I understand that this may be difficult to synchronize across the whole cluster.

If this behavior isn't good for all cases, maybe it could be made configurable per Column Family.

Alternatively, there could be a tool to remove empty rows across the whole cluster. The problem with doing that through the API is the time that passes between the check and the moment the remove is sent.

I think it is preferable for this to be done when the last column expires.
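
The check-then-remove problem can be sketched with a toy model. This is only an illustration under stated assumptions, not Cassandra's client API: `ToyRow` and `cleanup_race` are hypothetical names, and the model assumes that a row removal with timestamp T hides every column whose timestamp is not newer than T.

```python
class ToyRow:
    """Hypothetical in-memory row; columns and removals carry timestamps."""

    def __init__(self):
        self.columns = {}       # column name -> (value, timestamp)
        self.deleted_at = None  # timestamp of the latest row removal, if any

    def write(self, name, value, ts):
        self.columns[name] = (value, ts)

    def remove(self, ts):
        # Keep the newest removal timestamp.
        if self.deleted_at is None or ts > self.deleted_at:
            self.deleted_at = ts

    def live(self):
        # A column is visible only if it is newer than the row removal.
        return {
            name: value
            for name, (value, ts) in self.columns.items()
            if self.deleted_at is None or ts > self.deleted_at
        }


def cleanup_race(remove_ts):
    """Simulate a cleanup tool racing with a concurrent writer."""
    row = ToyRow()
    is_empty = not row.live()       # t=1: the tool checks that the row is empty
    row.write("a", "v", ts=2)       # t=2: another client writes a column
    if is_empty:
        row.remove(ts=remove_ts)    # t=3: the tool's remove arrives
    return row.live()
```

In this toy model, `cleanup_race(3)` loses the concurrent write because the remove is stamped after it, while `cleanup_race(1)` (the remove stamped with the time of the check) leaves the new column visible.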

Thanks in advance.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (CASSANDRA-2795) Autodelete empty rows

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne resolved CASSANDRA-2795.
-----------------------------------------

    Resolution: Not A Problem

What you are seeing is range ghosts: http://wiki.apache.org/cassandra/FAQ#range_ghosts

The row *is* correctly deleted when all of its columns expire. It won't show up as a range ghost once gc_grace seconds have passed.



[jira] [Commented] (CASSANDRA-2795) Autodelete empty rows

Posted by "Pau Rodriguez (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051910#comment-13051910 ] 

Pau Rodriguez commented on CASSANDRA-2795:
------------------------------------------

*For any future reader:*
Here I'm trying to optimize Cassandra's cache by shortening the lifetime of ghost rows.
This is not about removing ghosts from query results! For that, read the FAQ entry on range_ghosts:
http://wiki.apache.org/cassandra/FAQ#range_ghosts

----

OK, it worked when I tested it as you described.

So if I want short-lived ghost rows, which parameters do I need to configure?
I suppose I need to configure gc_grace, the flush timer and the compaction timer.
Where are these options set?

Thanks.



[jira] [Commented] (CASSANDRA-2795) Autodelete empty rows

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051909#comment-13051909 ] 

Sylvain Lebresne commented on CASSANDRA-2795:
---------------------------------------------

bq. I tested setting gc_grace very low (tried 0 and 1) in a single node, and the row didn't disappear.

OK, to be precise, a compaction needs to occur after gc_grace has passed. So you'll need to flush after the insertion, wait for the column to expire, force a compaction, wait for it to finish, and then run the request.
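
The sequence above (insert, flush, wait for expiry, compact, query) can be sketched with a toy single-node model. This is a simplified illustration, not Cassandra's storage engine: `ToyNode`, `Column` and all method names are hypothetical, there is a single "sstable", and compaction is assumed to purge a column once it has been expired for at least gc_grace seconds.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    written_at: int  # time of the insert, in seconds
    ttl: int         # time to live, in seconds

    def expires_at(self) -> int:
        return self.written_at + self.ttl


@dataclass
class ToyNode:
    """Toy single-node model (hypothetical; for illustration only)."""
    gc_grace: int
    memtable: dict = field(default_factory=dict)  # row key -> {column name: Column}
    sstable: dict = field(default_factory=dict)   # flushed data, same layout

    def insert(self, key, name, now, ttl):
        self.memtable.setdefault(key, {})[name] = Column(now, ttl)

    def flush(self):
        # Move everything from the memtable into the single toy sstable.
        for key, cols in self.memtable.items():
            self.sstable.setdefault(key, {}).update(cols)
        self.memtable.clear()

    def compact(self, now):
        # Purge flushed columns that expired at least gc_grace seconds ago,
        # then drop rows that have no columns left.
        for key in list(self.sstable):
            cols = self.sstable[key]
            purgeable = [n for n, c in cols.items()
                         if now >= c.expires_at() + self.gc_grace]
            for name in purgeable:
                del cols[name]
            if not cols:
                del self.sstable[key]

    def live_columns(self, key, now):
        # Columns of the row that have not expired yet.
        names = set()
        for store in (self.memtable, self.sstable):
            for name, col in store.get(key, {}).items():
                if now < col.expires_at():
                    names.add(name)
        return sorted(names)

    def range_keys(self):
        # A range query lists every key still stored, even if no column is live.
        return sorted(set(self.memtable) | set(self.sstable))
```

In this model a row inserted with a TTL keeps appearing in `range_keys()` after its column expires (a range ghost) and only disappears after `flush()` followed by a `compact(now)` call made once expiry plus gc_grace has passed. Compacting without a prior flush changes nothing, because the toy compaction only looks at flushed data.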



[jira] [Commented] (CASSANDRA-2795) Autodelete empty rows

Posted by "Pau Rodriguez (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051894#comment-13051894 ] 

Pau Rodriguez commented on CASSANDRA-2795:
------------------------------------------

That's what I first read.
I tested setting gc_grace very low (tried 0 and 1) in a single node, and the row didn't disappear.
And in the same scenario, if I send the delete command via cassandra-cli, the row disappears instantly.



[jira] [Commented] (CASSANDRA-2795) Autodelete empty rows

Posted by "Pau Rodriguez (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051924#comment-13051924 ] 

Pau Rodriguez commented on CASSANDRA-2795:
------------------------------------------

On IRC, {{ntelford}} told me not to worry about this: the size of a ghost row is too small to matter (little more than the size of the key), and reducing the compaction/flush timers would likely hurt performance more than it would improve it.

So I'm closing this issue.

Sorry for the inconvenience.

