Posted to user@cassandra.apache.org by Curt Allred <cu...@mediosystems.com> on 2012/05/25 20:21:24 UTC

RE: will compaction delete empty rows after all columns expired?

This is an old thread from December 27, 2011.  I interpret the "yes" answer to mean you do not have to explicitly delete an empty row after all of its columns have been deleted; the empty row (i.e. row key) will automatically be deleted eventually (after gc_grace).  Is that true?  I am not seeing that behavior on our v0.7.9 ring.  We are accumulating a large number of old empty rows.  They are taking a lot of space because the row keys are big, and exploding the data size by 10x.  I have read conflicting information on blogs and Cassandra docs.  Someone mentioned that there are both row tombstones and column tombstones, implying that you have to explicitly delete empty rows.  Is that correct?

My basic question is... how do I delete all these empty row keys?

-----------------------------
From: Feng Qu
Sent: Tuesday, December 27, 2011 11:09 AM
Compaction should delete empty rows once gc_grace_seconds has passed, right?
-----------------------------
From: Peter Schuller
Yes.  
But just to be extra clear: Data will not actually be removed until the row in question participates in compaction. Compactions will not be actively triggered by Cassandra for tombstone processing reasons.

Re: will compaction delete empty rows after all columns expired?

Posted by aaron morton <aa...@thelastpickle.com>.
> You can set the gc_grace_secs to a small value and force major compaction after the row is expired.  After that, please check whether the row still exists.
There are some downsides to major compactions. (There have been some recent discussions).

You can provoke (some) minor compactions by:
* setting the min_compaction_threshold to 2 (not sure if nodetool in 0.7 supports this, you may need to make a schema change)
* using nodetool flush
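
To see why a lower min_compaction_threshold plus a flush can provoke minor compactions, here is a rough Python sketch of size-tiered bucketing. It is a simplified model, not Cassandra's code: the function names are made up for illustration, and the 0.5x to 1.5x "similar size" bounds are an assumption about the default behaviour.

```python
from typing import List


def bucket_sstables(sizes: List[int]) -> List[List[int]]:
    """Group sstable sizes into buckets of similar size.

    Assumption: an sstable joins a bucket when its size lies within
    0.5x..1.5x of that bucket's current average size.
    """
    buckets: List[List[int]] = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if 0.5 * avg <= size <= 1.5 * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes: List[int], min_threshold: int) -> List[List[int]]:
    """Buckets holding at least min_threshold sstables are eligible for a minor compaction."""
    return [b for b in bucket_sstables(sizes) if len(b) >= min_threshold]


# Two small flushed sstables and one large, older sstable (sizes in MB, hypothetical).
sizes = [10, 12, 5000]
print(compaction_candidates(sizes, min_threshold=4))  # nothing eligible
print(compaction_candidates(sizes, min_threshold=2))  # the two small sstables are eligible
```

In this model the large sstable never finds a partner of similar size, which is the case where a user defined compaction becomes interesting.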
 
If you have some larger sstables that do not get compacted, try the userDefinedCompaction() method on the CompactionManager MBean via JMX (I may have gotten the names wrong there in 0.7).

> So if I understand... the empty row will only be removed after gc_grace if enough compactions have occurred so that all the column tombstones for the empty row are in a single SSTable file?
We need to know that all the fragments of the row are contained in the sstables in the compaction task. They don't have to be in the same SSTable.

You need tombstones to stop columns written previously from appearing in the results. If we purge the tombstone and a previous column value is in another sstable the delete will be undone. 
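
A small Python sketch of that rule may help. It is a toy model only: each sstable is a dict with one cell per row key, and Cell, compact and the field names are hypothetical, not Cassandra internals. A tombstone is dropped only when gc_grace has elapsed and no sstable outside the compaction task still holds a fragment of the row.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Cell:
    value: Optional[str]  # None marks a tombstone
    timestamp: int        # write timestamp, newest wins on merge
    deleted_at: int = 0   # local deletion time in seconds (tombstones only)


SSTable = Dict[str, Cell]  # row key -> cell (one column per row, for brevity)


def compact(sstables: List[SSTable], chosen: Set[int], now: int, gc_grace: int) -> SSTable:
    """Merge the chosen sstables and purge tombstones that are safe to drop."""
    merged: SSTable = {}
    for i in chosen:
        for key, cell in sstables[i].items():
            if key not in merged or cell.timestamp > merged[key].timestamp:
                merged[key] = cell

    result: SSTable = {}
    for key, cell in merged.items():
        expired = cell.value is None and cell.deleted_at + gc_grace <= now
        # A fragment of the row outside this compaction could be an older value
        # that the tombstone is still shadowing.
        elsewhere = any(key in t for i, t in enumerate(sstables) if i not in chosen)
        if expired and not elsewhere:
            continue  # safe to purge: the row disappears
        result[key] = cell
    return result


old = {"k": Cell("v", timestamp=1)}
new = {"k": Cell(None, timestamp=2, deleted_at=100)}
tables = [old, new]

print(compact(tables, {1}, now=1000, gc_grace=10))     # tombstone kept: "k" also lives in tables[0]
print(compact(tables, {0, 1}, now=1000, gc_grace=10))  # row purged
```

If the first compaction had dropped the tombstone, a later read would find the old value "v" in the other sstable and the delete would be undone.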

If you cannot compact the tombstones away let us know. 

Hope that helps. 

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 31/05/2012, at 2:16 PM, Zhu Han wrote:

> On Thu, May 31, 2012 at 9:31 AM, Curt Allred <cu...@mediosystems.com> wrote:
>> No, these were not wide rows.  They are rows that formerly had one or 2 columns. The columns are deleted but the empty rows don't go away, even after gc_grace_secs.
>
> The empty row goes away only during a compaction after the gc_grace_secs.
>
> You can set the gc_grace_secs to a small value and force major compaction after the row is expired.  After that, please check whether the row still exists.
>
>> So if I understand... the empty row will only be removed after gc_grace if enough compactions have occurred so that all the column tombstones for the empty row are in a single SSTable file?
>>
>> From: aaron morton [mailto:aaron@thelastpickle.com]
>>
>>> Minor compaction will remove the tombstones if the row only exists in the sstable being compacted.
>>>
>>> Are these very wide rows that are constantly written to?
>>>
>>> Cheers
>>>  p.s. cassandra 1.0 really does rock.


Re: will compaction delete empty rows after all columns expired?

Posted by Zhu Han <sc...@gmail.com>.
On Thu, May 31, 2012 at 9:31 AM, Curt Allred <cu...@mediosystems.com> wrote:

> No, these were not wide rows.  They are rows that formerly had one or 2
> columns. The columns are deleted but the empty rows don't go away, even
> after gc_grace_secs.
>

The empty row goes away only during a compaction after the gc_grace_secs.

You can set the gc_grace_secs to a small value and force major compaction
after the row is expired.  After that, please check whether the row still
exists.


> So if I understand... the empty row will only be removed after gc_grace if
> enough compactions have occurred so that all the column tombstones for the
> empty row are in a single SSTable file?
>
> From: aaron morton [mailto:aaron@thelastpickle.com]
>
> Minor compaction will remove the tombstones if the row only exists in the
> sstable being compacted.
>
> Are these very wide rows that are constantly written to?
>
> Cheers
>
>  p.s. cassandra 1.0 really does rock.
>

RE: will compaction delete empty rows after all columns expired?

Posted by Curt Allred <cu...@mediosystems.com>.
No, these were not wide rows.  They are rows that formerly had one or 2 columns. The columns are deleted but the empty rows don't go away, even after gc_grace_secs.

So if I understand... the empty row will only be removed after gc_grace if enough compactions have occurred so that all the column tombstones for the empty row are in a single SSTable file?

From: aaron morton [mailto:aaron@thelastpickle.com]

> Minor compaction will remove the tombstones if the row only exists in the sstable being compacted.
>
> Are these very wide rows that are constantly written to?
>
> Cheers
>  p.s. cassandra 1.0 really does rock.

Re: will compaction delete empty rows after all columns expired?

Posted by aaron morton <aa...@thelastpickle.com>.
Minor compaction will remove the tombstones if the row only exists in the sstable being compacted.

Are these very wide rows that are constantly written to?

Cheers
 p.s. cassandra 1.0 really does rock. 


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com



Re: will compaction delete empty rows after all columns expired?

Posted by Radim Kolar <hs...@filez.com>.
Do not delete empty rows. Doing so refreshes the tombstone, and it will never expire.
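
One way to read that warning, sketched in Python under the assumption that the gc_grace clock runs from the newest delete of the row. The function name is hypothetical; 864000 seconds (10 days) is the default gc_grace_seconds.

```python
from typing import List

GC_GRACE_SECONDS = 864000  # default gc_grace_seconds: 10 days


def earliest_purge_time(delete_times: List[int], gc_grace: int = GC_GRACE_SECONDS) -> int:
    """Earliest time (in seconds) at which a compaction could drop the row tombstone.

    Assumption: the newest delete supersedes older ones, so the clock
    restarts every time the row is deleted again.
    """
    return max(delete_times) + gc_grace


# Row deleted once at t=0.
print(earliest_purge_time([0]))
# Same row deleted again at t=800000, shortly before the first tombstone would expire.
print(earliest_purge_time([0, 800000]))
```

Under this reading, a job that keeps re-deleting empty rows more often than once per gc_grace period would keep pushing the purge time into the future.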