Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2009/12/08 19:46:18 UTC
[jira] Resolved: (CASSANDRA-604) Compactions might remove tombstones without removing the actual data
[ https://issues.apache.org/jira/browse/CASSANDRA-604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis resolved CASSANDRA-604.
--------------------------------------
Resolution: Fixed
Committed. Let us know if your testing uncovers any problems.
> Compactions might remove tombstones without removing the actual data
> --------------------------------------------------------------------
>
> Key: CASSANDRA-604
> URL: https://issues.apache.org/jira/browse/CASSANDRA-604
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Cent-OS
> Reporter: Ramzi Rabah
> Assignee: Jonathan Ellis
> Fix For: 0.5
>
> Attachments: 604.patch
>
>
> I was looking at the compaction code and noticed that when compactions run
> during the normal course of Cassandra, we call:
>     for (List<SSTableReader> sstables :
>              getCompactionBuckets(ssTables_, 50L * 1024L * 1024L))
>     {
>         if (sstables.size() < minThreshold)
>         {
>             continue;
>         }
>         // otherwise, do the compaction...
>     }
> where getCompactionBuckets groups files into buckets: very small files go
> together, and other files are grouped when their sizes are within 0.5x-1.5x
> of each other. A bucket is compacted only if it holds at least the minimum
> threshold of files, which is 4 by default.
> So far so good. Now consider this scenario: I have an old entry that I
> inserted a long time ago and that was compacted into a 75MB file.
> There are fewer than 4 such 75MB files. I then do many deletes and end up
> with 4 extra sstable files filled with tombstones, each about 300MB.
> These 4 files are compacted together, and in the compaction code, if
> a tombstone is present we don't copy it over to the new file. Since
> the tombstone files were compacted but the 75MB files were not, the
> tombstone is gone while the data is still intact in the 75MB file.
> If we compacted all the files together I don't think that would be a
> problem, but since we only compact those 4, this can leave deleted data
> not cleaned up.
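
The scenario in the description can be reproduced with a small model. The sketch below is not Cassandra code: the class, the Table type, and every method name are hypothetical, sizes are in MB, a null value stands for a tombstone, and the bucketing rule is a simplified reading of the 0.5-1.5 size ratio described above. The second argument of buckets() is assumed to play the role of the 50MB "very small files" cutoff.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model for illustration only; none of these names exist in Cassandra.
class TombstoneCompactionSketch {
    // Stand-in for an sstable: a size in MB plus rows, where a null value marks a tombstone.
    static final class Table {
        final long sizeMb;
        final Map<String, String> rows;
        Table(long sizeMb, Map<String, String> rows) { this.sizeMb = sizeMb; this.rows = rows; }
    }

    // Simplified bucketing: a table joins a bucket if its size is within 0.5x-1.5x of the
    // bucket's average size, or if both are below smallMb (assumed "very small" cutoff).
    static List<List<Table>> buckets(List<Table> tables, long smallMb) {
        List<List<Table>> out = new ArrayList<>();
        for (Table t : tables) {
            List<Table> home = null;
            for (List<Table> b : out) {
                double avg = averageSize(b);
                boolean similar = t.sizeMb > 0.5 * avg && t.sizeMb < 1.5 * avg;
                boolean bothSmall = t.sizeMb < smallMb && avg < smallMb;
                if (similar || bothSmall) { home = b; break; }
            }
            if (home == null) { home = new ArrayList<>(); out.add(home); }
            home.add(t);
        }
        return out;
    }

    static double averageSize(List<Table> bucket) {
        double sum = 0;
        for (Table t : bucket) sum += t.sizeMb;
        return sum / bucket.size();
    }

    // Merge oldest-to-newest so newer entries win, then drop tombstones from the output,
    // which is the behavior the report describes. The merged size is a rough upper bound.
    static Table compact(List<Table> bucket) {
        Map<String, String> merged = new HashMap<>();
        long size = 0;
        for (Table t : bucket) { merged.putAll(t.rows); size += t.sizeMb; }
        merged.values().removeIf(v -> v == null);
        return new Table(size, merged);
    }

    // One pass: compact only buckets that reach minThreshold; leave the rest untouched.
    static List<Table> compactOnce(List<Table> tables, int minThreshold) {
        List<Table> result = new ArrayList<>(tables);
        for (List<Table> bucket : buckets(tables, 50)) {
            if (bucket.size() < minThreshold) continue;
            result.removeAll(bucket);
            result.add(compact(bucket));
        }
        return result;
    }

    // Read path: scan oldest-to-newest; the newest entry wins, and a tombstone reads as null.
    static String read(String key, List<Table> tables) {
        String value = null;
        for (Table t : tables) {
            if (t.rows.containsKey(key)) value = t.rows.get(key);
        }
        return value;
    }

    // The reported scenario: one old 75MB table holding the data, then four 300MB
    // tables carrying tombstones for the same key.
    static List<Table> scenario() {
        List<Table> tables = new ArrayList<>();
        Map<String, String> old = new HashMap<>();
        old.put("k", "v");
        tables.add(new Table(75, old));
        for (int i = 0; i < 4; i++) {
            Map<String, String> deletes = new HashMap<>();
            deletes.put("k", null);
            tables.add(new Table(300, deletes));
        }
        return tables;
    }

    public static void main(String[] args) {
        List<Table> before = scenario();
        List<Table> after = compactOnce(before, 4);
        System.out.println("before: " + read("k", before));
        System.out.println("after:  " + read("k", after));
    }
}
```

In this model the read of "k" returns null before the compaction pass (the tombstones hide the old value) and "v" afterwards, because the 75MB table sits alone in its bucket and is never merged with the tombstones that were dropped.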