
Posted to user@cassandra.apache.org by Ruchir Jha <ru...@gmail.com> on 2014/05/08 15:24:42 UTC

Re: clearing tombstones?

I tried this; however, the doubling in disk space is not "temporary" as
you state in your note. What am I missing?


On Fri, Apr 11, 2014 at 10:44 AM, William Oberman
<ob...@civicscience.com> wrote:

> So, if I was impatient and just "wanted to make this happen now", I could:
>
> 1.) Change GCGraceSeconds of the CF to 0
> 2.) run nodetool compact (*)
> 3.) Change GCGraceSeconds of the CF back to 10 days
>
> Since I have ~900M tombstones, even if I miss a few due to impatience, I
> don't care *that* much as I could re-run my clean up tool against the now
> much smaller CF.
>
> (*) A long long time ago I seem to recall reading advice about "don't ever
> run nodetool compact", but I can't remember why.  Is there any bad long
> term consequence?  Short term there are several:
> -a heavy operation
> -temporary 2x disk space
> -one big SSTable afterwards
> But moving forward, everything is ok right?  CommitLog/MemTable->SStables,
> minor compactions that merge SSTables, etc...  The only flaw I can think of
> is it will take forever until the SSTable minor compactions build up enough
> to consider including the big SSTable in a compaction, making it likely
> I'll have to self manage compactions.
>
>
>
> On Fri, Apr 11, 2014 at 10:31 AM, Mark Reddy <ma...@boxever.com> wrote:
>
>> Correct, a tombstone will only be removed after the gc_grace period has
>> elapsed. The default value is set to 10 days which allows a great deal of
>> time for consistency to be achieved prior to deletion. If you are
>> operationally confident that you can achieve consistency via anti-entropy
>> repairs within a shorter period you can always reduce that 10 day interval.
>>
>>
>> Mark
>>
>>
>> On Fri, Apr 11, 2014 at 3:16 PM, William Oberman <
>> oberman@civicscience.com> wrote:
>>
>>> I'm seeing a lot of articles about a dependency between removing
>>> tombstones and GCGraceSeconds, which might be my problem (I just checked,
>>> and this CF has GCGraceSeconds of 10 days).
>>>
>>>
>>> On Fri, Apr 11, 2014 at 10:10 AM, tommaso barbugli <tb...@gmail.com> wrote:
>>>
>>>> Compaction should take care of it; for me it never worked, so I run
>>>> nodetool compact on every node; that does it.
>>>>
>>>>
>>>> 2014-04-11 16:05 GMT+02:00 William Oberman <ob...@civicscience.com>:
>>>>
>>>>> I'm wondering what will clear tombstoned rows?  nodetool cleanup,
>>>>> nodetool repair, or time (as in just wait)?
>>>>>
>>>>> I had a CF that was more or less storing session information.  After
>>>>> some time, we decided that one piece of this information was pointless to
>>>>> track (and was 90%+ of the columns, and in 99% of those cases was ALL
>>>>> columns for a row).   I wrote a process to remove all of those columns
>>>>> (which again in a vast majority of cases had the effect of removing the
>>>>> whole row).
>>>>>
>>>>> This CF had ~1 billion rows, so I expect to be left with ~100m rows.
>>>>>  After I did this mass delete, everything was the same size on disk (which
>>>>> I expected, knowing how tombstoning works).  It wasn't 100% clear to me
>>>>> what to poke to cause compactions to clear the tombstones.  First I tried
>>>>> nodetool cleanup on a candidate node.  But, afterwards the disk usage was
>>>>> the same.  Then I tried nodetool repair on that same node.  But again, disk
>>>>> usage is still the same.  The CF has no snapshots.
>>>>>
>>>>> So, am I misunderstanding something?  Is there another operation to
>>>>> try?  Do I have to "just wait"?  I've only done cleanup/repair on one node.
>>>>>  Do I have to run one or the other over all nodes to clear tombstones?
>>>>>
>>>>> Cassandra 1.2.15 if it matters,
>>>>>
>>>>> Thanks!
>>>>>
>>>>> will
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>
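
For reference, the gc_grace rule discussed in the quoted thread can be
sketched in a few lines of Python. This is a simplified model, not
Cassandra's implementation: the name is_purgeable and its parameters are
made up for illustration, timestamps are plain epoch seconds, and it
ignores the further condition (as I understand it) that a tombstone can
only be dropped when the compaction covers every SSTable that may still
hold data it shadows.

```python
# Simplified model of tombstone eligibility; names here are hypothetical.
DEFAULT_GC_GRACE_SECONDS = 10 * 24 * 60 * 60  # 10 days, the default cited in the thread


def is_purgeable(deleted_at: int, now: int,
                 gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> bool:
    """Return True if a tombstone written at `deleted_at` (epoch seconds)
    is older than the grace period as of `now`, so that a compaction
    would be allowed to drop it."""
    return deleted_at < now - gc_grace_seconds


if __name__ == "__main__":
    now = 1_400_000_000          # arbitrary example timestamp
    one_day_ago = now - 86_400
    eleven_days_ago = now - 11 * 86_400

    print(is_purgeable(one_day_ago, now))                      # still inside the 10-day window
    print(is_purgeable(eleven_days_ago, now))                  # past the 10-day window
    print(is_purgeable(one_day_ago, now, gc_grace_seconds=0))  # grace period set to 0
```

Under this model, setting the grace period to 0 makes every existing
tombstone eligible at the next compaction, which is why the three-step
approach above works. The trade-off is the one Mark describes: a replica
that missed a delete has no grace window in which repair can catch up.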

Re: clearing tombstones?

Posted by William Oberman <ob...@civicscience.com>.

Not an expert, just a user of Cassandra. For me, "before" was a CF with a
set of files (I forget the official naming scheme, so I'll make up my own):
A0
A1
...
AN

"During":
A0
A1
...
AN
B0

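Where B0 is the merged union of the Ai. Because tombstones, overwritten
values, etc. are dropped during the merge, B0 is no larger than all of the
Ai combined, so total disk usage is "at most" 2x, and probably close to 2x
(unless your data is nearly all tombstones, like mine).

"After":
B0

Cassandra can then clean up the Ai, though I'm not sure exactly when that
happens.

I'm not sure which of the states above you are in. It sounds like you are
somewhere between "during" and "after".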
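
The before/during/after picture above can be put into numbers with a small
Python sketch. It is only an illustration of the arithmetic: the function
name, the surviving_fraction parameter, and the example sizes are all made
up, and it assumes the old files are deleted only once the merged file is
complete.

```python
# Back-of-the-envelope model of disk usage around a major compaction.
# All names and numbers are hypothetical.
def compaction_disk_usage(sstable_sizes, surviving_fraction):
    """Estimate disk usage before, during, and after merging all SSTables.

    sstable_sizes      -- sizes of the existing files A0..AN (any unit)
    surviving_fraction -- share of the data that is still live after
                          tombstoned and overwritten values are dropped
    """
    if not 0.0 <= surviving_fraction <= 1.0:
        raise ValueError("surviving_fraction must be between 0 and 1")

    before = sum(sstable_sizes)           # A0..AN only
    merged = before * surviving_fraction  # size of B0
    return {
        "before": before,
        "during": before + merged,        # A0..AN and B0 coexist
        "after": merged,                  # only B0 remains
    }


if __name__ == "__main__":
    sizes_gb = [40, 30, 20, 10]  # hypothetical SSTable sizes in GB

    # Nothing to purge: peak usage is the full 2x.
    print(compaction_disk_usage(sizes_gb, 1.0))

    # Mostly tombstones: B0 is small, so the peak stays well under 2x.
    print(compaction_disk_usage(sizes_gb, 0.25))
```

If usage stays at the "during" figure long after the compaction has
finished, the old files have not been removed yet, which is the in-between
state described above.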

Will

On Thursday, May 8, 2014, Ruchir Jha <ru...@gmail.com> wrote:

> I tried to do this, however the doubling in disk space is not "temporary"
> as you state in your note. What am I missing?

-- 
Will Oberman
Civic Science, Inc.
6101 Penn Avenue, Fifth Floor
Pittsburgh, PA 15206
(M) 412-480-7835
(E) oberman@civicscience.com