Posted to user@cassandra.apache.org by Brian Spindler <br...@gmail.com> on 2018/01/20 14:33:32 UTC

need to reclaim space with TWCS

Hi, I have several column families using TWCS and it’s great.
Unfortunately we seem to have missed the great advice in Alex’s article
here: http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html about
setting the appropriate aggressive tombstone settings and now we have lots
of timestamp overlaps and disk space to reclaim.



I am trying to figure out the best way out of this. Many of the SSTables whose
timestamps overlap with newer SSTables have a droppable tombstone ratio of about
0.895143957, very close to the 0.90 at which, afaik, the full SSTable will be
dropped.



I’m thinking to do the following immediately:



Set unchecked_tombstone_compaction = true

Set tombstone_compaction_interval = TTL + gc_grace_seconds

Set dclocal_read_repair_chance = 0.0 (currently 0.1)
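
For a sense of scale, the proposed interval can be worked out in seconds. This is only a sketch: the thread does not state the table's real TTL, so the 30-day TTL below is a hypothetical value, and gc_grace_seconds is assumed to be at Cassandra's default of 864000 (10 days).

```python
# Hypothetical values: the thread does not give the table's real TTL.
SECONDS_PER_DAY = 24 * 60 * 60

ttl_seconds = 30 * SECONDS_PER_DAY       # assumed TTL of 30 days
gc_grace_seconds = 10 * SECONDS_PER_DAY  # assumed to be the default, 864000

# Proposed setting: tombstone_compaction_interval = TTL + gc_grace_seconds
proposed_interval = ttl_seconds + gc_grace_seconds

# The default tombstone_compaction_interval is one day.
default_interval = 1 * SECONDS_PER_DAY

print(proposed_interval)                     # 3456000
print(proposed_interval // SECONDS_PER_DAY)  # 40
print(default_interval)                      # 86400
```

With these assumed numbers, an SSTable would have to be 40 days old before a single-SSTable tombstone compaction is even considered, versus one day with the default interval.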



If I do this, can I expect TWCS/C* to reclaim the space from those SSTables
with 0.89* droppable tombstones? Or do I (can I?) manually delete these
files, and will C* just ignore the overlapping data and treat it as tombstoned?




What else should/could be done?



Thank you in advance for your advice,



__________________________________________________

Brian Spindler

Re: need to reclaim space with TWCS

Posted by br...@gmail.com.
Got it.  Thanks again. 


Re: need to reclaim space with TWCS

Posted by Alexander Dejanovski <al...@thelastpickle.com>.
I would turn background read repair off on the table to improve the overlap
issue, but you'll still have foreground read repair if you use quorum reads
anyway.

So put dclocal_... to 0.0.

The commit you're referring to was merged in 3.11.1, as 2.1 is no longer
patched.
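
As an illustration only, the read repair change could be rendered as a CQL statement like the one this sketch prints. The keyspace and table names (`metrics`, `events`) and the helper name are hypothetical; `dclocal_read_repair_chance` is the table property named in the original post.

```python
def read_repair_off_statement(keyspace: str, table: str) -> str:
    """Render a CQL statement that sets DC-local background read repair to 0.0."""
    return (
        f"ALTER TABLE {keyspace}.{table} "
        "WITH dclocal_read_repair_chance = 0.0;"
    )


# "metrics" and "events" are placeholder names, not taken from the thread.
statement = read_repair_off_statement("metrics", "events")
print(statement)
```

The statement would be run once per affected table, for example from cqlsh.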

-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

Re: need to reclaim space with TWCS

Posted by Brian Spindler <br...@gmail.com>.
Hi Alexander, after re-reading
https://issues.apache.org/jira/browse/CASSANDRA-13418 it seems you would
recommend leaving dclocal_read_repair_chance at maybe 10%. Is that true?

Also, has this been patched to 2.1?
https://github.com/thelastpickle/cassandra/commit/58440e707cd6490847a37dc8d76c150d3eb27aab#diff-e8e282423dcbf34d30a3578c8dec15cdR176


Cheers,

-B



Re: need to reclaim space with TWCS

Posted by Brian Spindler <br...@gmail.com>.
Hi Alexander,  Thanks for your response!  I'll give it a shot.


Re: need to reclaim space with TWCS

Posted by Alexander Dejanovski <al...@thelastpickle.com>.
Hi Brian,

You should definitely set unchecked_tombstone_compaction to true and set
tombstone_compaction_interval to its default of 1 day. Try a tombstone_threshold
of 0.6, for example, and see how that works.
Whether tombstones get purged depends on your partitioning, as a tombstone's
partition needs to be fully contained within a single SSTable.

Deleting the SSTables by hand is theoretically possible but should be kept
as a last resort if you're running out of space.
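
The settings above could be applied with a statement like the one this sketch builds. It is not taken from the thread: the keyspace and table names, the helper name, and the existing window options are all hypothetical. It also assumes that ALTER TABLE replaces the whole compaction map, which is why the options already on the table are restated next to the new ones. On 2.1, TWCS is not bundled with Cassandra, so the class name should be whatever the table currently reports.

```python
def compaction_alter_statement(keyspace, table, current_options, overrides):
    """Render ALTER TABLE with the merged compaction option map.

    Existing options are restated so they are not lost when the map
    is replaced; overrides win on conflicting keys.
    """
    merged = {**current_options, **overrides}
    body = ", ".join(f"'{key}': '{value}'" for key, value in merged.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{body}}};"


# Hypothetical existing options; copy the real ones from the table schema.
current = {
    "class": "TimeWindowCompactionStrategy",
    "compaction_window_unit": "DAYS",
    "compaction_window_size": "1",
}

# Values suggested in this reply; 86400 seconds is the 1-day default interval.
suggested = {
    "unchecked_tombstone_compaction": "true",
    "tombstone_threshold": "0.6",
    "tombstone_compaction_interval": "86400",
}

statement = compaction_alter_statement("metrics", "events", current, suggested)
print(statement)
```

Building the map from the table's current options rather than typing it by hand is a small guard against silently dropping the window settings.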

Cheers,

-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

Re: need to reclaim space with TWCS

Posted by Brian Spindler <br...@gmail.com>.
I probably should have mentioned our setup: we’re on Cassandra version
2.1.15.

