Posted to user@cassandra.apache.org by manish khandelwal <ma...@gmail.com> on 2021/07/05 16:24:46 UTC

How to remove tombstones in a levelled compaction table in Cassandra 2.1.16?

In one of our LCS tables, auto compaction was disabled. Now, after years of
running, range queries using spark-cassandra-connector are failing. The
Cassandra version is 2.1.16.

I suspect that because autocompaction was disabled, lots of tombstones have
accumulated, and reading them is now causing queries to time out. Am I right
in my thinking? What is a possible way out of this?

I thought of using major compaction, but for LCS that was introduced in
Cassandra 2.2. Also, user defined compactions don't work on LCS tables.



Regards

Manish Khandelwal

Re: How to remove tombstones in a levelled compaction table in Cassandra 2.1.16?

Posted by manish khandelwal <ma...@gmail.com>.
Thanks Jeff and Vytenis.

Jeff, could you explain what you mean by:

> If you just pipe all of your sstables to user defined compaction jmx
> endpoint one at a time you’ll purge many of the tombstones as long as you
> don’t have a horrific data model.


Regards
Manish

On Wed, Jul 7, 2021 at 4:21 AM Jeff Jirsa <jj...@gmail.com> wrote:

> In 2.1 the only options are to enable auto compaction or to queue up manual
> user defined compactions.
>
> If you just pipe all of your sstables to user defined compaction jmx
> endpoint one at a time you’ll purge many of the tombstones as long as you
> don’t have a horrific data model.
>
>
>
> On Jul 6, 2021, at 3:03 PM, vytenis silgalis <vs...@gmail.com> wrote:
>
> 
> You might want to take a look at the `unchecked_tombstone_compaction` table
> setting. The best way to see if this is affecting you is to look at the
> sstablemetadata for the sstables and see if your tombstone ratio is higher
> than the configured tombstone_threshold ratio (0.2 by default) for the
> table.
>
> For example, the table has a tombstone_threshold of 0.2 but you see
> sstables older than 10 days with a higher ratio (LCS has a tombstone
> compaction interval of 10 days; it won't run a tombstone compaction until
> an sstable is at least 10 days old).
>
> > sstablemetadata example-ka-1233-Data.db | grep droppable
> Estimated droppable tombstones: 1.0
> ^ This is an extreme example, but anything greater than 0.2 on a 10+ day
> old sstable is a problem.
>
> By default the unchecked_tombstone_compaction setting is false which will
> lead to tombstones staying around if a partition spans multiple sstables
> (which may happen with LCS over a long period).
>
> Try setting `unchecked_tombstone_compaction` to true. Note that when you
> first apply this, any sstables above the tombstone_threshold setting for
> that table will be compacted, which may cause extra load on the cluster.
>
> Vytenis
> ... always do your own research and verify what people say. :)
>
> On Mon, Jul 5, 2021 at 10:11 PM manish khandelwal <
> manishkhandelwal03@gmail.com> wrote:
>
>> Thanks Kane for the suggestion.
>>
>> Regards
>> Manish
>>
>> On Tue, Jul 6, 2021 at 6:19 AM Kane Wilson <k...@raft.so> wrote:
>>
>>>
>>>> In one of our LCS table auto compaction was disabled. Now after years of
>>>> run, range queries using spark-cassandra-connector are failing. Cassandra
>>>> version is 2.1.16.
>>>>
>>>> I suspect due to disabling of autocompaction lots of tombstones got
>>>> created. And now while reading those are creating issues and queries are
>>>> getting timed out. Am I right in my thinking? What is the possible way to
>>>> get out of this?
>>>>
>>>> I thought of using major compaction but for LCS that was introduced in
>>>> Cassandra 2.2. Also user defined compactions dont work on LCS tables.
>>>>
>>>>
>>>>
>>>> Regards
>>>>
>>>> Manish Khandelwal
>>>>
>>>
>>> If it's tombstones specifically you'll be able to see errors in the logs
>>> regarding passing the tombstone limit. However, disabling compactions could
>>> cause lots of problems (especially over years). I wouldn't be surprised if
>>> your reads are slow purely because of the number of SSTables you're hitting
>>> on each read. Given you've been running without compactions for so long you
>>> might want to look at just switching to STCS and re-enabling compactions.
>>> Note this should be done with care, as it could cause performance/storage
>>> issues.
>>>
>>> Cheers,
>>> Kane
>>>
>>> --
>>> raft.so - Cassandra consulting, support, and managed services
>>>
>>

Re: How to remove tombstones in a levelled compaction table in Cassandra 2.1.16?

Posted by Jeff Jirsa <jj...@gmail.com>.
In 2.1 the only options are to enable auto compaction or to queue up manual user defined compactions.

If you just pipe all of your sstables to user defined compaction jmx endpoint one at a time you’ll purge many of the tombstones as long as you don’t have a horrific data model.
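
As a rough illustration of the "one sstable at a time" idea, the sketch below only builds command lines and does not talk to Cassandra. It assumes jmxterm as the JMX client, the CompactionManager MBean name shown, and a single-argument forceUserDefinedCompaction operation that accepts a Data.db file name. All three are assumptions to check against your own 2.1.16 node (jconsole lists the bean and its operations), and the function names are made up for this example.

```python
from pathlib import Path

# Assumed MBean and operation names; verify them on your node before use.
COMPACTION_MBEAN = "org.apache.cassandra.db:type=CompactionManager"
OPERATION = "forceUserDefinedCompaction"


def data_files(table_dir):
    """Return the names of the *-Data.db files directly under table_dir, sorted."""
    return sorted(p.name for p in Path(table_dir).glob("*-Data.db"))


def jmxterm_commands(file_names):
    """Build one jmxterm 'run' line per sstable so each is compacted on its own."""
    return [f"run -b {COMPACTION_MBEAN} {OPERATION} {name}" for name in file_names]
```

Calling `jmxterm_commands(data_files(table_dir))` with the table's data directory gives one line per sstable. Sending those lines to jmxterm one by one, and waiting for each compaction to finish before sending the next, should keep the extra load small.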




Re: How to remove tombstones in a levelled compaction table in Cassandra 2.1.16?

Posted by vytenis silgalis <vs...@gmail.com>.
You might want to take a look at the `unchecked_tombstone_compaction` table
setting. The best way to see if this is affecting you is to look at the
sstablemetadata for the sstables and see if your tombstone ratio is higher
than the configured tombstone_threshold ratio (0.2 by default) for the
table.

For example, the table has a tombstone_threshold of 0.2 but you see
sstables older than 10 days with a higher ratio (LCS has a tombstone
compaction interval of 10 days; it won't run a tombstone compaction until
an sstable is at least 10 days old).

> sstablemetadata example-ka-1233-Data.db | grep droppable
Estimated droppable tombstones: 1.0
^ This is an extreme example, but anything greater than 0.2 on a 10+ day
old sstable is a problem.
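
The check described above can be sketched in a few lines. This is only an illustration: it parses text in the shape of the sstablemetadata output shown here, and the function names are invented for the example. The 0.2 threshold and the 10-day age are the figures quoted in this message; compare them with the tombstone_threshold and tombstone_compaction_interval actually configured on your table.

```python
import re

# Matches the line printed by sstablemetadata, e.g. "Estimated droppable tombstones: 1.0"
DROPPABLE = re.compile(r"Estimated droppable tombstones:\s*([0-9.eE+-]+)")


def droppable_ratio(metadata_text):
    """Extract the droppable-tombstone estimate from sstablemetadata output, or None."""
    match = DROPPABLE.search(metadata_text)
    return float(match.group(1)) if match else None


def needs_attention(ratio, age_days, threshold=0.2, min_age_days=10):
    """True when an sstable is at least min_age_days old and above the threshold."""
    return ratio is not None and age_days >= min_age_days and ratio > threshold
```

How the sstable age is obtained (file modification time, for instance) is left out on purpose, since that depends on how the files have been handled.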

By default the unchecked_tombstone_compaction setting is false which will
lead to tombstones staying around if a partition spans multiple sstables
(which may happen with LCS over a long period).

Try setting `unchecked_tombstone_compaction` to true. Note that when you
first apply this, any sstables above the tombstone_threshold setting for
that table will be compacted, which may cause extra load on the cluster.
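
For reference, a sketch of how the setting might be applied. The helper below only builds the CQL text, and its name, the keyspace and the table are placeholders. One thing to keep in mind is that ALTER TABLE replaces the whole compaction map, so the strategy class and any sub-options already set on the table (sstable_size_in_mb, for example) have to be repeated alongside the new option.

```python
def enable_unchecked_tombstone_compaction(keyspace, table, extra_options=None):
    """Return a CQL ALTER TABLE statement that keeps LCS and turns the option on."""
    options = {
        "class": "LeveledCompactionStrategy",
        "unchecked_tombstone_compaction": "true",
    }
    # Existing sub-options must be passed in here, or they revert to defaults.
    options.update(extra_options or {})
    body = ", ".join(f"'{key}': '{value}'" for key, value in options.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{body}}};"
```

The resulting statement can be pasted into cqlsh after checking it against the output of DESCRIBE TABLE for the table in question.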

Vytenis
... always do your own research and verify what people say. :)


Re: How to remove tombstones in a levelled compaction table in Cassandra 2.1.16?

Posted by manish khandelwal <ma...@gmail.com>.
Thanks Kane for the suggestion.

Regards
Manish


Re: How to remove tombstones in a levelled compaction table in Cassandra 2.1.16?

Posted by Kane Wilson <k...@raft.so>.
> In one of our LCS table auto compaction was disabled. Now after years of
> run, range queries using spark-cassandra-connector are failing. Cassandra
> version is 2.1.16.
>
> I suspect due to disabling of autocompaction lots of tombstones got
> created. And now while reading those are creating issues and queries are
> getting timed out. Am I right in my thinking? What is the possible way to
> get out of this?
>
> I thought of using major compaction but for LCS that was introduced in
> Cassandra 2.2. Also user defined compactions dont work on LCS tables.
>
>
>
> Regards
>
> Manish Khandelwal
>

If it's tombstones specifically you'll be able to see errors in the logs
regarding passing the tombstone limit. However, disabling compactions could
cause lots of problems (especially over years). I wouldn't be surprised if
your reads are slow purely because of the number of SSTables you're hitting
on each read. Given you've been running without compactions for so long you
might want to look at just switching to STCS and re-enabling compactions.
Note this should be done with care, as it could cause performance/storage
issues.
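
A quick way to do the log check mentioned above is sketched here. It makes no assumption about the exact wording of the messages and simply keeps any line that mentions tombstones; the function name is made up for this example, and the log file location depends on your installation.

```python
def tombstone_log_lines(lines):
    """Return the log lines that mention tombstones, ignoring case."""
    return [line.rstrip("\n") for line in lines if "tombstone" in line.lower()]
```

Passing an open log file to the function is enough, since a file object iterates line by line. If nothing comes back while reads are still slow, that would point more towards the sstable count than towards tombstones.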

Cheers,
Kane

-- 
raft.so - Cassandra consulting, support, and managed services