Posted to user@cassandra.apache.org by manish khandelwal <ma...@gmail.com> on 2020/01/21 11:36:03 UTC

Cassandra2.0.14 : Obsolete files not being deleted after compaction

Hi Team

I am observing some obsolete files in Cassandra 2.0.14 that have already been
compacted but were not removed from the system after compaction.
As per CASSANDRA-7872 <https://issues.apache.org/jira/browse/CASSANDRA-7872>,
after the GC grace period has passed these sstables are opened for reads again
and can lead to data resurrection. I am also facing a disk crunch (90% full)
and need to remove those obsolete files ASAP.


To avoid this, what should our strategy be? I am thinking along the following lines:
1. Stop the Cassandra server.
2. Remove the obsolete files manually.
3. Start the Cassandra server.

Regards
Manish

Re: Cassandra2.0.14 : Obsolete files not being deleted after compaction

Posted by Laxmikant Upadhyay <la...@gmail.com>.
Hi,
Just an update: we deleted the obsolete sstables and it worked fine. However, I
have not been able to find any JIRA ticket for this issue.


-- 

regards,
Laxmikant Upadhyay

Re: Cassandra2.0.14 : Obsolete files not being deleted after compaction

Posted by manish khandelwal <ma...@gmail.com>.
Thanks Jeff.

There was no restart between the "Compacting" and "Compacted" log lines, but I
observed that a full repair (-pr) was running at that time and hit errors:

Caused by: java.lang.RuntimeException: java.io.IOException: Cannot proceed
on repair because a neighbor (/aa.bb.cc.dd) is dead: session failed

Does anyone remember any JIRA ticket related to obsolete sstables not being
deleted after compaction?

Regards
Manish

Re: Cassandra2.0.14 : Obsolete files not being deleted after compaction

Posted by Jeff Jirsa <jj...@gmail.com>.
On Tue, Jan 21, 2020 at 8:58 PM manish khandelwal <
manishkhandelwal03@gmail.com> wrote:

> Thanks Nitan, and thanks for your reply.
>
> I am using the following method to find obsolete sstables and just want
> to make sure that I don't delete live data when I delete them.
>
> In the following logs I searched for sstable
> "keyspace-columnfamily-jb-456789" and found that the "CompactionExecutor:1957"
> thread compacted keyspace-columnfamily-jb-123456-Data.db,
> keyspace-columnfamily-jb-234567-Data.db, and
> keyspace-columnfamily-jb-345678-Data.db. These files are still present in my
> data directory, so I am assuming that they are obsolete. Is my assumption correct?
>

The lines from 'Compacting' are the ones obsoleted IF and ONLY IF you see a
completed "Compacted" line for the same thread without a restart in between.


>
> INFO [CompactionExecutor:1957] 2020-01-20 06:44:56,721 CompactionTask.java (line 120) Compacting [SSTableReader(path='/var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-123456-Data.db'), SSTableReader(path='/var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-234567-Data.db'), SSTableReader(path='/var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-345678-Data.db')]
> INFO [CompactionExecutor:1957] 2020-01-20 12:45:23,270 ColumnFamilyStore.java (line 795) Enqueuing flush of Memtable-compactions_in_progress@519967741(0/0 serialized/live bytes, 1 ops)
> INFO [CompactionExecutor:1957] 2020-01-20 12:45:23,502 CompactionTask.java (line 296) Compacted 3 sstables to [/var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-456789,].  136,795,757,524 bytes to 100,529,812,389 (~73% of original) in 21,626,781ms = 4.433055MB/s.  1,738,999,743 total partitions merged to 1,274,232,528.  Partition merge counts were {1:1049583261, 2:309997005, 3:23140824, }
>
>
In this case,
/var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-123456-*,
/var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-234567-*,
and /var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-345678-*
are all obsolete and should be gc'd "soon". If they're not being gc'd,
there's something wrong and you should figure out what's going on. The
cases where this happened in 2.0.x (which is what you're running) were
usually pretty nasty bugs, which is one more reason you should be
upgrading.
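
The rule above (inputs of a "Compacting" line are obsolete only if the same thread later logs "Compacted", with no restart in between) can be sketched in Python. This is a minimal sketch, not a vetted tool: it assumes each log entry has been unwrapped onto one line in the format quoted in this thread, and `restart_marker` is a hypothetical parameter for whatever startup line your own logs contain.

```python
import re

# Patterns follow the log lines quoted in this thread (Cassandra 2.0.x format).
THREAD = re.compile(r"\[(CompactionExecutor:\d+)\]")
INPUT_PATH = re.compile(r"SSTableReader\(path='([^']+)'\)")


def obsolete_inputs(log_lines, restart_marker=None):
    """Return {thread: [input paths]} for compactions that logged both a
    'Compacting' line and a later 'Compacted' line with no restart between."""
    pending = {}   # thread -> inputs of a compaction still in progress
    finished = {}  # thread -> inputs of completed compactions
    for line in log_lines:
        # restart_marker is an assumption: a substring that identifies a
        # node startup in your logs. A restart voids unfinished compactions.
        if restart_marker and restart_marker in line:
            pending.clear()
            continue
        match = THREAD.search(line)
        if not match:
            continue
        thread = match.group(1)
        if " Compacting " in line:
            pending[thread] = INPUT_PATH.findall(line)
        elif " Compacted " in line and thread in pending:
            finished.setdefault(thread, []).extend(pending.pop(thread))
    return finished
```

Feeding it the three log lines quoted above should report the 123456, 234567 and 345678 data files under `CompactionExecutor:1957`. Treat the result as a list of candidates to double-check by hand, not as permission to delete.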

Note that if you just `rm` those files, you'll probably throw FileNotFound
exceptions and break the node until you restart, which is bad. You'd have
to stop the host, confirm everything is shut down, then remove that 137GB
worth of input files if they still exist.
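
As a rough illustration of that procedure, the Python sketch below first builds a dry-run plan (every component file of the named sstables, with sizes) and deletes only when the caller explicitly states that the node is stopped. The function names and the `node_is_stopped` flag are hypothetical; the script does not itself verify that Cassandra is down, so that check remains a manual step.

```python
import glob
import os


def plan_removal(data_dir, sstable_prefixes):
    """Dry run: list (path, size_in_bytes) for every component file of the
    given sstables, e.g. prefix 'keyspace-columnfamily-jb-123456'."""
    plan = []
    for prefix in sstable_prefixes:
        # The trailing '-' keeps generation 123456 from matching 1234567.
        pattern = os.path.join(glob.escape(data_dir), glob.escape(prefix) + "-*")
        for path in sorted(glob.glob(pattern)):
            plan.append((path, os.path.getsize(path)))
    return plan


def remove_planned(plan, node_is_stopped=False):
    """Delete the planned files, but only when the caller confirms that the
    node has been shut down (this function cannot check that itself)."""
    if not node_is_stopped:
        raise RuntimeError("refusing to delete: confirm Cassandra is stopped first")
    for path, _size in plan:
        if os.path.exists(path):  # the files may already be gone
            os.remove(path)
```

Reviewing the output of `plan_removal` and its total size before calling `remove_planned` gives a chance to catch a wrong prefix while nothing has been touched yet.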

Also, please upgrade to 2.1.20. Your life will probably be much easier
because of it.

As with all things, these are personal opinions and I can't guarantee they're
safe. Manually mucking around with database data files is scary, so make sure
you have a backup, practice in a lab, etc.



Re: Cassandra2.0.14 : Obsolete files not being deleted after compaction

Posted by manish khandelwal <ma...@gmail.com>.
Thanks Nitan, and thanks for your reply.

I am using the following method to find obsolete sstables and just want to
make sure that I don't delete live data when I delete them.

In the following logs I searched for sstable
"keyspace-columnfamily-jb-456789" and found that the "CompactionExecutor:1957"
thread compacted keyspace-columnfamily-jb-123456-Data.db,
keyspace-columnfamily-jb-234567-Data.db, and
keyspace-columnfamily-jb-345678-Data.db. These files are still present in my
data directory, so I am assuming that they are obsolete. Is my assumption correct?

INFO [CompactionExecutor:1957] 2020-01-20 06:44:56,721 CompactionTask.java (line 120) Compacting [SSTableReader(path='/var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-123456-Data.db'), SSTableReader(path='/var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-234567-Data.db'), SSTableReader(path='/var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-345678-Data.db')]
INFO [CompactionExecutor:1957] 2020-01-20 12:45:23,270 ColumnFamilyStore.java (line 795) Enqueuing flush of Memtable-compactions_in_progress@519967741(0/0 serialized/live bytes, 1 ops)
INFO [CompactionExecutor:1957] 2020-01-20 12:45:23,502 CompactionTask.java (line 296) Compacted 3 sstables to [/var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-456789,].  136,795,757,524 bytes to 100,529,812,389 (~73% of original) in 21,626,781ms = 4.433055MB/s.  1,738,999,743 total partitions merged to 1,274,232,528.  Partition merge counts were {1:1049583261, 2:309997005, 3:23140824, }
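
For anyone reading the "Compacted" line, its figures can be cross-checked with a few lines of Python. This is only an arithmetic sketch using the numbers from the log above; the interpretation that the rate is output bytes in units of 1024*1024 bytes is an inference from the numbers matching, not something stated in the thread.

```python
# Figures copied from the "Compacted" log line above.
bytes_in = 136_795_757_524   # combined size of the three input sstables
bytes_out = 100_529_812_389  # size of the resulting sstable
elapsed_ms = 21_626_781

# Output as a fraction of input; the log reports this as "~73% of original".
ratio = bytes_out / bytes_in

# Rate computed from output bytes, assuming 1 MB = 1024 * 1024 bytes.
mb_per_s = (bytes_out / 1024**2) / (elapsed_ms / 1000)

# Space held by the obsolete inputs, in decimal gigabytes.
input_gb = bytes_in / 1e9

print(f"{ratio:.0%} of original, {mb_per_s:.6f} MB/s, {input_gb:.1f} GB of inputs")
```

The input total is what the obsolete files occupy on disk, so it is the amount that removing them would free.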

Regards
Manish

Re: Cassandra2.0.14 : Obsolete files not being deleted after compaction

Posted by Nitan Kainth <ni...@gmail.com>.
If you are certain that you don't need the data, your plan is good. Make sure to delete all the files for any given sequence number, i.e. data, index, TOC, etc.
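
To see which component files belong to each sequence number, a directory listing can be grouped as in the Python sketch below. It assumes the file naming seen in this thread (`<keyspace>-<columnfamily>-jb-<number>-<Component>`); the regular expression and function name are illustrative and not taken from Cassandra itself.

```python
import re
from collections import defaultdict

# Assumed naming, based on the files mentioned in this thread:
#   keyspace-columnfamily-jb-123456-Data.db
NAME = re.compile(r"^.+-jb-(?P<generation>\d+)-(?P<component>[^-]+)$")


def components_by_generation(filenames):
    """Group component names (Data.db, Index.db, ...) by sequence number."""
    groups = defaultdict(list)
    for name in filenames:
        match = NAME.match(name)
        if match:  # names that do not fit the assumed pattern are skipped
            groups[int(match.group("generation"))].append(match.group("component"))
    return dict(groups)
```

Calling it on `os.listdir()` of the table's data directory shows, per sequence number, every component present, which makes it harder to leave a stray index or TOC file behind.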

Regards,
Nitan
Cell: 510 449 9629
