Posted to user@cassandra.apache.org by Omer van der Horst Jansen <om...@gmail.com> on 2011/02/03 16:45:50 UTC

Mitigating CASSANDRA-2059 -- leftover files

Jonathan pointed out in another thread that it looks like I'm running
into CASSANDRA-2059, where secondary files are not being properly
deleted. My production data set at any given time is less than 100 MB
in size, but the Cassandra data directories on each instance are using
30 to 40 times as much space right now, and steadily growing.

I understand I can remove the root cause of the problem by applying
the patch that's attached to the bug report or by upgrading to 0.7.1
when it's out.

In the meantime, is it safe to manually delete stale files while
Cassandra is running?  And how do I determine when a set of files is
stale?

I'd assume that a given set of files is deletable if there is no
-Data.db file and the -Compacted file has zero length.

Example of what I would think is a set of stale files, without a -Data.db file:

ls -l *3090*
-rw-rw-r-- 1 user group    0 Feb  3 10:00 Payload-e-3090-Compacted
-rw-rw-r-- 1 user group  245 Feb  3 10:00 Payload-e-3090-Filter.db
-rw-rw-r-- 1 user group 4362 Feb  3 10:00 Payload-e-3090-Index.db
-rw-rw-r-- 1 user group 4840 Feb  3 10:00 Payload-e-3090-Statistics.db

I've got these all the way back to Payload-e-1-Index.db.

Non-stale files:
ls -l *3095*
-rw-rw-r-- 1 user group        0 Feb  3 10:35 Payload-e-3095-Compacted
-rw-rw-r-- 1 user group 41269735 Feb  3 10:14 Payload-e-3095-Data.db
-rw-rw-r-- 1 user group   286405 Feb  3 10:14 Payload-e-3095-Filter.db
-rw-rw-r-- 1 user group  7608022 Feb  3 10:14 Payload-e-3095-Index.db
-rw-rw-r-- 1 user group     4840 Feb  3 10:14 Payload-e-3095-Statistics.db

There is an active Data.db file, so I'd leave this group alone.
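
A minimal sketch of that check, for illustration only. It assumes the
file naming seen in the listings above (a prefix ending in a generation
number, followed by a component name), and the function name
find_stale_sets is hypothetical, not part of Cassandra. It only reports
candidate groups (no -Data.db file, zero-length -Compacted marker) and
deletes nothing:

```python
import os
import re
from collections import defaultdict

# Matches names such as Payload-e-3090-Filter.db or Payload-e-3090-Compacted.
# This pattern is an assumption derived from the listings in this thread.
COMPONENT_RE = re.compile(r"^(?P<prefix>.+-\d+)-(?P<component>[A-Za-z]+)(?:\.db)?$")


def find_stale_sets(data_dir):
    """Return {prefix: [file names]} for groups that have no -Data.db
    file and a zero-length -Compacted marker."""
    groups = defaultdict(dict)
    for name in os.listdir(data_dir):
        match = COMPONENT_RE.match(name)
        if match:
            groups[match.group("prefix")][match.group("component")] = name

    stale = {}
    for prefix, components in groups.items():
        if "Data" in components:
            continue  # a Data.db file is present, so leave the group alone
        marker = components.get("Compacted")
        if marker is None:
            continue  # no -Compacted marker, so the criterion does not apply
        if os.path.getsize(os.path.join(data_dir, marker)) != 0:
            continue  # the criterion requires a zero-length marker
        stale[prefix] = sorted(components.values())
    return stale
```

Calling find_stale_sets on a column family's data directory and
reviewing the returned groups by hand before removing anything keeps the
destructive step manual.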

--Omer

Re: Mitigating CASSANDRA-2059 -- leftover files

Posted by Jonathan Ellis <jb...@gmail.com>.
On Thu, Feb 3, 2011 at 7:45 AM, Omer van der Horst Jansen
<om...@gmail.com> wrote:
> In the meantime, is it safe to manually delete stale files while
> Cassandra is running?  And how do I determine when a set of files is
> stale?
>
> I'd assume that a given set of files is deletable if there is no
> -Data.db file and the -Compacted file has zero length.

Yes, that should work.

Also, restarting a node will clean them out.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com