Posted to user@cassandra.apache.org by Schubert Zhang <zs...@gmail.com> on 2010/05/11 17:50:59 UTC

Why not to delete the "Compacted" file immediately? What is the policy in 0.6.1?

In the current 0.6.1 release, long after a compaction has finished, the old
SSTable files are still on disk, each marked by a zero-sized
"CFName-id-Compacted" file.

Why not delete them immediately? What is the policy in 0.6.1?

See the following listing for an example.

-rw-rw-r-- 1 cassandra cassandra           0 May 11 23:35 LZO-1263-Compacted
-rw-rw-r-- 1 cassandra cassandra  3643422558 May 11 12:21 LZO-1263-Data.db
-rw-rw-r-- 1 cassandra cassandra      939685 May 11 12:21 LZO-1263-Filter.db
-rw-rw-r-- 1 cassandra cassandra     6034764 May 11 12:21 LZO-1263-Index.db
-rw-rw-r-- 1 cassandra cassandra           0 May 11 23:35 LZO-1597-Compacted
-rw-rw-r-- 1 cassandra cassandra  2365082186 May 11 14:49 LZO-1597-Data.db
-rw-rw-r-- 1 cassandra cassandra      751765 May 11 14:49 LZO-1597-Filter.db
-rw-rw-r-- 1 cassandra cassandra     6034764 May 11 14:49 LZO-1597-Index.db
-rw-rw-r-- 1 cassandra cassandra           0 May 11 23:35 LZO-1614-Compacted
-rw-rw-r-- 1 cassandra cassandra 31064427729 May 11 16:42 LZO-1614-Data.db
-rw-rw-r-- 1 cassandra cassandra      751765 May 11 16:42 LZO-1614-Filter.db
-rw-rw-r-- 1 cassandra cassandra     6034764 May 11 16:42 LZO-1614-Index.db
-rw-rw-r-- 1 cassandra cassandra           0 May 11 23:35 LZO-1879-Compacted
-rw-rw-r-- 1 cassandra cassandra  3536115771 May 11 17:29 LZO-1879-Data.db
-rw-rw-r-- 1 cassandra cassandra      751765 May 11 17:29 LZO-1879-Filter.db
-rw-rw-r-- 1 cassandra cassandra     6034764 May 11 17:29 LZO-1879-Index.db
-rw-rw-r-- 1 cassandra cassandra           0 May 11 23:35 LZO-2049-Compacted
-rw-rw-r-- 1 cassandra cassandra  7910248440 May 11 19:17 LZO-2049-Data.db
-rw-rw-r-- 1 cassandra cassandra     1691365 May 11 19:17 LZO-2049-Filter.db
-rw-rw-r-- 1 cassandra cassandra     6034764 May 11 19:17 LZO-2049-Index.db
-rw-rw-r-- 1 cassandra cassandra           0 May 11 23:35 LZO-2169-Compacted
-rw-rw-r-- 1 cassandra cassandra   417646680 May 11 19:56 LZO-2169-Data.db
-rw-rw-r-- 1 cassandra cassandra     2500645 May 11 19:56 LZO-2169-Filter.db
-rw-rw-r-- 1 cassandra cassandra     6034764 May 11 19:56 LZO-2169-Index.db
-rw-rw-r-- 1 cassandra cassandra           0 May 11 23:35 LZO-2184-Compacted
-rw-rw-r-- 1 cassandra cassandra  3485810361 May 11 20:08 LZO-2184-Data.db
-rw-rw-r-- 1 cassandra cassandra      751765 May 11 20:08 LZO-2184-Filter.db
-rw-rw-r-- 1 cassandra cassandra     6034764 May 11 20:08 LZO-2184-Index.db
-rw-rw-r-- 1 cassandra cassandra           0 May 11 23:35 LZO-2188-Compacted
-rw-rw-r-- 1 cassandra cassandra   663325064 May 11 20:10 LZO-2188-Data.db
-rw-rw-r-- 1 cassandra cassandra      751765 May 11 20:10 LZO-2188-Filter.db
-rw-rw-r-- 1 cassandra cassandra     6034764 May 11 20:10 LZO-2188-Index.db
-rw-rw-r-- 1 cassandra cassandra           0 May 11 23:35 LZO-2189-Compacted
-rw-rw-r-- 1 cassandra cassandra    29651624 May 11 20:09 LZO-2189-Data.db
-rw-rw-r-- 1 cassandra cassandra      114789 May 11 20:09 LZO-2189-Filter.db
-rw-rw-r-- 1 cassandra cassandra     3691960 May 11 20:09 LZO-2189-Index.db
-rw-rw-r-- 1 cassandra cassandra 52910891687 May 11 23:35 LZO-2190-Data.db
-rw-rw-r-- 1 cassandra cassandra     1618405 May 11 23:35 LZO-2190-Filter.db
-rw-rw-r-- 1 cassandra cassandra     6034764 May 11 23:35 LZO-2190-Index.db
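
To see how much disk space such leftover files hold, the naming pattern in the listing above can be scanned directly. The following is only a sketch under stated assumptions: the class and method names (CompactedScan, reclaimableBytes) are hypothetical and not part of Cassandra, and it assumes every file is named "prefix-Component" as in the listing.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Cassandra.
class CompactedScan {
    private static final String MARKER_SUFFIX = "-Compacted";

    /**
     * Returns, for every SSTable prefix (e.g. "LZO-1263") that has a
     * "-Compacted" marker in dir, the total size in bytes of all files
     * sharing that prefix.
     */
    static Map<String, Long> reclaimableBytes(Path dir) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                files.add(p);
            }
        }

        // First pass: collect the prefixes that carry a marker file.
        Map<String, Long> result = new TreeMap<>();
        for (Path p : files) {
            String name = p.getFileName().toString();
            if (name.endsWith(MARKER_SUFFIX)) {
                result.put(name.substring(0, name.length() - MARKER_SUFFIX.length()), 0L);
            }
        }

        // Second pass: add up the sizes of all components of marked SSTables.
        for (Path p : files) {
            String name = p.getFileName().toString();
            int dash = name.lastIndexOf('-');
            if (dash < 0) {
                continue; // not an SSTable component name
            }
            String prefix = name.substring(0, dash);
            if (result.containsKey(prefix)) {
                result.put(prefix, result.get(prefix) + Files.size(p));
            }
        }
        return result;
    }
}
```

Calling CompactedScan.reclaimableBytes on the data directory would report one entry per marked generation; files without a marker, such as LZO-2190 in the listing, are not counted.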

Re: Why not to delete the "Compacted" file immediately? What is the policy in 0.6.1?

Posted by Schubert Zhang <zs...@gmail.com>.
Thanks Jonathan, that's clear!


Re: Why not to delete the "Compacted" file immediately? What is the policy in 0.6.1?

Posted by Jonathan Ellis <jb...@gmail.com>.
from http://wiki.apache.org/cassandra/ArchitectureInternals:

Making this concurrency-safe without blocking writes or reads while we
remove the old SSTables from the list and add the new one is tricky,
because naive approaches require waiting for all readers of the old
sstables to finish before deleting them (since we can't know if they
have actually started opening the file yet; if they have not and we
delete the file first, they will error out). The approach we have
settled on is to not actually delete old SSTables synchronously;
instead we register a phantom reference with the garbage collector, so
when no references to the SSTable exist it will be deleted. (We also
write a compaction marker to the file system so if the server is
restarted before that happens, we clean out the old SSTables at
startup time.)
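
The mechanism described above can be illustrated with a small sketch. This is not the actual Cassandra 0.6 code: the class DeferredSSTableDeleter, the SSTableHandle stand-in, and the methods markCompacted and drain are all hypothetical names chosen for illustration. It assumes the file naming from the original question (Data.db, Index.db, Filter.db, plus a Compacted marker).

```java
import java.io.IOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of GC-driven deletion; names are illustrative only.
class DeferredSSTableDeleter {
    /** Stand-in for an open SSTable that readers may still hold. */
    public static final class SSTableHandle {
        public final Path dataFile;
        public SSTableHandle(Path dataFile) { this.dataFile = dataFile; }
    }

    /** Remembers the path only, never the handle, so the handle can be collected. */
    private static final class DeletingReference extends PhantomReference<SSTableHandle> {
        final Path dataFile;
        DeletingReference(SSTableHandle handle, ReferenceQueue<SSTableHandle> queue) {
            super(handle, queue);
            this.dataFile = handle.dataFile;
        }
    }

    private final ReferenceQueue<SSTableHandle> queue = new ReferenceQueue<>();
    // The reference objects must stay strongly reachable or they are never enqueued.
    private final Set<DeletingReference> pending = ConcurrentHashMap.newKeySet();

    /** After compaction: write the marker now, delete the files later. */
    public void markCompacted(SSTableHandle handle) throws IOException {
        Path marker = component(handle.dataFile, "Compacted");
        if (!Files.exists(marker)) {
            Files.createFile(marker); // zero-sized marker survives a restart
        }
        pending.add(new DeletingReference(handle, queue));
    }

    /** Deletes files of SSTables whose handles were collected; returns how many. */
    public int drain() throws IOException {
        int deleted = 0;
        Reference<? extends SSTableHandle> ref;
        while ((ref = queue.poll()) != null) {
            DeletingReference d = (DeletingReference) ref;
            pending.remove(d);
            for (String suffix : new String[] {"Data.db", "Index.db", "Filter.db", "Compacted"}) {
                Files.deleteIfExists(component(d.dataFile, suffix));
            }
            deleted++;
        }
        return deleted;
    }

    /** Maps e.g. LZO-1263-Data.db to LZO-1263-Compacted. */
    static Path component(Path dataFile, String suffix) {
        String name = dataFile.getFileName().toString();
        String prefix = name.substring(0, name.lastIndexOf('-') + 1);
        return dataFile.resolveSibling(prefix + suffix);
    }
}
```

In a sketch like this, drain would be called periodically from a background thread. A phantom reference is only enqueued after the garbage collector has actually reclaimed the handle, so the old files can stay on disk for an unpredictable time after compaction, which is consistent with the listing in the original question.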



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com