Posted to user@cassandra.apache.org by Roshan <co...@gmail.com> on 2012/01/24 00:46:20 UTC

SSTable compaction issue in our system

Hi

We have deployed a two node Cassandra 1.0.6 cluster to production and it
creates SSTables of different sizes every day. As I understand it, Cassandra
will compact 4 files of a similar size (the default compaction threshold)
identified by the compaction task. In my system it regularly identifies
4 x 50MB files and compacts them into an SSTable of some size (e.g. 200MB)
after removing tombstones, but the next time it may compact 4 x 50MB files
into a different size (e.g. 100MB). Since the compaction task creates files
of varying sizes after removing tombstones, some of these differently sized
files remain in the system and are never picked up by the compaction task.

I think a major compaction via nodetool is not recommended for Cassandra
1.0.x. Could you please advise me on how to combine SSTables of different
sizes? Thanks.

--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/SSTable-compaction-issue-in-our-system-tp7218239p7218239.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.

Re: SSTable compaction issue in our system

Posted by aaron morton <aa...@thelastpickle.com>.
There is no way to reverse a compaction. 

You can initiate a user compaction on a single file though; see nodetool (I think) or the JMX interface. 
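
As a rough illustration, the call can be made from a small JMX client like the sketch below. The MBean name and the forceUserDefinedCompaction(keyspace, dataFiles) operation are assumptions about the 1.0-era CompactionManager interface (as are the host, keyspace and SSTable file name), so check them in jconsole against your own node before relying on this:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hedged sketch: ask a node to compact specific SSTable files via JMX.
// The MBean name and operation signature are assumptions about the 1.0-era
// CompactionManager MBean; confirm them in jconsole first.
public class UserDefinedCompaction {
    public static void main(String[] args) throws Exception {
        // Cassandra's default JMX port is 7199; host, keyspace and file name are examples.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName compactionManager =
                    new ObjectName("org.apache.cassandra.db:type=CompactionManager");
            // Assumed operation: forceUserDefinedCompaction(keyspace, comma-separated data files)
            mbs.invoke(compactionManager,
                       "forceUserDefinedCompaction",
                       new Object[] { "MyKeyspace", "MyColumnFamily-hc-1234-Data.db" },
                       new String[] { "java.lang.String", "java.lang.String" });
        } finally {
            connector.close();
        }
    }
}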
Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 1/02/2012, at 4:10 AM, Micah Hausler wrote:

> A related question: is there any way to reverse a major compaction without losing performance? Do I just have to wait it out?
> 
> Micah Hausler
> 
> On Jan 30, 2012, at 7:50 PM, Roshan Pradeep wrote:
> 
>> Thanks Aaron for the perfect explanation. Decided to go with automatic compaction. Thanks again.
>> 
>> On Wed, Jan 25, 2012 at 11:19 AM, aaron morton <aa...@thelastpickle.com> wrote:
>> The issue with major / manual compaction is that it creates one file. One big old file.
>> 
>> That one file will not be compacted unless there are (min_compaction_threshold - 1) other files of a similar size. So tombstones and overwrites in that file may not be purged for a long time.
>> 
>> If you go down the manual compaction path you need to keep doing it.
>> 
>> If you feel you need to do it, do it; otherwise let automatic compaction do its thing.
>> Cheers
>>   
>>   
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 25/01/2012, at 12:47 PM, Roshan wrote:
>> 
>>> Thanks for the reply. Is the major compaction not recommended for Cassandra
>>> 1.0.6?
>>> 
>>> --
>>> View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Different-size-of-SSTable-are-remain-in-the-system-without-compact-tp7218239p7222322.html
>>> Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.
>> 
>> 
> 


Re: SSTable compaction issue in our system

Posted by Micah Hausler <mi...@retickr.com>.
A related question: is there any way to reverse a major compaction without losing performance? Do I just have to wait it out?

Micah Hausler

On Jan 30, 2012, at 7:50 PM, Roshan Pradeep wrote:

> Thanks Aaron for the perfect explanation. Decided to go with automatic compaction. Thanks again.
> 
> On Wed, Jan 25, 2012 at 11:19 AM, aaron morton <aa...@thelastpickle.com> wrote:
> The issue with major / manual compaction is that it creates one file. One big old file.
> 
> That one file will not be compacted unless there are (min_compaction_threshold - 1) other files of a similar size. So tombstones and overwrites in that file may not be purged for a long time.
> 
> If you go down the manual compaction path you need to keep doing it.
> 
> If you feel you need to do it, do it; otherwise let automatic compaction do its thing.
> Cheers
>   
>   
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 25/01/2012, at 12:47 PM, Roshan wrote:
> 
>> Thanks for the reply. Is the major compaction not recommended for Cassandra
>> 1.0.6?
>> 
>> --
>> View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Different-size-of-SSTable-are-remain-in-the-system-without-compact-tp7218239p7222322.html
>> Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.
> 
> 


Re: SSTable compaction issue in our system

Posted by Roshan Pradeep <co...@gmail.com>.
Thanks Aaron for the perfect explanation. Decided to go with automatic
compaction. Thanks again.

On Wed, Jan 25, 2012 at 11:19 AM, aaron morton <aa...@thelastpickle.com> wrote:

> The issue with major / manual compaction is that it creates one file.
> One big old file.
>
> That one file will not be compacted unless there are
> (min_compaction_threshold - 1) other files of a similar size. So tombstones
> and overwrites in that file may not be purged for a long time.
>
> If you go down the manual compaction path you need to keep doing it.
>
> If you feel you need to do it, do it; otherwise let automatic compaction do
> its thing.
> Cheers
>
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 25/01/2012, at 12:47 PM, Roshan wrote:
>
> Thanks for the reply. Is the major compaction not recommended for Cassandra
> 1.0.6?
>
> --
> View this message in context:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Different-size-of-SSTable-are-remain-in-the-system-without-compact-tp7218239p7222322.html
> Sent from the cassandra-user@incubator.apache.org mailing list archive at
> Nabble.com.
>
>
>

Re: SSTable compaction issue in our system

Posted by aaron morton <aa...@thelastpickle.com>.
The issue with major / manual compaction is that it creates one file. One big old file.

That one file will not be compacted unless there are (min_compaction_threshold - 1) other files of a similar size. So tombstones and overwrites in that file may not be purged for a long time. 
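
As a minimal illustration of that condition (made-up names, not Cassandra code): a bucket only becomes a compaction candidate once it holds min_compaction_threshold files of similar size, so the single large file left by a major compaction just waits:

import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the condition above: a bucket is only a
// compaction candidate once it holds min_compaction_threshold files of
// similar size, so one huge post-major-compaction file sits alone and waits.
public class MajorCompactionFollowup {
    static boolean candidate(List<Long> similarSizedFiles, int minCompactionThreshold) {
        return similarSizedFiles.size() >= minCompactionThreshold;
    }

    public static void main(String[] args) {
        List<Long> afterMajor = Arrays.asList(2_000_000_000L); // one ~2GB file, alone in its bucket
        List<Long> dailyFlush = Arrays.asList(50_000_000L, 48_000_000L, 52_000_000L, 49_000_000L);
        System.out.println(candidate(afterMajor, 4));  // false: nothing similar to compact it with
        System.out.println(candidate(dailyFlush, 4));  // true: four similar-sized files
    }
}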

If you go down the manual compaction path you need to keep doing it.

If you feel you need to do it, do it; otherwise let automatic compaction do its thing. 
Cheers
  
  
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 25/01/2012, at 12:47 PM, Roshan wrote:

> Thanks for the reply. Is the major compaction not recommended for Cassandra
> 1.0.6?
> 
> --
> View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Different-size-of-SSTable-are-remain-in-the-system-without-compact-tp7218239p7222322.html
> Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.


Re: SSTable compaction issue in our system

Posted by Roshan <co...@gmail.com>.
Thanks for the reply. Is the major compaction not recommended for Cassandra
1.0.6?

--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Different-size-of-SSTable-are-remain-in-the-system-without-compact-tp7218239p7222322.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.

Re: SSTable compaction issue in our system

Posted by aaron morton <aa...@thelastpickle.com>.
With the default compaction strategy, SSTables are grouped into buckets, where the size of every SSTable in the bucket is within 50% of the average size of files in the bucket. There is also a catch-all first bucket for all files less than 50MB (by default). 

The min_compaction_threshold CF setting applies to the number of files in each bucket. 

So in your case you would have the following buckets:
- 4 * 50MB 
- 1 * 100MB
- 1 * 200MB 

It would compact the first bucket and create a file that would be in a bucket with one of the other two files. 
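
To make the bucketing rule concrete, here is a minimal sketch in Java. It is not Cassandra's actual compaction code; the class, constant and method names are made up for illustration, and the 50MB catch-all and 50% width reflect the defaults described above:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch of the bucketing rule described above. Files under 50MB
// share one catch-all bucket; every other file joins a bucket whose average
// size it is within 50% of, otherwise it starts a new bucket. A bucket is only
// compacted once it holds at least min_compaction_threshold files.
public class BucketingSketch {

    static final long MB = 1024L * 1024L;
    static final long SMALL_FILE_CUTOFF = 50 * MB;   // catch-all bucket boundary
    static final double BUCKET_WIDTH = 0.5;          // "within 50% of the average"
    static final int MIN_COMPACTION_THRESHOLD = 4;   // default CF setting

    static List<List<Long>> bucket(List<Long> sstableSizes) {
        List<Long> sizes = new ArrayList<>(sstableSizes);
        Collections.sort(sizes);

        List<List<Long>> buckets = new ArrayList<>();
        List<Long> smallFiles = new ArrayList<>();

        for (long size : sizes) {
            if (size < SMALL_FILE_CUTOFF) {          // catch-all bucket for small files
                smallFiles.add(size);
                continue;
            }
            boolean placed = false;
            for (List<Long> b : buckets) {
                double avg = b.stream().mapToLong(Long::longValue).average().orElse(0);
                if (size > avg * (1 - BUCKET_WIDTH) && size < avg * (1 + BUCKET_WIDTH)) {
                    b.add(size);
                    placed = true;
                    break;
                }
            }
            if (!placed) {
                buckets.add(new ArrayList<>(Arrays.asList(size)));
            }
        }
        if (!smallFiles.isEmpty()) {
            buckets.add(smallFiles);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // The sizes from this thread: four 50MB files plus a 100MB and a 200MB file.
        List<Long> sizes = Arrays.asList(50 * MB, 50 * MB, 50 * MB, 50 * MB, 100 * MB, 200 * MB);
        for (List<Long> b : bucket(sizes)) {
            System.out.println(b.size() + " file(s) of roughly " + (b.get(0) / MB) + "MB -> "
                    + (b.size() >= MIN_COMPACTION_THRESHOLD ? "eligible for compaction" : "waiting"));
        }
        // Prints three buckets: the 4 x 50MB bucket is eligible; the 100MB and
        // 200MB files each sit alone until similar-sized files appear.
    }
}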

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 24/01/2012, at 12:46 PM, Roshan wrote:

> Hi
> 
> We have deployed a two node Cassandra 1.0.6 cluster to production and it
> creates SSTables of different sizes every day. As I understand it, Cassandra
> will compact 4 files of a similar size (the default compaction threshold)
> identified by the compaction task. In my system it regularly identifies
> 4 x 50MB files and compacts them into an SSTable of some size (e.g. 200MB)
> after removing tombstones, but the next time it may compact 4 x 50MB files
> into a different size (e.g. 100MB). Since the compaction task creates files
> of varying sizes after removing tombstones, some of these differently sized
> files remain in the system and are never picked up by the compaction task.
> 
> I think a major compaction via nodetool is not recommended for Cassandra
> 1.0.x. Could you please advise me on how to combine SSTables of different
> sizes? Thanks.
> 
> --
> View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/SSTable-compaction-issue-in-our-system-tp7218239p7218239.html
> Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.