Posted to user@cassandra.apache.org by "B. Todd Burruss" <bt...@gmail.com> on 2012/11/08 19:12:27 UTC

leveled compaction and tombstoned data

we are having a problem where we have huge SSTables with tombstoned data
in them that is not being compacted soon enough (because size-tiered
compaction requires, by default, 4 like-sized SSTables).  this is using
more disk space than we anticipated.

we are very write-heavy compared to reads, and we delete the data after N
days (N depends on the column family, but is around 7 days)

my question is: would leveled compaction help get rid of the tombstoned
data faster than size-tiered, and therefore reduce the disk space usage?

thx
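
For context, the symptom described above can be quantified per column
family with nodetool: compare "Space used (live)" against "Space used
(total)" -- the gap is largely obsolete data and tombstones awaiting
compaction. A quick sketch (host is illustrative):

    # per-CF sstable counts and live vs. total space used
    nodetool -h localhost cfstats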

Re: leveled compaction and tombstoned data

Posted by Ben Coverston <be...@datastax.com>.
The rules for tombstone eviction are as follows (regardless of your
compaction strategy):

1. gc_grace must be expired, and
2. No other row fragments can exist for the row that aren't also
participating in the compaction.

For LCS, there is no 'rule' that the tombstones can only be evicted at the
highest level. They can be evicted at whichever level the row converges on.
Depending on your use case this may mean it always happens at L4; it might
also mean that it most often happens at L1 or L2.
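
As a concrete illustration of rule 1: gc_grace is settable per column
family, and for a delete-after-N-days workload it can often be lowered
from the default 10 days -- provided repair runs more often than the
grace period, or deleted data can resurrect. A hedged sketch in CQL 3
syntax (hypothetical keyspace/table names):

    # hypothetical keyspace/table; shrink the tombstone grace
    # period from the default 10 days down to 3 days
    echo "ALTER TABLE mykeyspace.mycf WITH gc_grace_seconds = 259200;" | cqlsh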






On Fri, Nov 9, 2012 at 7:31 AM, Mina Naguib <mi...@adgear.com> wrote:

> Finally, to guard against the tombstone missing any data, the tombstone
> itself is not a candidate for removal (I believe even after gc_grace has
> passed) unless it's reached the highest populated level in levelled
> compaction.


-- 
Ben Coverston
DataStax -- The Apache Cassandra Company

Re: leveled compaction and tombstoned data

Posted by Mina Naguib <mi...@adgear.com>.

On 2012-11-08, at 1:12 PM, B. Todd Burruss <bt...@gmail.com> wrote:

> my question is: would leveled compaction help get rid of the tombstoned data faster than size-tiered, and therefore reduce the disk space usage?

From my experience, levelled compaction makes space reclamation after deletes even less predictable than size-tiered.

The reason is that deletes, like all mutations, are just recorded into sstables.  They enter level 0 and are slowly, over time, promoted upwards to level N.

Depending on your *total* mutation volume vs. your data set size, this may be quite a slow process.  It is made even worse when a large piece of data (say, an entire row worth several hundred kilobytes) is to be deleted by a small row-level tombstone.  If the row is sitting in level 4, the tombstone won't impact it until enough new data has pushed the tombstone up past all existing data in level 0, level 1, level 2 and level 3.

Finally, to guard against the tombstone missing any data, the tombstone itself is not a candidate for removal (I believe even after gc_grace has passed) unless it's reached the highest populated level in levelled compaction.  This means that if you have 4 levels and issue a ton of deletes (even deletes that will never impact existing data), these tombstones are dead weight that cannot be purged until they hit level 4.

For a write-heavy workload, I recommend you stick with size-tiered.  You have several options at your disposal (compaction min/max thresholds, gc_grace) to move things along.  If that doesn't help, I've heard of some fairly reputable people doing some fairly blasphemous things (major compactions every night).
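
For the min/max thresholds mentioned above, nodetool can adjust them on
a live node; lowering the minimum makes size-tiered compact after fewer
like-sized sstables accumulate, at the cost of more compaction IO. A
sketch (hypothetical keyspace/CF names):

    # hypothetical keyspace/CF; compact once 2 like-sized sstables
    # exist instead of the default 4 (max stays at 32)
    nodetool -h localhost setcompactionthreshold mykeyspace mycf 2 32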



Re: leveled compaction and tombstoned data

Posted by Sylvain Lebresne <sy...@datastax.com>.
On Sat, Nov 10, 2012 at 7:17 PM, Edward Capriolo <ed...@gmail.com> wrote:

> No, it does not exist. Rob and I might start a donation page and give
> the money to whoever is willing to code it. If someone would write a
> tool that would split an sstable into 4 smaller sstables (even an
> offline command-line tool)


Something like that:
https://github.com/pcmanus/cassandra/commits/sstable_split (adds an
sstablesplit offline tool)
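
To give a feel for the intended tool, a usage sketch (run against a
stopped node; the flag and the file path are illustrative and may
differ from the branch above):

    # illustrative flag and path: split one oversized sstable
    # into ~128 MB pieces, offline, with the node stopped
    sstablesplit --size 128 \
        /var/lib/cassandra/data/mykeyspace/mycf/mykeyspace-mycf-hf-42-Data.db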


> I would PayPal them a hundo.
>

Just tell me how you want to proceed :)

--
Sylvain



Re: leveled compaction and tombstoned data

Posted by Edward Capriolo <ed...@gmail.com>.
No, it does not exist. Rob and I might start a donation page and give
the money to whoever is willing to code it. If someone would write a
tool that would split an sstable into 4 smaller sstables (even an
offline command-line tool) I would PayPal them a hundo.

On Sat, Nov 10, 2012 at 1:10 PM, Aaron Turner <sy...@gmail.com> wrote:
> Nope.  I think at least once a week I hear someone suggest that the way to
> solve their problem is to "write an sstablesplit tool".

Re: leveled compaction and tombstoned data

Posted by Aaron Turner <sy...@gmail.com>.
Nope.  I think at least once a week I hear someone suggest that the way to
solve their problem is to "write an sstablesplit tool".

I'm pretty sure that:

Step 1. Write sstablesplit
Step 2. ???
Step 3. Profit!



On Sat, Nov 10, 2012 at 9:40 AM, Alain RODRIGUEZ <ar...@gmail.com> wrote:

> @Rob Coli
>
> Does the "sstablesplit" function exist somewhere?


-- 
Aaron Turner
http://synfin.net/         Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix &
Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
    -- Benjamin Franklin
"carpe diem quam minimum credula postero"

Re: leveled compaction and tombstoned data

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
@Rob Coli

Does the "sstablesplit" function exist somewhere?



Re: leveled compaction and tombstoned data

Posted by Jim Cistaro <jc...@netflix.com>.
For some of our clusters, we have taken the periodic major compaction
route.

There are a few things to consider:
1) Once you start major compacting, depending on data size, you may be
committed to doing it periodically, because you create one big file that
will take forever to naturally compact against 3 like-sized files.
2) If you rely heavily on the file cache (rather than large row caches), each
major compaction effectively invalidates the entire file cache, because
everything is written to one new large file.

--
Jim Cistaro


Re: leveled compaction and tombstoned data

Posted by Rob Coli <rc...@palominodb.com>.
On Thu, Nov 8, 2012 at 10:12 AM, B. Todd Burruss <bt...@gmail.com> wrote:
> my question is: would leveled compaction help get rid of the tombstoned
> data faster than size-tiered, and therefore reduce the disk space usage?

You could also...

1) run a major compaction
2) code up sstablesplit
3) profit!

This method incurs a management penalty if not automated, but is
otherwise the most efficient way to deal with tombstones and obsolete
data. :D
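
For step 1, the invocation is a one-liner (hypothetical keyspace/CF
names; see Jim's caveats elsewhere in the thread about the single giant
sstable it leaves behind):

    # hypothetical keyspace/CF; force a major compaction of one CF
    nodetool -h localhost compact mykeyspace mycf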

=Rob

-- 
=Robert Coli
AIM&GTALK - rcoli@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb

Re: leveled compaction and tombstoned data

Posted by Jeremy Hanna <je...@gmail.com>.
LCS works well in specific circumstances; this blog post gives some good considerations: http://www.datastax.com/dev/blog/when-to-use-leveled-compaction



Re: leveled compaction and tombstoned data

Posted by Ben Coverston <be...@datastax.com>.
http://www.datastax.com/docs/1.1/operations/tuning#testing-compaction-and-compression

Write Survey mode.

After you have it up and running, you can modify the column family mbean to
use LeveledCompactionStrategy on that node to see how your hardware/load
fares with LCS.
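
A sketch of both steps -- the system property is the documented 1.1
switch for Write Survey mode; the JMX coordinates are from the 1.1-era
ColumnFamilies mbean and worth verifying against your build, and the
keyspace/CF names and jar name are hypothetical:

    # 1. add to conf/cassandra-env.sh, then start the node: it receives
    #    write traffic but serves no reads and never formally joins the ring
    JVM_OPTS="$JVM_OPTS -Dcassandra.write_survey=true"

    # 2. point the CF at LCS on this node only, e.g. with the jmxterm CLI
    #    (hypothetical keyspace/CF names and jar path)
    echo "set -b org.apache.cassandra.db:type=ColumnFamilies,keyspace=mykeyspace,columnfamily=mycf CompactionStrategyClass org.apache.cassandra.db.compaction.LeveledCompactionStrategy" | java -jar jmxterm.jar -l localhost:7199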


On Thu, Nov 8, 2012 at 11:33 AM, Aaron Turner <sy...@gmail.com> wrote:

> There are also ways to bring up a test node and just run Level Compaction
> on that.  Wish I had a URL handy, but hopefully someone else can find it.


-- 
Ben Coverston
DataStax -- The Apache Cassandra Company

Re: leveled compaction and tombstoned data

Posted by "B. Todd Burruss" <bt...@gmail.com>.
@ben, thx, we will be deploying 2.2.1 of DSE soon and will try to
set up a traffic sampling node so we can test leveled compaction.

we essentially keep a rolling window of data written once.  it is
written, then after N days it is deleted, so it seems that leveled
compaction should help.


Re: leveled compaction and tombstoned data

Posted by "B. Todd Burruss" <bt...@gmail.com>.
thanks for the links!  i had forgotten about live sampling.


Re: leveled compaction and tombstoned data

Posted by Radim Kolar <hs...@filez.com>.
> I would be careful with the patch that was referred to above; it
> hasn't been reviewed, and from a glance it appears that it will cause
> an infinite compaction loop if you get more than 4 SSTables at max size.
It will; you need to set the max sstable size correctly.

Re: leveled compaction and tombstoned data

Posted by Ben Coverston <be...@datastax.com>.
Also, to answer your question: LCS is well suited to workloads where
overwrites and tombstones come into play. The tombstones are _much_ more
likely to be merged with LCS than with STCS.
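
For reference, making the switch is a one-line schema change; a hedged
sketch in 1.1-era cassandra-cli syntax (hypothetical keyspace/CF names;
the exact option syntax may vary by version, and options such as
sstable_size_in_mb can be set the same way -- existing sstables are
rewritten into levels gradually afterwards):

    # hypothetical keyspace/CF; flip an existing column family
    # to leveled compaction, cluster-wide
    cassandra-cli -h localhost -k mykeyspace <<'EOF'
    UPDATE COLUMN FAMILY mycf
      WITH compaction_strategy = 'LeveledCompactionStrategy';
    EOF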

I would be careful with the patch that was referred to above; it hasn't
been reviewed, and from a glance it appears that it will cause an infinite
compaction loop if you get more than 4 SSTables at max size.






-- 
Ben Coverston
DataStax -- The Apache Cassandra Company

Re: leveled compaction and tombstoned data

Posted by Brandon Williams <dr...@gmail.com>.
On Thu, Nov 8, 2012 at 1:33 PM, Aaron Turner <sy...@gmail.com> wrote:
> There are also ways to bring up a test node and just run Level Compaction on
> that.  Wish I had a URL handy, but hopefully someone else can find it.

This rather handsome fellow wrote a blog about it:
http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-1-live-traffic-sampling

-Brandon

Re: leveled compaction and tombstoned data

Posted by Aaron Turner <sy...@gmail.com>.
"kill performance" is relative.  Leveled Compaction basically costs 2x disk
IO.  Look at iostat, etc and see if you have the headroom.
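
For instance (standard sysstat; the interval is illustrative):

    # extended device stats every 5 seconds; watch %util and await
    # while compactions run
    iostat -x 5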

There are also ways to bring up a test node and just run Level Compaction
on that.  Wish I had a URL handy, but hopefully someone else can find it.

Also, if you're not using compression, check it out.




-- 
Aaron Turner
http://synfin.net/         Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix &
Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
    -- Benjamin Franklin
"carpe diem quam minimum credula postero"

Re: leveled compaction and tombstoned data

Posted by "B. Todd Burruss" <bt...@gmail.com>.
we are running DataStax Enterprise and cannot patch it.  how bad is
"kill performance"?  if it is so bad, why is it an option?



Re: leveled compaction and tombstoned data

Posted by Radim Kolar <hs...@filez.com>.
On 8.11.2012 19:12, B. Todd Burruss wrote:
> my question is: would leveled compaction help get rid of the
> tombstoned data faster than size-tiered, and therefore reduce the disk
> space usage?
>
Leveled compaction will kill your performance. Get the patch from JIRA for
maximum sstable size per CF and force Cassandra to make smaller tables;
they expire faster.