Posted to user@cassandra.apache.org by Andrei Ivanov <ai...@iponweb.net> on 2014/11/18 14:06:16 UTC

LCS: sstables grow larger

Dear all,

I have the following problem:
- C* 2.0.11
- LCS with the default 160MB sstable size
- Compacted partition maximum bytes: 785939 (for cf/table xxx.xxx)
- Compacted partition mean bytes: 6750 (for cf/table xxx.xxx)

I would expect the sstables to be at most roughly 160MB. Despite this,
I see files like:
192M Nov 18 13:00 xxx-xxx-jb-15580-Data.db
or
631M Nov 18 13:03 xxx-xxx-jb-15583-Data.db
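
(For reference, this is roughly how I'm collecting the numbers above -
the data path and keyspace/table names are placeholders:)

  # per-table partition sizes as reported by nodetool
  nodetool cfstats xxx.xxx | grep 'Compacted partition'

  # largest data files on disk for that table, biggest first
  ls -lhS /var/lib/cassandra/data/xxx/xxx/*-Data.db | head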

Am I missing something? What could be the reason? (This is actually a
"fresh" cluster - on an "old" one I'm seeing 500GB sstables.) I'm
getting really desperate trying to understand what's going on.

Thanks in advance, Andrei.

Re: LCS: sstables grow larger

Posted by Andrei Ivanov <ai...@iponweb.net>.
Amazing how I missed the -Dcassandra.disable_stcs_in_l0=true option -
I've had the LeveledManifest source open all day ;-)

Re: LCS: sstables grow larger

Posted by Andrei Ivanov <ai...@iponweb.net>.
Thanks a lot for your support, Marcus - that is incredibly useful! ;-)
And I will try #6621 right away.

Sincerely, Andrei.

Re: LCS: sstables grow larger

Posted by Marcus Eriksson <kr...@gmail.com>.
Yes, you should stick to nodes as small as possible :)

There are a few relevant tickets related to bootstrap and LCS:
- https://issues.apache.org/jira/browse/CASSANDRA-6621 - start up with
  -Dcassandra.disable_stcs_in_l0=true to avoid doing STCS in L0
  (example below)
- https://issues.apache.org/jira/browse/CASSANDRA-7460 - (3.0) send the
  source sstable level when bootstrapping
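
For reference, a minimal sketch of enabling that flag (the conf path is
an assumption - it depends on how Cassandra is installed):

  # conf/cassandra-env.sh: add the property to the JVM options
  JVM_OPTS="$JVM_OPTS -Dcassandra.disable_stcs_in_l0=true"

  # or pass it straight to the launcher for a one-off start
  bin/cassandra -Dcassandra.disable_stcs_in_l0=true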

Re: LCS: sstables grow larger

Posted by Andrei Ivanov <ai...@iponweb.net>.
OK, got it.

Actually, my problem is not that we constantly have many files at
L0. Normally there are only a few of them - that is, the nodes are
managing to compact incoming writes in a timely manner.

But it looks like when we join a new node, it receives tons of files
from the existing nodes (and they end up at L0, right?), and that
seems to be where our problems start. In practice, in what I call the
"old" cluster, compaction became a problem at ~2TB per node. (You
know, we are trying to save a bit on hardware - we are running on EC2
with EBS volumes.)

Do I get it right that we'd better stick to smaller nodes?

Re: LCS: sstables grow larger

Posted by Marcus Eriksson <kr...@gmail.com>.
No, they will get compacted into smaller sstables in L1+ eventually
(once you have fewer than 32 sstables in L0, an ordinary L0 -> L1
compaction will happen).

But if you consistently get many files in L0, it means that compaction
is not keeping up with your inserts, and you should probably expand
your cluster (or consider going back to SizeTieredCompactionStrategy
for the tables that take that many writes).
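
For example, a rough way to keep an eye on this (2.0-era nodetool -
exact labels may vary; keyspace/table names are placeholders, and the
per-level line is only printed for LCS tables):

  # sstable counts per level - a first entry (L0) that keeps growing
  # means compaction is falling behind
  nodetool cfstats xxx.xxx | grep -E 'SSTable count|SSTables in each level'

  # pending compaction tasks - a steadily climbing number tells the
  # same story
  nodetool compactionstats

  # and if a write-heavy table really should go back to STCS, in cqlsh:
  #   ALTER TABLE xxx.xxx
  #     WITH compaction = {'class': 'SizeTieredCompactionStrategy'};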

/Marcus

Re: LCS: sstables grow larger

Posted by Andrei Ivanov <ai...@iponweb.net>.
Marcus, thanks a lot! That explains a lot - those huge sstables are indeed at L0.

It seems that they start to appear as a result of some "massive"
operations (join, repair, rebuild). What's their fate in the future?
Will they continue to propagate like this through levels? Is there
anything that can be done to avoid/solve/prevent this?

My fear is that those big sstables (like the ones in my "old" cluster)
will be hard to compact in the future...

Sincerely, Andrei.

Re: LCS: sstables grow larger

Posted by Marcus Eriksson <kr...@gmail.com>.
I suspect they are getting size-tiered in L0 - if you have too many
sstables in L0, we will do a size-tiered compaction on the sstables in
L0 to improve performance.

Use tools/bin/sstablemetadata to get the level of those sstables; if
they are in L0, that is probably the reason.
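
E.g., a quick loop over a table's data files (the data directory is an
assumption - adjust for your layout; the level shows up in a line like
"SSTable Level: 0"):

  for f in /var/lib/cassandra/data/xxx/xxx/xxx-xxx-jb-*-Data.db; do
    echo "$f: $(tools/bin/sstablemetadata "$f" | grep -i level)"
  done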

/Marcus
