Posted to user@cassandra.apache.org by Dan Kinder <dk...@turnitin.com> on 2015/02/27 23:01:53 UTC

Less frequent flushing with LCS

Hi all,

We have a table in Cassandra where we frequently overwrite recent inserts.
Compaction does a fine job with this, but larger memtables would absorb
more of those overwrites in memory and so reduce compaction work.

The question is: can we make Cassandra use larger memtables and flush less
frequently? What currently triggers the flushes? OpsCenter shows them
flushing consistently at about 110MB in size; we have plenty of memory to
go larger.

According to
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_memtable_thruput_c.html
we can raise the commit log space threshold, but this does not help; there
is plenty of runway there.
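
For reference, these seem to be the cassandra.yaml knobs that put a global
bound on memtable size in the 2.0 line (values below are illustrative, not
our actual config, and the defaults vary by version):

    # cassandra.yaml -- illustrative values, not our production settings
    # Total memory allowed across all memtables; Cassandra flushes the
    # largest memtable when this is exceeded. Defaults to a fraction of
    # the heap if left unset.
    memtable_total_space_in_mb: 2048

    # Total space allowed for commit log segments; exceeding it forces
    # a flush of the memtables dirtying the oldest segment.
    commitlog_total_space_in_mb: 8192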

Theoretically sstable_size_in_mb could be causing it to flush (it's at the
default 160MB)... though we are flushing well before we hit 160MB. I have
not tried changing this, but we don't necessarily want all the sstables to
be large anyway.
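
If we do end up experimenting with it, my understanding is that it's a
per-table change through the compaction options, something like this
(keyspace and table names hypothetical):

    ALTER TABLE myks.mytable WITH compaction = {
        'class': 'LeveledCompactionStrategy',
        'sstable_size_in_mb': 256
    };

One caution: setting compaction replaces all of its subproperties, so the
class has to be restated along with the new size.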

Thanks,
-dan

Re: Less frequent flushing with LCS

Posted by Dan Kinder <dk...@turnitin.com>.
Nope, they flush every 5 to 10 minutes.
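
(For the record, this is roughly how I'm checking the cadence and flushed
sizes from system.log; the exact log wording may differ between versions:)

    # Recent flush activity: "Enqueuing flush of Memtable" lines include
    # the memtable's serialized/live bytes, and "Completed flushing"
    # lines include the resulting sstable path and size.
    grep 'Enqueuing flush of Memtable' /var/log/cassandra/system.log | tail
    grep 'Completed flushing' /var/log/cassandra/system.log | tail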

On Mon, Mar 2, 2015 at 1:13 PM, Daniel Chia <da...@coursera.org> wrote:

> Do the tables look like they're being flushed every hour? It seems like
> the setting memtable_flush_after_mins, which I believe defaults to 60,
> could also affect how often your tables are flushed.
>
> Thanks,
> Daniel
>
> On Mon, Mar 2, 2015 at 11:49 AM, Dan Kinder <dk...@turnitin.com> wrote:
>
>> I see, thanks for the input. Compression is not enabled at the moment,
>> but I may try increasing that number regardless.
>>
>> Also I don't think in-memory tables would work since the dataset is
>> actually quite large. The pattern is more like a given set of rows will
>> receive many overwriting updates and then not be touched for a while.
>>
>> On Fri, Feb 27, 2015 at 2:27 PM, Robert Coli <rc...@eventbrite.com>
>> wrote:
>>
>>> On Fri, Feb 27, 2015 at 2:01 PM, Dan Kinder <dk...@turnitin.com>
>>> wrote:
>>>
>>>> Theoretically sstable_size_in_mb could be causing it to flush (it's at
>>>> the default 160MB)... though we are flushing well before we hit 160MB. I
>>>> have not tried changing this, but we don't necessarily want all the
>>>> sstables to be large anyway.
>>>>
>>>
>>> I've always wished that the log message told you *why* the memtable was
>>> being flushed, which of the various bounds prompted the flush.
>>>
>>> In your case, the size on disk may be under 160MB because compression is
>>> enabled. I would start by increasing that size.
>>>
>>> Datastax DSE has in-memory tables for this use case.
>>>
>>> =Rob
>>>
>>>
>>
>>
>> --
>> Dan Kinder
>> Senior Software Engineer
>> Turnitin – www.turnitin.com
>> dkinder@turnitin.com
>>
>
>


-- 
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkinder@turnitin.com

Re: Less frequent flushing with LCS

Posted by Daniel Chia <da...@coursera.org>.
Do the tables look like they're being flushed every hour? It seems like the
setting memtable_flush_after_mins, which I believe defaults to 60, could
also affect how often your tables are flushed.
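
One way to double-check which per-table options are actually in effect is
to dump the schema from cqlsh (keyspace and table names hypothetical):

    -- Prints the full CREATE TABLE statement, including the effective
    -- compaction and compression settings.
    DESCRIBE TABLE myks.mytable;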

Thanks,
Daniel

On Mon, Mar 2, 2015 at 11:49 AM, Dan Kinder <dk...@turnitin.com> wrote:

> I see, thanks for the input. Compression is not enabled at the moment, but
> I may try increasing that number regardless.
>
> Also I don't think in-memory tables would work since the dataset is
> actually quite large. The pattern is more like a given set of rows will
> receive many overwriting updates and then not be touched for a while.
>
> On Fri, Feb 27, 2015 at 2:27 PM, Robert Coli <rc...@eventbrite.com> wrote:
>
>> On Fri, Feb 27, 2015 at 2:01 PM, Dan Kinder <dk...@turnitin.com> wrote:
>>
>>> Theoretically sstable_size_in_mb could be causing it to flush (it's at
>>> the default 160MB)... though we are flushing well before we hit 160MB. I
>>> have not tried changing this, but we don't necessarily want all the
>>> sstables to be large anyway.
>>>
>>
>> I've always wished that the log message told you *why* the memtable was
>> being flushed, which of the various bounds prompted the flush.
>>
>> In your case, the size on disk may be under 160MB because compression is
>> enabled. I would start by increasing that size.
>>
>> Datastax DSE has in-memory tables for this use case.
>>
>> =Rob
>>
>>
>
>
> --
> Dan Kinder
> Senior Software Engineer
> Turnitin – www.turnitin.com
> dkinder@turnitin.com
>

Re: Less frequent flushing with LCS

Posted by Dan Kinder <dk...@turnitin.com>.
I see, thanks for the input. Compression is not enabled at the moment, but
I may try increasing that number regardless.

Also I don't think in-memory tables would work since the dataset is
actually quite large. The pattern is more like a given set of rows will
receive many overwriting updates and then not be touched for a while.

On Fri, Feb 27, 2015 at 2:27 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Fri, Feb 27, 2015 at 2:01 PM, Dan Kinder <dk...@turnitin.com> wrote:
>
>> Theoretically sstable_size_in_mb could be causing it to flush (it's at
>> the default 160MB)... though we are flushing well before we hit 160MB. I
>> have not tried changing this, but we don't necessarily want all the
>> sstables to be large anyway.
>>
>
> I've always wished that the log message told you *why* the memtable was
> being flushed, which of the various bounds prompted the flush.
>
> In your case, the size on disk may be under 160MB because compression is
> enabled. I would start by increasing that size.
>
> Datastax DSE has in-memory tables for this use case.
>
> =Rob
>
>


-- 
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkinder@turnitin.com

Re: Less frequent flushing with LCS

Posted by Robert Coli <rc...@eventbrite.com>.
On Fri, Feb 27, 2015 at 2:01 PM, Dan Kinder <dk...@turnitin.com> wrote:

> Theoretically sstable_size_in_mb could be causing it to flush (it's at the
> default 160MB)... though we are flushing well before we hit 160MB. I have
> not tried changing this, but we don't necessarily want all the sstables to
> be large anyway.
>

I've always wished that the log message told you *why* the memtable was
being flushed, which of the various bounds prompted the flush.

In your case, the size on disk may be under 160MB because compression is
enabled. I would start by increasing that size.
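
A quick way to check whether compression is shrinking the on-disk size
(the keyspace.table filter and field wording are from memory and may vary
by version):

    # "SSTable Compression Ratio" is compressed / uncompressed size;
    # 0.0 (or an absent value) typically means compression is off.
    nodetool cfstats myks.mytable | grep -i 'compression ratio'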

Datastax DSE has in-memory tables for this use case.

=Rob