Posted to user@cassandra.apache.org by Katriel Traum <ka...@google.com> on 2014/01/23 08:13:07 UTC

Row cache vs. OS buffer cache

Hello list,

I was wondering if anyone has any pointers or advice regarding using the row
cache vs. leaving it up to the OS buffer cache.

I run cassandra 1.1 and 1.2 with JNA, so off-heap row cache is an option.

Any input appreciated.
Katriel

Re: Row cache vs. OS buffer cache

Posted by Janne Jalkanen <Ja...@ecyrd.com>.
Our experience is that you want all your very hot data to fit in the row cache (assuming you don’t have very large rows), and leave the rest to the OS.  Unfortunately, the right size for the cache depends entirely on your access patterns and data; zero makes sense in a lot of cases.

Try out different sizes, and watch the row cache hit ratio and read latency. Ditto for heap sizes, btw: if your nodes are short on RAM, you may get better performance by running with a smaller heap, because the OS caches will get more memory and your GC pauses will be shorter (though more numerous).
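
As a rough illustration of the measurement loop described above, the Python sketch below computes a hit ratio over the interval between two snapshots of cumulative hit and request counters. The `CacheSample` type and the sample numbers are hypothetical; how the counters are collected from a node is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class CacheSample:
    """Cumulative row cache counters at one point in time (hypothetical shape)."""
    hits: int
    requests: int


def interval_hit_ratio(before: CacheSample, after: CacheSample) -> float:
    """Hit ratio for the traffic seen between two samples, in [0.0, 1.0]."""
    requests = after.requests - before.requests
    hits = after.hits - before.hits
    if requests <= 0:
        return 0.0  # no traffic in the interval, so there is nothing to measure
    return hits / requests


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    earlier = CacheSample(hits=1_000, requests=4_000)
    later = CacheSample(hits=4_000, requests=8_000)
    print(f"{interval_hit_ratio(earlier, later):.2f}")
```

Comparing interval ratios, rather than the lifetime ratio, makes it easier to see the effect of a cache size change soon after it is made.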

/Janne

On 23 Jan 2014, at 09:13 , Katriel Traum <ka...@google.com> wrote:

> Hello list,
> 
> I was wondering if anyone has any pointers or advice regarding using the row cache vs. leaving it up to the OS buffer cache.
> 
> I run cassandra 1.1 and 1.2 with JNA, so off-heap row cache is an option.
> 
> Any input appreciated.
> Katriel


Re: Row cache vs. OS buffer cache

Posted by Katriel Traum <ka...@google.com>.
Thank you everyone for your input.
My dataset is ~100G in size with 1 or 2 read-intensive column families. The
cluster has plenty of RAM. I'll start off small with 4G of row cache and
monitor the hit rate.

Katriel


On Thu, Jan 23, 2014 at 9:17 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Wed, Jan 22, 2014 at 11:13 PM, Katriel Traum <ka...@google.com> wrote:
>
>> I was wondering if anyone has any pointers or advice regarding using the
>> row cache vs. leaving it up to the OS buffer cache.
>>
>> I run cassandra 1.1 and 1.2 with JNA, so off-heap row cache is an option.
>>
>
> Many people have had bad experiences with Row Cache; I assert more than
> have had a good experience.
>
> https://issues.apache.org/jira/browse/CASSANDRA-5357
>
> is the 2.1-era redesign of the row cache into something more conceptually
> appropriate.
>
> The rule of thumb for row cache is that if your data is:
>
> 1) very hot
> 2) very small
> 3) very uniform in size
>
> You may win with it. IMO, if you meet all of those criteria you should try
> A/B testing the on-heap cache vs. the off-heap cache in 1.1/1.2, especially
> if your cached rows are frequently updated.
>
>
> https://issues.apache.org/jira/browse/CASSANDRA-5348?focusedCommentId=13794634&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13794634
>
> =Rob
>
>

Re: Row cache vs. OS buffer cache

Posted by Robert Coli <rc...@eventbrite.com>.
On Wed, Jan 22, 2014 at 11:13 PM, Katriel Traum <ka...@google.com> wrote:

> I was wondering if anyone has any pointers or advice regarding using the
> row cache vs. leaving it up to the OS buffer cache.
>
> I run cassandra 1.1 and 1.2 with JNA, so off-heap row cache is an option.
>

Many people have had bad experiences with Row Cache; I assert more than
have had a good experience.

https://issues.apache.org/jira/browse/CASSANDRA-5357

is the 2.1-era redesign of the row cache into something more conceptually
appropriate.

The rule of thumb for row cache is that if your data is:

1) very hot
2) very small
3) very uniform in size

You may win with it. IMO, if you meet all of those criteria you should try
A/B testing the on-heap cache vs. the off-heap cache in 1.1/1.2, especially
if your cached rows are frequently updated.

https://issues.apache.org/jira/browse/CASSANDRA-5348?focusedCommentId=13794634&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13794634
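
The three criteria above (hot, small, uniform) could be turned into a quick screening check along the following lines. This is only a sketch: the function name and every threshold are hypothetical illustrations rather than values from Cassandra or from this thread, and the inputs (a sample of row sizes and the fraction of reads that hit the candidate rows) are assumed to come from your own measurements.

```python
from statistics import mean, pstdev
from typing import Sequence


def meets_rule_of_thumb(
    row_sizes_bytes: Sequence[int],
    hot_read_fraction: float,
    max_mean_bytes: int = 4096,        # hypothetical cutoff for "very small"
    max_size_variation: float = 0.25,  # hypothetical cutoff for "very uniform"
    min_hot_fraction: float = 0.8,     # hypothetical cutoff for "very hot"
) -> bool:
    """Return True only if the sampled rows pass all three criteria."""
    if not row_sizes_bytes:
        return False
    average = mean(row_sizes_bytes)
    if average <= 0:
        return False
    # Coefficient of variation: spread of row sizes relative to their mean.
    variation = pstdev(row_sizes_bytes) / average
    return (
        hot_read_fraction >= min_hot_fraction
        and average <= max_mean_bytes
        and variation <= max_size_variation
    )


if __name__ == "__main__":
    print(meets_rule_of_thumb([1000, 1100, 900, 1000], hot_read_fraction=0.9))
    print(meets_rule_of_thumb([100, 50_000], hot_read_fraction=0.9))
```

A check like this only screens candidates; it does not replace comparing the cache providers under real load.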

=Rob

Re: Row cache vs. OS buffer cache

Posted by Chris Burroughs <ch...@gmail.com>.
My experience has been that the row cache is much more effective.
However, reasonable row cache sizes are so small relative to RAM that I 
don't see it as a significant trade-off unless it's in a very memory 
constrained environment.  If you want to enable the row cache (a big if) 
you probably want it to be as big as it can be until you have reached 
the point of diminishing returns on the hit rate.
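
One way to reason about that point of diminishing returns offline is to replay a recorded trace of row keys through a simulated cache at several sizes. The sketch below assumes a plain LRU eviction policy and measures capacity in rows rather than bytes; both are simplifications and not a description of how Cassandra's row cache works. The function names and the `min_gain` threshold are hypothetical.

```python
from collections import OrderedDict
from typing import Hashable, Iterable, Sequence


def lru_hit_rate(trace: Sequence[Hashable], capacity: int) -> float:
    """Replay a trace of row keys through an LRU cache holding `capacity` rows."""
    if capacity <= 0 or not trace:
        return 0.0
    cache: OrderedDict = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used key
    return hits / len(trace)


def size_at_diminishing_returns(
    trace: Sequence[Hashable], sizes: Iterable[int], min_gain: float = 0.01
) -> int:
    """Return the first size whose next larger size adds less than `min_gain` hit rate."""
    ordered = sorted(sizes)
    if not ordered:
        raise ValueError("sizes must not be empty")
    rates = [lru_hit_rate(trace, size) for size in ordered]
    for i in range(len(ordered) - 1):
        if rates[i + 1] - rates[i] < min_gain:
            return ordered[i]
    return ordered[-1]  # hit rate was still improving at the largest size tried


if __name__ == "__main__":
    trace = ["a", "b"] * 50  # toy trace: two hot rows read alternately
    print(size_at_diminishing_returns(trace, [1, 2, 4]))
```

On a real workload the trace would come from sampled read keys, and the result should still be confirmed against the hit rate observed on a live node.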

The "off-heap" cache still has many on-heap objects, so it doesn't
really change that much conceptually; you will just end up with a
different number for the "size".

On 01/23/2014 02:13 AM, Katriel Traum wrote:
> Hello list,
>
> I was wondering if anyone has any pointers or advice regarding using the
> row cache vs. leaving it up to the OS buffer cache.
>
> I run cassandra 1.1 and 1.2 with JNA, so off-heap row cache is an option.
>
> Any input appreciated.
> Katriel
>