Posted to user@cassandra.apache.org by Shoaib Mir <sh...@gmail.com> on 2012/04/03 01:24:10 UTC

key cache size calculation

Hi guys,

We are working out the right key cache size. We have a column family
with ~100 million columns, and the cache size is currently set to 2
million.

I suspect that our active data does not all fit in a 2 million entry
cache, and at times query execution time is much higher than normal.
Is there a limit on key cache size? I know it is all taken from the
heap, but how large can we make the key cache?

cheers,
Shoaib

Re: key cache size calculation

Posted by Shoaib Mir <sh...@gmail.com>.
On Wed, Apr 4, 2012 at 8:04 AM, aaron morton <aa...@thelastpickle.com> wrote:

> It depends on the workload.
>
> Increase the cache size until the hit rate stops increasing, or until it
> creates memory pressure. Watch the logs for messages saying the caches
> have been reduced.
>
> Take a look at the Recent Read Latency for the CF. This is how long it
> takes to actually read data on that node. You can then work out the
> throughput, taking into account the concurrent_reads setting in
> cassandra.yaml.
>
>
Thanks Aaron, I will try this.

cheers,
Shoaib

Re: key cache size calculation

Posted by aaron morton <aa...@thelastpickle.com>.
It depends on the workload. 

Increase the cache size until the hit rate stops increasing, or until it creates memory pressure. Watch the logs for messages saying the caches have been reduced.

Take a look at the Recent Read Latency for the CF. This is how long it takes to actually read data on that node. You can then work out the throughput, taking into account the concurrent_reads setting in cassandra.yaml.
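
A rough sketch of that throughput calculation, assuming every read thread is kept busy and that the yaml setting in question is concurrent_reads: the upper bound on reads per second for a node is the number of read threads divided by the time each read takes. The function name and the example numbers are hypothetical, not measurements from this cluster.

```python
def estimated_reads_per_second(read_latency_ms: float, concurrent_reads: int) -> float:
    """Upper-bound estimate of node read throughput.

    Each read thread can complete at most 1000 / read_latency_ms reads
    per second, so the node as a whole can complete that many times the
    number of read threads. Real throughput will be lower.
    """
    if read_latency_ms <= 0:
        raise ValueError("read_latency_ms must be positive")
    if concurrent_reads <= 0:
        raise ValueError("concurrent_reads must be positive")
    return concurrent_reads * (1000.0 / read_latency_ms)


# Hypothetical inputs: 8 ms Recent Read Latency, 32 read threads.
print(estimated_reads_per_second(8.0, 32))  # 4000.0
```

If the estimate sits well above the throughput actually observed, the read path on that node is probably not the limiting factor.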

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 3/04/2012, at 4:14 PM, Shoaib Mir wrote:

> On Tue, Apr 3, 2012 at 11:49 AM, aaron morton <aa...@thelastpickle.com> wrote:
> Take a look at the key cache hit rate in nodetool cfstats. 
> 
> One approach is to increase the cache size until you do not see a matching increase in the hit rate.
> 
> 
> Thanks Aaron. What cache hit ratio should we aim for if we want this particular DB server to handle around 5-6K responses per second? Right now it handles only 2-3K per second, and the cache hit ratio I can see with cfstats is around 85-90%. Do you think raising the hit ratio to around 95% would also help us get higher throughput?
> 
> cheers,
> Shoaib
> 


Re: key cache size calculation

Posted by Shoaib Mir <sh...@gmail.com>.
On Tue, Apr 3, 2012 at 11:49 AM, aaron morton <aa...@thelastpickle.com> wrote:

> Take a look at the key cache hit rate in nodetool cfstats.
>
> One approach is to increase the cache size until you do not see a matching
> increase in the hit rate.
>


Thanks Aaron. What cache hit ratio should we aim for if we want this
particular DB server to handle around 5-6K responses per second? Right
now it handles only 2-3K per second, and the cache hit ratio I can see
with cfstats is around 85-90%. Do you think raising the hit ratio to
around 95% would also help us get higher throughput?

cheers,
Shoaib

Re: key cache size calculation

Posted by aaron morton <aa...@thelastpickle.com>.
Take a look at the key cache hit rate in nodetool cfstats. 

One approach is to increase the cache size until you do not see a matching increase in the hit rate. 
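
That approach can be sketched as a small helper, assuming you have recorded the key cache hit rate from nodetool cfstats at several cache sizes. The function name, the 1% gain threshold and the sample figures are all hypothetical.

```python
def pick_cache_size(measurements, min_gain=0.01):
    """Return the largest cache size that still bought a worthwhile gain.

    measurements: iterable of (cache_size, hit_rate) pairs, with hit_rate
    as a fraction between 0 and 1.
    min_gain: smallest hit-rate improvement considered worth the extra heap.
    """
    ordered = sorted(measurements)
    if not ordered:
        raise ValueError("need at least one measurement")
    best_size, best_rate = ordered[0]
    for size, rate in ordered[1:]:
        # Stop at the first step where the hit rate no longer rises to
        # match the larger cache.
        if rate - best_rate < min_gain:
            break
        best_size, best_rate = size, rate
    return best_size


# Hypothetical samples: (keys cached, observed hit rate).
samples = [
    (2_000_000, 0.87),
    (4_000_000, 0.92),
    (8_000_000, 0.95),
    (16_000_000, 0.953),
]
print(pick_cache_size(samples))  # 8000000
```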

> Is there a limit on key cache size? I know it is all taken from the heap, but how large can we make the key cache?

It's pretty much a memory thing. 

Each entry maps a description of the SSTable and the key to the offset in the index file. Off the top of my head: the SSTable description is shared, so the row key, the row token (16 bytes) and the offset are what take up space.
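
A back-of-the-envelope sketch of the heap cost, built only from the per-entry parts listed above. The 16-byte token is from this reply; the 8-byte offset (a Java long) and the 64 bytes of per-entry JVM and map overhead are assumptions, as are the function name and the 32-byte average key size, so treat the result as an order of magnitude rather than a measurement.

```python
def key_cache_bytes(entries, avg_key_bytes, token_bytes=16,
                    offset_bytes=8, overhead_bytes=64):
    """Rough heap estimate for a key cache holding `entries` keys.

    token_bytes: 16, as stated in the thread.
    offset_bytes: ASSUMED to be an 8-byte long.
    overhead_bytes: ASSUMED object/map overhead per entry; a guess.
    """
    per_entry = avg_key_bytes + token_bytes + offset_bytes + overhead_bytes
    return entries * per_entry


# Hypothetical: the current 2 million entry cache with 32-byte row keys.
total = key_cache_bytes(2_000_000, avg_key_bytes=32)
print(total)                       # 240000000
print(round(total / 2**20, 1))     # 228.9 (MiB)
```

Scaling the entry count up shows quickly how much heap a larger cache would claim, which is the practical limit on its size.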

Cheers
 
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 3/04/2012, at 11:24 AM, Shoaib Mir wrote:

> Hi guys,
> 
> We are working out the right key cache size. We have a column family with ~100 million columns, and the cache size is currently set to 2 million.
> 
> I suspect that our active data does not all fit in a 2 million entry cache, and at times query execution time is much higher than normal. Is there a limit on key cache size? I know it is all taken from the heap, but how large can we make the key cache?
> 
> cheers,
> Shoaib