Posted to commits@cassandra.apache.org by "Robert Coli (JIRA)" <ji...@apache.org> on 2013/10/15 01:36:43 UTC
[jira] [Commented] (CASSANDRA-5348) Remove on-heap row cache

    [ https://issues.apache.org/jira/browse/CASSANDRA-5348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794634#comment-13794634 ] 

Robert Coli commented on CASSANDRA-5348:
----------------------------------------

I understand and agree with the idea of removing the Row Cache, since it is likely to be hazardous to most users.

I do not, however, understand removing the on-heap Row Cache and keeping the off-heap one.

Problems with the on-heap cache:

1) if you make it too big, it consumes too much heap

Problems with the off-heap cache:

1) it still consumes heap despite being off-heap, including a marginal amount of heap on each read/write
2) it pays a serialize/deserialize penalty on each read/write
3) it invalidates the cached row on write

Other than the fact that the on-heap cache is more likely to cause problems by running out of heap if it is too large, it seems on its face to be a better implementation of the row cache concept than the off-heap row cache. If we already accept that the Row Cache is for use by people who know what they are doing... aren't those users likely to actually prefer the on-heap cache, especially in 2.0, where heap pressure is the least severe it has ever been? Is there something I'm missing about what makes the on-heap cache so bad?
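
To make the comparison above concrete, here is a minimal Java sketch of the two cache styles. It is an illustration under stated assumptions, not Cassandra's actual code: the class names (RowCacheSketch, OnHeapCache, OffHeapCache), the "column=value" text encoding, and the map-of-strings row model are all hypothetical, and the update-in-place behaviour of the on-heap variant is only what the lists above imply.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from Cassandra itself.
class RowCacheSketch {

    /** On-heap style: the cache holds live objects, so a hit is just a reference. */
    static final class OnHeapCache {
        private final Map<String, Map<String, String>> rows = new HashMap<>();

        Map<String, String> read(String key) {
            return rows.get(key); // no copy, no deserialization
        }

        void write(String key, String column, String value) {
            // Assumed behaviour: the cached row is updated in place.
            rows.computeIfAbsent(key, k -> new HashMap<>()).put(column, value);
        }
    }

    /** Off-heap style: rows live as serialized bytes in direct (non-heap) buffers. */
    static final class OffHeapCache {
        // The keys and the buffer handles still live on the heap.
        private final Map<String, ByteBuffer> rows = new HashMap<>();

        void put(String key, Map<String, String> row) {
            byte[] bytes = serialize(row);
            ByteBuffer buf = ByteBuffer.allocateDirect(bytes.length);
            buf.put(bytes);
            buf.flip();
            rows.put(key, buf);
        }

        Map<String, String> read(String key) {
            ByteBuffer buf = rows.get(key);
            if (buf == null) {
                return null;
            }
            // Every hit copies the bytes back onto the heap and rebuilds the row.
            byte[] bytes = new byte[buf.remaining()];
            buf.duplicate().get(bytes);
            return deserialize(bytes);
        }

        void write(String key) {
            rows.remove(key); // a write simply invalidates the cached row
        }
    }

    // Toy encoding: one "column=value" pair per line. It assumes column names
    // contain no '=' and neither names nor values contain newlines.
    static byte[] serialize(Map<String, String> row) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : row.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    static Map<String, String> deserialize(byte[] bytes) {
        Map<String, String> row = new HashMap<>();
        for (String line : new String(bytes, StandardCharsets.UTF_8).split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            int eq = line.indexOf('=');
            row.put(line.substring(0, eq), line.substring(eq + 1));
        }
        return row;
    }
}
```

In this sketch, two reads of the same key from OnHeapCache return the same object, while two reads from OffHeapCache each allocate a fresh heap copy, which is the "marginal heap on each read/write" cost from the list above.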

tl;dr: I +1 Sylvain's comments above, but I have some questions about on-heap vs. off-heap.

> Remove on-heap row cache
> ------------------------
>
>                 Key: CASSANDRA-5348
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5348
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 2.0 beta 1
>
>         Attachments: 5348.txt
>
>
> The row (partition) cache easily does more harm than good.  People expect it to act like a query cache but it is very different than that, especially for the wide partitions that are so common in Cassandra data models.
> Making it off-heap by default only helped a little; we still have to deserialize the partition to the heap to query it.
> Ultimately we can add a better cache based on the ideas in CASSANDRA-1956 or CASSANDRA-2864, but even if we don't get to that until 2.1, removing the old row cache for 2.0 is a good idea.



--
This message was sent by Atlassian JIRA
(v6.1#6144)