Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2014/04/14 23:58:15 UTC

[jira] [Commented] (CASSANDRA-5863) Create a Decompressed Chunk [block] Cache

    [ https://issues.apache.org/jira/browse/CASSANDRA-5863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968908#comment-13968908 ] 

Benedict commented on CASSANDRA-5863:
-------------------------------------

I think this ticket has been languishing despite being a great idea. I suggest the following:

# This should reside off-heap
# ALL pages should (optionally) go through it, whether compressed or not: see [my comment|https://issues.apache.org/jira/browse/CASSANDRA-6995?focusedCommentId=13968892&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13968892] on CASSANDRA-6995 for the rationale
# We should use the LIRS replacement strategy
# Initially an uncompressed-only cache is probably easiest to get right, and will no doubt be a big win anyway; we can follow up soon after with a two-stage cache (given LZ4 decompression is almost as fast as a memory copy, this will no doubt improve matters further). A rough sketch of such a cache follows this list.
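
To make the shape of this concrete, here is a minimal sketch of an off-heap uncompressed chunk cache keyed by (file path, chunk offset). All names here are illustrative rather than any actual Cassandra API, and LinkedHashMap's access-order eviction (plain LRU) stands in for LIRS purely to keep the sketch short:

{code:java}
import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: an off-heap cache of uncompressed chunks, keyed by
// (file path, chunk offset). Plain LRU stands in for LIRS to keep this short.
public class ChunkCache
{
    public static final class Key
    {
        final String path;
        final long offset;

        public Key(String path, long offset)
        {
            this.path = path;
            this.offset = offset;
        }

        @Override
        public boolean equals(Object o)
        {
            if (!(o instanceof Key))
                return false;
            Key that = (Key) o;
            return offset == that.offset && path.equals(that.path);
        }

        @Override
        public int hashCode()
        {
            return path.hashCode() * 31 + (int) (offset ^ (offset >>> 32));
        }
    }

    private final long capacityBytes;
    private long sizeBytes;

    // accessOrder = true: iteration order runs least- to most-recently used
    private final LinkedHashMap<Key, ByteBuffer> chunks = new LinkedHashMap<>(16, 0.75f, true);

    public ChunkCache(long capacityBytes)
    {
        this.capacityBytes = capacityBytes;
    }

    public synchronized ByteBuffer get(Key key)
    {
        ByteBuffer cached = chunks.get(key);
        // hand out a duplicate so callers can't disturb each other's position/limit
        return cached == null ? null : cached.duplicate();
    }

    public synchronized void put(Key key, ByteBuffer uncompressed)
    {
        // copy into a direct buffer so the cached bytes live off the GC'd heap
        ByteBuffer offHeap = ByteBuffer.allocateDirect(uncompressed.remaining());
        offHeap.put(uncompressed.duplicate());
        offHeap.flip();

        ByteBuffer old = chunks.put(key, offHeap);
        sizeBytes += offHeap.capacity() - (old == null ? 0 : old.capacity());

        // evict least-recently-used chunks until we're back under capacity
        Iterator<Map.Entry<Key, ByteBuffer>> lru = chunks.entrySet().iterator();
        while (sizeBytes > capacityBytes && lru.hasNext())
        {
            sizeBytes -= lru.next().getValue().capacity();
            lru.remove();
        }
    }
}
{code}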

This will require some work to get performance on par with mmapped I/O, but the resulting control over when we go to disk will be worth it IMO.

bq. The Tricky part is tracking the "hotness" of these chunks ... This would keep things like compaction from causing churn.

I think the easiest thing to do here is simply to have a mechanism for querying the cache that, when you do have to hit the disk, does not populate it with the newly read data.
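
For instance, building on the sketch above (again, the names are mine, not Cassandra's):

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical reader over the ChunkCache sketch above. Point reads pass
// mayPopulate = true and warm the cache; one-pass readers such as compaction
// pass false, so they consult the cache but never insert into it.
public class CachingChunkReader
{
    private final ChunkCache cache;

    public CachingChunkReader(ChunkCache cache)
    {
        this.cache = cache;
    }

    public ByteBuffer read(ChunkCache.Key key, boolean mayPopulate) throws IOException
    {
        ByteBuffer cached = cache.get(key);
        if (cached != null)
            return cached;

        ByteBuffer fromDisk = readAndDecompress(key);
        if (mayPopulate) // false for compaction and other one-pass scans
            cache.put(key, fromDisk.duplicate());
        return fromDisk;
    }

    // stand-in for the real disk path: read the compressed chunk, decompress it,
    // verify its checksum
    private ByteBuffer readAndDecompress(ChunkCache.Key key) throws IOException
    {
        throw new UnsupportedOperationException("illustrative stub");
    }
}
{code}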

> Create a Decompressed Chunk [block] Cache
> -----------------------------------------
>
>                 Key: CASSANDRA-5863
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5863
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: T Jake Luciani
>            Assignee: Pavel Yaskevich
>              Labels: performance
>             Fix For: 2.1 beta2
>
>
> Currently, for every read, the CRAR (CompressedRandomAccessReader) reads each compressed chunk into a byte[], sends it to ICompressor, gets back another byte[], and verifies a checksum.  
> This process is where the majority of time is spent in a read request.  
> Before compression, reads were zero-copy: we could respond directly from the page cache.
> It would be useful to have some kind of chunk cache that could speed up this process for hot data. Initially this could be an off-heap cache, but it would be great to put these decompressed chunks onto an SSD so the hot data lives on fast storage, similar to https://github.com/facebook/flashcache.
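
To make the per-read cost described above concrete, here is a rough paraphrase of that sequence, not the actual CRAR code: the JDK's Inflater and CRC32 stand in for Cassandra's ICompressor and its checksum, and the offsets and lengths would come from the sstable's compression metadata in the real code.

{code:java}
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.zip.CRC32;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

// Paraphrase of the per-read work described above; every step is allocation
// and CPU that an uncompressed chunk cache would let hot reads skip.
public class ChunkReadSketch
{
    public static byte[] readChunk(RandomAccessFile file, long chunkOffset,
                                   int compressedLength, int uncompressedLength,
                                   long expectedChecksum) throws IOException
    {
        byte[] compressed = new byte[compressedLength];     // allocation + copy #1
        file.seek(chunkOffset);
        file.readFully(compressed);

        byte[] uncompressed = new byte[uncompressedLength]; // allocation + copy #2
        Inflater inflater = new Inflater();
        inflater.setInput(compressed, 0, compressedLength);
        try
        {
            inflater.inflate(uncompressed);                 // the decompress step
        }
        catch (DataFormatException e)
        {
            throw new IOException(e);
        }
        finally
        {
            inflater.end();
        }

        CRC32 crc = new CRC32();                            // the checksum step
        crc.update(uncompressed, 0, uncompressedLength);
        if (crc.getValue() != expectedChecksum)
            throw new IOException("corrupt chunk at offset " + chunkOffset);

        return uncompressed; // all of the above repeats on every uncached read
    }
}
{code}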


