Posted to commits@cassandra.apache.org by "Pavel Yaskevich (JIRA)" <ji...@apache.org> on 2012/12/11 08:39:22 UTC

[jira] [Comment Edited] (CASSANDRA-5020) Time to switch back to byte[] internally?

    [ https://issues.apache.org/jira/browse/CASSANDRA-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528754#comment-13528754 ] 

Pavel Yaskevich edited comment on CASSANDRA-5020 at 12/11/12 7:37 AM:
----------------------------------------------------------------------

How about we use madvise(MADV_DONTNEED) and mincore calls to check whether a file is still in use, instead of waiting for GC to clean up or copying contents on read? Basically, when a file is scheduled for deletion, its page cache would be dropped and then checked with mincore until 3 checks in succession return empty results. The interval could be set to 20-30 seconds, since we know we are only waiting for the system to "post-process" (e.g. send pre-existing buffers to the client/coordinator) and there is no way for new data to be read for that file.
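
The "3 checks in succession" rule could be sketched roughly as below. This is only an illustration of the counting logic, not code from any patch: standard Java has no API for madvise or mincore, so the residency probe is a hypothetical stand-in (it would need a JNI/JNA binding), and the names DeletionGate and noPagesResident are made up for this example.

```java
import java.util.function.BooleanSupplier;

/**
 * Decides when a file scheduled for deletion can really be removed.
 * The probe is a hypothetical stand-in for a native mincore call: it should
 * return true when no pages of the file's mapping are reported as resident.
 */
final class DeletionGate {
    private final BooleanSupplier noPagesResident; // assumption: wraps a native mincore check
    private final int requiredEmptyChecks;         // 3 in the proposal above
    private int consecutiveEmpty = 0;

    DeletionGate(BooleanSupplier noPagesResident, int requiredEmptyChecks) {
        this.noPagesResident = noPagesResident;
        this.requiredEmptyChecks = requiredEmptyChecks;
    }

    /**
     * Runs one check. Intended to be called once per interval (20-30 seconds
     * in the proposal). Returns true once enough checks in a row came back empty.
     */
    boolean check() {
        if (noPagesResident.getAsBoolean()) {
            consecutiveEmpty++;
        } else {
            consecutiveEmpty = 0; // the file was touched again, start over
        }
        return consecutiveEmpty >= requiredEmptyChecks;
    }
}
```

A caller would presumably drive check() from a scheduled task and delete the file the first time it returns true.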

Edit: as a second option, we could mmap segments on demand and hold them via WeakReference so they can be reclaimed when no longer needed; the overhead of the extra mmap calls would need to be measured.
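
A minimal sketch of that second option, under assumptions: LazySegment and its fields are hypothetical names, and this is not how Cassandra's segment classes are actually written. It maps the segment on first use and remaps it if the previous mapping has been garbage collected.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.ref.WeakReference;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical segment that is mmap'd on demand and only weakly held. */
final class LazySegment {
    private final Path path;
    private final long offset;
    private final long length;
    private WeakReference<MappedByteBuffer> ref = new WeakReference<>(null);

    LazySegment(Path path, long offset, long length) {
        this.path = path;
        this.offset = offset;
        this.length = length;
    }

    /** Returns the live mapping, or creates a new one if it was reclaimed. */
    synchronized MappedByteBuffer buffer() {
        MappedByteBuffer buf = ref.get();
        if (buf == null) {
            // The mapping stays valid after the channel is closed.
            try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
                buf = ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            ref = new WeakReference<>(buf);
        }
        return buf;
    }
}
```

Each remap costs an open plus an mmap call, which is the overhead the comment says would have to be measured.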
                
      was (Author: xedin):
    How about we use madvise(MADV_DONTNEED) and mincore calls to check whether a file is still in use, instead of waiting for GC to clean up or copying contents on read? Basically, when a file is scheduled for deletion, its page cache would be dropped and then checked with mincore until 3 checks in succession return empty results. The interval could be set to 20-30 seconds, since we know we are only waiting for the system to "post-process" (e.g. send pre-existing buffers to the client/coordinator) and there is no way for new data to be read for that file.
                  
> Time to switch back to byte[] internally?
> -----------------------------------------
>
>                 Key: CASSANDRA-5020
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5020
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>             Fix For: 2.0
>
>
> We switched to ByteBuffer for column names and values back in 0.7, which gave us a short-term performance boost on mmap'd reads, but we gave that up when we switched to refcounted sstables in 1.0.  (refcounting all the way up the read path would be too painful, so we copy into an on-heap buffer when reading from an sstable, then release the reference.)
> A HeapByteBuffer wastes a lot of memory compared to a byte[] (5 more ints, a long, and a boolean).
> The hard problem here is how to do the arena allocation we do on writes, which has been very successful in reducing STW CMS from heap fragmentation.  ByteBuffer is a good fit there.
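
To illustrate why ByteBuffer fits arena allocation, here is a rough sketch. It is not Cassandra's actual allocator; SlabArena and regionSize are hypothetical names. Many small values are carved out of one large backing array as ByteBuffer slices, so the heap holds a few big arrays rather than many small ones. A plain byte[] cannot express such a slice without a separate offset and length.

```java
import java.nio.ByteBuffer;

/** Hypothetical arena: hands out slices of large, fixed-size regions. */
final class SlabArena {
    private final int regionSize;
    private ByteBuffer region;

    SlabArena(int regionSize) {
        this.regionSize = regionSize;
        this.region = ByteBuffer.allocate(regionSize);
    }

    /** Returns a buffer of exactly {@code size} bytes backed by the current region. */
    synchronized ByteBuffer allocate(int size) {
        if (size > regionSize) {
            return ByteBuffer.allocate(size); // oversized values get their own array
        }
        if (region.remaining() < size) {
            region = ByteBuffer.allocate(regionSize); // current region is full, start a new one
        }
        ByteBuffer view = region.duplicate();
        view.limit(view.position() + size);
        region.position(region.position() + size); // bump the allocation pointer
        return view.slice(); // shares the region's backing array
    }
}
```

Each slice is still a full ByteBuffer object, so the per-object overhead mentioned in the description remains; the sketch only shows the sharing of backing arrays.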

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira