Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2012/06/13 23:03:44 UTC

[jira] [Created] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

Jonathan Ellis created CASSANDRA-4338:
-----------------------------------------

             Summary: Experiment with direct buffer in SequentialWriter
                 Key: CASSANDRA-4338
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Jonathan Ellis
            Assignee: Yuki Morishita
            Priority: Minor
             Fix For: 1.2


Using a direct buffer instead of a heap-based byte[] should let us avoid a copy into native memory when we flush the buffer.
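
For illustration only, here is a minimal sketch of the idea, not the actual SequentialWriter code. It assumes a hypothetical Writer class that buffers bytes in a direct ByteBuffer and flushes through a FileChannel. The class name, buffer size, and temp-file name are all made up for the example. As far as I know, OpenJDK copies a heap buffer into a temporary direct buffer before the native write, which is the copy a direct buffer should let us skip.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; this is not Cassandra's SequentialWriter.
final class DirectBufferFlushSketch {

    // Buffers writes in off-heap memory and flushes them to a FileChannel.
    static final class Writer implements AutoCloseable {
        private final FileChannel channel;
        private final ByteBuffer buffer;

        Writer(Path path, int bufferSize) throws IOException {
            this.channel = FileChannel.open(path,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE,
                    StandardOpenOption.TRUNCATE_EXISTING);
            // Off-heap allocation: the channel can hand this memory to the OS directly.
            this.buffer = ByteBuffer.allocateDirect(bufferSize);
        }

        // Copies src into the buffer, flushing whenever the buffer fills up.
        void write(byte[] src) throws IOException {
            int offset = 0;
            while (offset < src.length) {
                if (!buffer.hasRemaining()) {
                    flush();
                }
                int n = Math.min(buffer.remaining(), src.length - offset);
                buffer.put(src, offset, n);
                offset += n;
            }
        }

        // Writes all buffered bytes to the file, then resets the buffer for reuse.
        void flush() throws IOException {
            buffer.flip();
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            buffer.clear();
        }

        @Override
        public void close() throws IOException {
            flush();
            channel.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("direct-writer", ".db");
        try (Writer writer = new Writer(path, 16)) {
            writer.write("flushed through a 16-byte direct buffer".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println(Files.size(path) + " bytes written");
        Files.delete(path);
    }
}
```

Whether skipping the copy is measurable in practice depends on the workload, which is what this ticket sets out to test.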

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

Posted by "Yuki Morishita (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-4338:
--------------------------------------

    Attachment: 4338-gc.tar.gz

With LCS and sstable_size_in_mb set to 1, I ran the stress test again (n=20,000,000) and collected GC logs for both the patched build and trunk (logs attached).

The GC logging settings are below:
{code}
# GC logging options -- uncomment to enable
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure"
JVM_OPTS="$JVM_OPTS -XX:PrintFLSStatistics=1"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc-`date +%s`.log"
{code}

So far, both logs show no indication of promotion failure.

bq. we might need to look at using our "cleaner" hack to free the direct buffers, or use a buffer based on FreeableMemory, to avoid the phantomreference crap that DirectBuffer normally inflicts on GC

My current implementation does not free direct ByteBuffers. Because allocating a direct ByteBuffer is relatively expensive, it allocates one large buffer at startup, slices it into fixed-size blocks, and pools those blocks for reuse. So I don't think switching to FreeableMemory or Unsafe would bring any improvement here.
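
A rough sketch of that pooling scheme, for readers who have not looked at the branch. This is not the code from the patch: the class name, method names, and the choice of an ArrayBlockingQueue are assumptions made for the example. It allocates one direct slab, slices it into equally sized blocks, and hands the blocks out for reuse.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch; names and structure are not taken from the patch.
final class DirectBufferPoolSketch {
    private final BlockingQueue<ByteBuffer> free;

    DirectBufferPoolSketch(int blockSize, int blockCount) {
        this.free = new ArrayBlockingQueue<>(blockCount);
        // One large off-heap allocation, made once at startup.
        ByteBuffer slab = ByteBuffer.allocateDirect(blockSize * blockCount);
        for (int i = 0; i < blockCount; i++) {
            // Narrow the slab's window to block i, then slice it out.
            slab.limit((i + 1) * blockSize);
            slab.position(i * blockSize);
            // slice() shares the slab's memory; the slice starts at position 0
            // and has capacity blockSize.
            free.add(slab.slice());
        }
    }

    // Borrows a block; fails fast if the pool is exhausted.
    ByteBuffer acquire() {
        ByteBuffer block = free.poll();
        if (block == null) {
            throw new IllegalStateException("no free blocks in pool");
        }
        return block;
    }

    // Returns a block to the pool, resetting position and limit for the next user.
    void release(ByteBuffer block) {
        block.clear();
        free.add(block);
    }

    // Number of blocks currently available.
    int available() {
        return free.size();
    }

    public static void main(String[] args) {
        DirectBufferPoolSketch pool = new DirectBufferPoolSketch(64 * 1024, 4);
        ByteBuffer block = pool.acquire();
        block.putLong(42L);
        pool.release(block);
        System.out.println("free blocks: " + pool.available());
    }
}
```

Because the slab is never released, no per-buffer cleanup runs at GC time. The trade-off is that the pool's off-heap memory stays reserved for the life of the process.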

I will start looking at the reading side (RAR/CRAR) and see what we can do there.
                


        

[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439010#comment-13439010 ] 

Jonathan Ellis commented on CASSANDRA-4338:
-------------------------------------------

Any difference in CPU usage with the direct buffer patch?  If we're not maxing out the CPU, then it wouldn't necessarily run faster even if it's more efficient.
                


        

[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295771#comment-13295771 ] 

Jonathan Ellis commented on CASSANDRA-4338:
-------------------------------------------

Using direct buffers for RAR and CRAR may also help avoid heap fragmentation.
                


        

[jira] [Updated] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4338:
--------------------------------------

    Fix Version/s:     (was: 1.2.0)
    


[jira] [Updated] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

Posted by "Yuki Morishita (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-4338:
--------------------------------------

    Attachment: gc-trunk.png
                gc-4338-patched.png

I implemented SequentialWriter with a direct ByteBuffer and ran a test on it. The code is pushed here: https://github.com/yukim/cassandra/tree/4338-3

Test: 'cassandra-stress -Z LeveledCompactionStrategy -n 20000000' against an empty trunk node and an empty patched node.

The two attached screenshots show GC stats. Unfortunately, I cannot see a significant difference between the two. Any suggestions on implementation or testing?
                


        

[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401848#comment-13401848 ] 

Jonathan Ellis commented on CASSANDRA-4338:
-------------------------------------------

I'd vote for:

- test with LCS, with/without compression (maybe even reduce sstable size to 1MB to really stress sstable creation)
- enable gc logging, count promotion failures so we have quantitative data (if we see zero both ways, we may need a more complex test)
- if instead we see nonzero promotion failures both ways, at about the same rate, we might need to look at using our "cleaner" hack to free the direct buffers, or use a buffer based on FreeableMemory, to avoid the phantomreference crap that DirectBuffer normally inflicts on GC
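
One possible way to get the quantitative count from the second bullet is sketched below. It assumes that promotion failures appear in the GC log as the text "promotion failed", which is what ParNew/CMS logs produced with -XX:+PrintGCDetails usually contain. That marker should be checked against the actual logs before trusting the numbers. The class name is made up for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper; counts promotion-failure markers in GC log files.
final class PromotionFailureCounter {
    // Assumed marker text; verify it against the real log output.
    static final String MARKER = "promotion failed";

    // Counts every occurrence of MARKER, including several on one line.
    static long count(List<String> lines) {
        long total = 0;
        for (String line : lines) {
            int from = 0;
            while ((from = line.indexOf(MARKER, from)) >= 0) {
                total++;
                from += MARKER.length();
            }
        }
        return total;
    }

    // Usage: java PromotionFailureCounter.java <gc-log> [<gc-log> ...]
    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            Path log = Paths.get(arg);
            // ISO-8859-1 maps every byte to a character, so stray bytes cannot abort the read.
            List<String> lines = Files.readAllLines(log, StandardCharsets.ISO_8859_1);
            System.out.println(log + ": " + count(lines) + " promotion failure(s)");
        }
    }
}
```

Running it over the trunk log and the patched log gives two numbers to compare directly, instead of eyeballing graphs.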
                
