Posted to commits@cassandra.apache.org by "T Jake Luciani (JIRA)" <ji...@apache.org> on 2010/10/22 22:52:22 UTC

[jira] Created: (CASSANDRA-1651) Improve read performance by using byte array slabs

Improve read performance by using byte array slabs
--------------------------------------------------

                 Key: CASSANDRA-1651
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: T Jake Luciani
            Priority: Minor


Now that the code has switched to byte buffers internally, it should be possible to improve read performance by reducing the number of byte array allocations.

This patch accomplishes this by re-using sections of a larger byte array slab.

I've benchmarked it locally and seen a slight improvement on reads; a larger-scale benchmark should be performed.
Also, the size of a slab can be configured in cassandra.yaml.
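
For readers unfamiliar with the technique, the following is a minimal sketch of a byte array slab, not the code in the attached patch. The class name ByteArraySlab and its allocate() method are hypothetical; the sketch assumes that requests larger than the slab fall back to a dedicated array.

```java
import java.nio.ByteBuffer;

/** Hypothetical slab allocator for illustration; not taken from 1651_v1.txt. */
final class ByteArraySlab {
    private final int slabSize;
    private byte[] slab;
    private int offset;

    ByteArraySlab(int slabSize) {
        if (slabSize <= 0) {
            throw new IllegalArgumentException("slabSize must be positive");
        }
        this.slabSize = slabSize;
        this.slab = new byte[slabSize];
    }

    /** Returns a buffer of exactly {@code length} bytes, carved from the current slab when it fits. */
    synchronized ByteBuffer allocate(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be non-negative");
        }
        if (length > slabSize) {
            // Too large for any slab: fall back to a dedicated array.
            return ByteBuffer.allocate(length);
        }
        if (length > slabSize - offset) {
            // The current slab cannot hold the request; start a new one.
            // The old slab stays reachable until every buffer carved from it is garbage.
            slab = new byte[slabSize];
            offset = 0;
        }
        // slice() yields a buffer whose position 0 maps to slab[offset].
        ByteBuffer region = ByteBuffer.wrap(slab, offset, length).slice();
        offset += length;
        return region;
    }
}
```

One large array allocation then serves many small reads, at the cost that a single long-lived region keeps its whole slab reachable.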

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

T Jake Luciani updated CASSANDRA-1651:
--------------------------------------

    Attachment: 1651_v1.txt


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926430#action_12926430 ] 

Jonathan Ellis commented on CASSANDRA-1651:
-------------------------------------------

bq. Memtables?

You're right, it could be useful for column names in a memtable for "object data" CFs that contain mostly the same columns.

How would you distinguish between these and "materialized view" CFs where the interning would burn cycles for no benefit?  (Manually configured hint is fine, just curious if you thought of a better way.)

Interning memtables is mostly (entirely?) orthogonal to how we allocate ByteBuffers though:

bq. it doesn't need to be a new buffer

I was unclear.  It needs to be a new ByteBuffer (but not a new byte[] internal to that). Otherwise you have no way to compare to your source of interned ByteBuffers.  So,

1. For both writes and reads, it does matter how you create those initial ByteBuffers
2. We will want to intern column names on writes, but on reads, there is no point in dropping the initial BB for a reference to the interned one
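
To illustrate why a new ByteBuffer wrapper is needed for the lookup, here is a rough sketch of interning by content. ColumnNameInterner is a hypothetical name, not an existing Cassandra class; the sketch relies on the fact that ByteBuffer.equals() and hashCode() compare the remaining bytes rather than object identity.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical interner for column names held as ByteBuffers. */
final class ColumnNameInterner {
    // Key and value are the same canonical buffer; lookups compare content.
    private final ConcurrentMap<ByteBuffer, ByteBuffer> pool = new ConcurrentHashMap<>();

    /**
     * Returns the canonical buffer whose content equals {@code name}.
     * Callers should duplicate() the result before relative reads, because
     * moving the canonical buffer's position would change its hash code.
     */
    ByteBuffer intern(ByteBuffer name) {
        ByteBuffer existing = pool.get(name);
        if (existing != null) {
            return existing;
        }
        // First sighting: copy into a standalone array so the pooled entry
        // does not pin whatever larger array the caller's buffer wraps.
        byte[] copy = new byte[name.remaining()];
        name.duplicate().get(copy);
        ByteBuffer canonical = ByteBuffer.wrap(copy).asReadOnlyBuffer();
        ByteBuffer raced = pool.putIfAbsent(canonical, canonical);
        return raced != null ? raced : canonical;
    }
}
```

The lookup key must already exist as a ByteBuffer before intern() can be called, which is the point made above: the initial wrapper is allocated either way.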

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: 1651_v1.txt


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926397#action_12926397 ] 

T Jake Luciani commented on CASSANDRA-1651:
-------------------------------------------

The idea is to minimize allocation on intermediate byte[] copies. This will help resource usage and performance overall.
I think making a copy to go into the cache is worth that benefit for non-cached reads.

As for string interning (1255), I'm not sure how it relates to this issue. intern() only affects strings, and when we create a string from a ByteBuffer the String class makes its own copy of the data. So it would not affect anything here, AFAIK.


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926413#action_12926413 ] 

Jonathan Ellis commented on CASSANDRA-1651:
-------------------------------------------

bq. if a key/name was going to live in memory for a while, we would intern it

Is there a hole in this reasoning?

 1. the only place we have these live in memory for a while is the key and row caches; everything else only lasts for the duration of a read, which is a small number of ms at most (i.e. less than the time between new or old gen GC runs)
 2. if you are reading the same sequence multiple times then you should be using a key/row cache; it is silly to discuss interning w/o this
 3. we have to read (or wrap) the sequence into a new buffer to do the cache lookup; at this point the allocation has been done, and replacing it with an interned version doesn't matter, since GCing now vs. GCing when the read finishes in a ms is going to be part of the same GC generation


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926359#action_12926359 ] 

T Jake Luciani commented on CASSANDRA-1651:
-------------------------------------------

That's a good point. I suppose before it goes into the row cache we could copy it into a standalone array via .get(byte[])?
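
A minimal sketch of that copy, assuming the value arrives as a heap ByteBuffer that views part of a larger slab. SlabCopies and detach() are hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper; shows the .get(byte[]) copy suggested above. */
final class SlabCopies {
    private SlabCopies() {}

    /** Copies the remaining bytes of {@code source} into a new array of exactly that size. */
    static ByteBuffer detach(ByteBuffer source) {
        byte[] copy = new byte[source.remaining()];
        // duplicate() leaves the caller's position and limit untouched.
        source.duplicate().get(copy);
        return ByteBuffer.wrap(copy);
    }
}
```

Caching the detached buffer instead of the slab view means a cached entry retains only its own bytes, not the entire slab.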


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925832#action_12925832 ] 

T Jake Luciani commented on CASSANDRA-1651:
-------------------------------------------

Regarding 1)

https://svn.apache.org/viewvc/lucene/dev/trunk/lucene/src/java/org/apache/lucene/util/RecyclingByteBlockAllocator.java?view=markup

This class has a recycleByteBlocks() call that requires the caller to inform this class when buffers are no longer in use.
Since we don't have the ability to do this now, it's not a good fit to me.


One other approach I'm trying is to use WeakHashRefs and ReferenceQueues to track when a buffer is ready to be GC'd and re-use it then.
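
A rough sketch of how such tracking could look, assuming "WeakHashRefs" means java.lang.ref.WeakReference registered with a ReferenceQueue. WeakRecyclingAllocator is a hypothetical class, not code from the patch. The sketch is only safe if callers never retain buffer.array(), slice() or duplicate() results beyond the life of the buffer they were handed, since those keep the array alive without keeping the tracked buffer alive.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical allocator that recycles a block once its ByteBuffer wrapper has been collected. */
final class WeakRecyclingAllocator {
    private final int blockSize;
    private final ReferenceQueue<ByteBuffer> collected = new ReferenceQueue<>();
    // The allocator holds the block strongly and the wrapper only weakly.
    private final Map<Reference<? extends ByteBuffer>, byte[]> inUse = new HashMap<>();
    private final Deque<byte[]> free = new ArrayDeque<>();

    WeakRecyclingAllocator(int blockSize) {
        this.blockSize = blockSize;
    }

    synchronized ByteBuffer allocate() {
        reclaim();
        // Recycled blocks still contain old data; callers must overwrite before reading.
        byte[] block = free.isEmpty() ? new byte[blockSize] : free.pop();
        ByteBuffer buffer = ByteBuffer.wrap(block);
        inUse.put(new WeakReference<>(buffer, collected), block);
        return buffer;
    }

    /** Moves the blocks of wrappers the GC has already cleared onto the free list. */
    private void reclaim() {
        Reference<? extends ByteBuffer> ref;
        while ((ref = collected.poll()) != null) {
            byte[] block = inUse.remove(ref);
            if (block != null) {
                free.push(block);
            }
        }
    }
}
```

Unlike an explicit recycle call, reuse here depends on when the collector runs, so how much it helps would need to be measured.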



[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925824#action_12925824 ] 

Jonathan Ellis commented on CASSANDRA-1651:
-------------------------------------------

Before committing this I want to

 1. evaluate the Lucene allocator
 2. find where the point of diminishing returns is (I suspect fairly low) and set slab size to that rather than exposing a tunable


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925827#action_12925827 ] 

T Jake Luciani commented on CASSANDRA-1651:
-------------------------------------------

Regarding 2): I still think this should be tunable, since if a workload has columns with very large data then users may want to increase the slab size to fit them; otherwise they can't take advantage of this.


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926371#action_12926371 ] 

Stu Hood commented on CASSANDRA-1651:
-------------------------------------

I wonder if we shouldn't look into 1255 before adding too many special cases here?


[jira] Resolved: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

T Jake Luciani resolved CASSANDRA-1651.
---------------------------------------

    Resolution: Duplicate

CASSANDRA-1714


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Ryan King (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924028#action_12924028 ] 

Ryan King commented on CASSANDRA-1651:
--------------------------------------

If we're going to expose the tunable for the size of the slab, we should also measure its effectiveness. Otherwise you'd often be tuning in the dark.


[jira] Updated: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1651:
--------------------------------------


It looks like this is only addressing the shortbytearray (row keys and column names) parts, but not column values? Column values can be larger, but in most workloads the majority will fit in a slab.

Also, it seems to me that for reading from the disk or network the "right" way is to wrap segments of the buffer the reader is filling, rather than filling the read buffer and then copying to a slab buffer a second time.
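
As an illustration of wrapping rather than copying, the sketch below returns a view over part of an already-filled read buffer. It assumes a hypothetical encoding in which each value is preceded by an unsigned 2-byte length; LengthPrefixedReader and readShortLengthSlice() are illustrative names, not existing Cassandra methods.

```java
import java.nio.ByteBuffer;

/** Hypothetical reader; assumes values are prefixed with an unsigned 2-byte length. */
final class LengthPrefixedReader {
    private LengthPrefixedReader() {}

    /** Returns a view over the next value in {@code in} without copying its bytes. */
    static ByteBuffer readShortLengthSlice(ByteBuffer in) {
        int length = in.getShort() & 0xFFFF;
        // slice() shares the underlying storage, starting at in's current position.
        ByteBuffer view = in.slice();
        // Throws IllegalArgumentException if the prefix claims more bytes than remain.
        view.limit(length);
        // Skip past the value so the next call sees the following length prefix.
        in.position(in.position() + length);
        return view;
    }
}
```

The trade-off is retention: each view keeps the whole read buffer reachable, so anything long-lived would still need its own copy.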



[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926424#action_12926424 ] 

Stu Hood commented on CASSANDRA-1651:
-------------------------------------

> 1. ... everything else only lasts for the duration of a read which is small numbers of ms at most
> 2. ... so we can dismiss this scenario as uninteresting
Memtables?

> 3. we have to read (or wrap) the sequence into a new buffer to do the cache lookup
No, it doesn't need to be a new buffer. In fact, it can be the same buffer for every single key/name/value that is read from disk.


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925840#action_12925840 ] 

T Jake Luciani commented on CASSANDRA-1651:
-------------------------------------------

Leave it out or just don't show it in cassandra.yaml and set the default (would work that way now if you removed the yaml entry)


[jira] Updated: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1651:
--------------------------------------

    Fix Version/s: 0.7.1
         Assignee: T Jake Luciani


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926403#action_12926403 ] 

Stu Hood commented on CASSANDRA-1651:
-------------------------------------

> As for string interning(1255)
That ticket doesn't say anything about strings.

I mentioned it here because if a key/name was going to live in memory for a while, we would intern it, which would give us a clear boundary on which to perform the ByteBuffer copy you mention. The slabs would still be used to perform the initial interning lookup, and we would still need explicit copies for cached values.

> Maybe we should just disable it where the row cache is enabled
Note that this problem applies to the key-cache as well.


[jira] Issue Comment Edited: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925223#action_12925223 ] 

Jonathan Ellis edited comment on CASSANDRA-1651 at 10/26/10 8:41 PM:
---------------------------------------------------------------------

Can we improve FileDataInput to have a getBytes() method that in the mmap case just performs a wrap?  DirectByteBuffer ftw :)

EDIT: let's make that a separate ticket.
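
For context, a sketch of what a wrapping getBytes() could look like over a memory-mapped file. FileDataInput has no such method in this thread, so MappedSegment, getBytes() and map() below are all hypothetical; only the java.nio calls are real. A MappedByteBuffer is a ByteBuffer, so the same getBytes() applies to it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper showing a zero-copy getBytes() over a mapped region. */
final class MappedSegment {
    private MappedSegment() {}

    /** Maps a whole file read-only; the mapping stays valid after the channel is closed. */
    static MappedByteBuffer map(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    /** Returns a view of {@code length} bytes starting at {@code offset}, without copying. */
    static ByteBuffer getBytes(ByteBuffer mapped, int offset, int length) {
        // Work on a duplicate so the shared buffer's own position is never moved.
        ByteBuffer view = mapped.duplicate();
        view.position(offset);
        view.limit(offset + length);
        return view.slice();
    }
}
```

The returned view is backed by the mapping itself, so no heap array is allocated for the value.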

      was (Author: jbellis):
    Can we improve FileDataInput to have a getBytes() method that in the mmap case just performs a wrap?  DirectByteBuffer ftw :)
  

[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Ryan King (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924042#action_12924042 ] 

Ryan King commented on CASSANDRA-1651:
--------------------------------------

Also, Lucene has a RecyclingByteBlockAllocator that we might want to just steal: https://hudson.apache.org/hudson/job/Lucene-trunk/javadoc/core/org/apache/lucene/util/RecyclingByteBlockAllocator.html


[jira] Issue Comment Edited: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924064#action_12924064 ] 

T Jake Luciani edited comment on CASSANDRA-1651 at 10/22/10 8:52 PM:
---------------------------------------------------------------------

Right, this was simply an attempt to get this started. I'm happy to keep going.

Also, the patch also works with FBUtilities.readByteArray() which is used for column deserialization.

      was (Author: tjake):
    Right, this was simply an attempt to get this started. I'm happy to keep going.
  
> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>            Priority: Minor
>             Fix For: 0.7.1
>
>         Attachments: 1651_v1.txt


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924064#action_12924064 ] 

T Jake Luciani commented on CASSANDRA-1651:
-------------------------------------------

Right, this was simply an attempt to get this started. I'm happy to keep going.

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>            Priority: Minor
>             Fix For: 0.7.1
>
>         Attachments: 1651_v1.txt


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926320#action_12926320 ] 

Jonathan Ellis commented on CASSANDRA-1651:
-------------------------------------------

Isn't this dangerous with the row cache, where in the worst case each cached row now keeps an entire slab (1MB, for instance) from being GC'd?
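
To illustrate the concern, here is a minimal sketch, not taken from the attached patch. It assumes values are handed out as ByteBuffer views over a shared slab; the class and method names (SlabPinningSketch, viewOf, retainedBytes) are hypothetical. A view created with wrap() and slice() keeps a reference to the whole backing array, so a 16-byte value held in a cache keeps the full slab reachable.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the code from 1651_v1.txt.
final class SlabPinningSketch {
    // Returns a view over part of the slab. The view shares the slab's backing array.
    static ByteBuffer viewOf(byte[] slab, int offset, int length) {
        return ByteBuffer.wrap(slab, offset, length).slice();
    }

    // Size of the heap array that stays reachable for as long as buf is reachable.
    static int retainedBytes(ByteBuffer buf) {
        return buf.array().length;
    }

    public static void main(String[] args) {
        byte[] slab = new byte[1024 * 1024]; // the 1MB slab from the example above
        ByteBuffer cachedValue = viewOf(slab, 128, 16);
        System.out.println(cachedValue.remaining() + " usable bytes, "
                + retainedBytes(cachedValue) + " bytes retained");
    }
}
```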

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: 1651_v1.txt


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926409#action_12926409 ] 

Jonathan Ellis commented on CASSANDRA-1651:
-------------------------------------------

bq. The idea is to minimize allocation on intermediate byte[] copies

Once you have the ByteBuffer you're good to go there, whether you created it via wrap or allocate. The only question is whether it's easier to write something like

{code}
if cfs.row_cache_enabled:
  allocate
else:
  wrap
{code}

or to deep-copy the CF into allocate()d ByteBuffers after the fact. The first is going to be more performant (allocating once up front, vs. wrapping and then allocating later), so the question is whether we can make the code sane.
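
The branch in the pseudocode above could look roughly like the following in Java. This is a sketch under assumptions: the rowCacheEnabled flag stands in for cfs.row_cache_enabled, and WrapOrAllocateSketch.read is a hypothetical helper, not an existing Cassandra method.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of "allocate when the row cache is on, wrap otherwise".
final class WrapOrAllocateSketch {
    // source stands in for a slab or read buffer that is shared with other values.
    static ByteBuffer read(byte[] source, int offset, int length, boolean rowCacheEnabled) {
        if (rowCacheEnabled) {
            // The value may live in the cache for a long time: give it its own array.
            ByteBuffer copy = ByteBuffer.allocate(length);
            copy.put(source, offset, length);
            copy.flip();
            return copy;
        }
        // Short-lived value: share the source array and skip the copy.
        return ByteBuffer.wrap(source, offset, length).slice();
    }

    public static void main(String[] args) {
        byte[] source = {10, 20, 30, 40, 50};
        ByteBuffer cached = read(source, 1, 3, true);
        ByteBuffer shortLived = read(source, 1, 3, false);
        System.out.println("cached shares source: " + (cached.array() == source));
        System.out.println("short-lived shares source: " + (shortLived.array() == source));
    }
}
```

Both branches return buffers with the same contents; only the ownership of the backing array differs, which is what matters for how long the shared array stays reachable.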

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: 1651_v1.txt


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925836#action_12925836 ] 

Jonathan Ellis commented on CASSANDRA-1651:
-------------------------------------------

bq. Regarding 2) I still think this should be tunable, since if a workload has columns with very large data then they may want to increase the slab size to fit this; otherwise they can't take advantage of this.

I think we should leave it out until/unless we actually find such a case.  I suspect that the byte[] allocation is not going to be the major source of overhead for workloads like that, so YAGNI.

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: 1651_v1.txt


[jira] Updated: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

T Jake Luciani updated CASSANDRA-1651:
--------------------------------------

    Attachment:     (was: 1651_v1.patch)

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Priority: Minor
>         Attachments: 1651_v1.txt


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924035#action_12924035 ] 

Jonathan Ellis commented on CASSANDRA-1651:
-------------------------------------------

It would make sense to me for the slab size to be ColumnIndexSizeInKB, since that's the unit we deserialize in for many queries.
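
For readers who have not opened the attachment, a slab allocator sized this way might be sketched as follows. This is an assumption-laden illustration, not the patch itself: ByteArraySlabSketch and its allocate method are hypothetical names, the slab size is passed in as a KB value in the way a ColumnIndexSizeInKB setting would be, and the class is not thread-safe.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of carving small buffers out of one larger byte array.
final class ByteArraySlabSketch {
    private final int slabSize;
    private byte[] slab;
    private int offset;

    ByteArraySlabSketch(int slabSizeInKB) {
        this.slabSize = slabSizeInKB * 1024;
        this.slab = new byte[slabSize];
    }

    ByteBuffer allocate(int length) {
        if (length > slabSize) {
            // Too large for any slab: fall back to a dedicated array.
            return ByteBuffer.allocate(length);
        }
        if (offset + length > slabSize) {
            // Current slab is exhausted; start a new one. The old slab becomes
            // collectable once every view handed out from it is unreachable.
            slab = new byte[slabSize];
            offset = 0;
        }
        ByteBuffer view = ByteBuffer.wrap(slab, offset, length).slice();
        offset += length;
        return view;
    }

    public static void main(String[] args) {
        ByteArraySlabSketch slabs = new ByteArraySlabSketch(64); // 64 is an illustrative value
        ByteBuffer name = slabs.allocate(100);
        ByteBuffer value = slabs.allocate(200);
        System.out.println("same backing array: " + (name.array() == value.array()));
    }
}
```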

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Priority: Minor
>         Attachments: 1651_v1.txt


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924053#action_12924053 ] 

Stu Hood commented on CASSANDRA-1651:
-------------------------------------

I'm not sure how I feel about slab allocation... we're setting ourselves up to deal with fragmentation in slabs, etc. Pools of ByteBuffers (similar to Lucene) would be my preference.
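
As a rough picture of the pooling alternative, the sketch below recycles whole fixed-size buffers instead of slicing a slab. It is not Lucene's API; ByteBufferPoolSketch, acquire, and release are hypothetical names chosen for illustration, and callers are assumed to return each buffer exactly once.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical illustration of a bounded pool of equally sized buffers.
final class ByteBufferPoolSketch {
    private final int bufferSize;
    private final int maxPooled;
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();

    ByteBufferPoolSketch(int bufferSize, int maxPooled) {
        this.bufferSize = bufferSize;
        this.maxPooled = maxPooled;
    }

    // Hands out a recycled buffer when one is available, otherwise a new one.
    synchronized ByteBuffer acquire() {
        ByteBuffer buf = free.pollFirst();
        return buf != null ? buf : ByteBuffer.allocate(bufferSize);
    }

    // Returns a buffer to the pool; extra or foreign buffers are left to the GC.
    synchronized void release(ByteBuffer buf) {
        if (buf.capacity() == bufferSize && free.size() < maxPooled) {
            buf.clear();
            free.addFirst(buf);
        }
    }

    public static void main(String[] args) {
        ByteBufferPoolSketch pool = new ByteBufferPoolSketch(4096, 8);
        ByteBuffer first = pool.acquire();
        pool.release(first);
        System.out.println("recycled: " + (pool.acquire() == first));
    }
}
```

Because every buffer has the same size and is returned whole, there is no per-slab fragmentation to track, at the cost of requiring an explicit release.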

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Priority: Minor
>         Attachments: 1651_v1.txt


[jira] Issue Comment Edited: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926413#action_12926413 ] 

Jonathan Ellis edited comment on CASSANDRA-1651 at 10/29/10 3:21 PM:
---------------------------------------------------------------------

bq. if a key/name was going to live in memory for a while, we would intern it

Is there a hole in this reasoning?

 1. the only place we have these live in memory for a while is the key and row caches; everything else only lasts for the duration of a read, which is a small number of ms at most (i.e. less than the time between new or old gen GC runs)
 2. if you are reading the same sequence multiple times without using key/row cache then you are not configuring optimally, so we can dismiss this scenario as uninteresting
 3. we have to read (or wrap) the sequence into a new buffer to do the cache lookup; at this point the allocation has been done, and replacing it with an interned version doesn't matter, since GCing now vs. GCing when the read finishes a ms later is going to be part of the same GC generation

      was (Author: jbellis):
    bq. if a key/name was going to live in memory for a while, we would intern it

Is there a hole in this reasoning?

 1. the only place we have these live in memory for a while is the key and row caches; everything else only lasts for the duration of a read, which is a small number of ms at most (i.e. less than the time between new or old gen GC runs)
 2. if you are reading the same sequence multiple times then you should be using a key/row cache; it is silly to discuss interning w/o this
 3. we have to read (or wrap) the sequence into a new buffer to do the cache lookup; at this point the allocation has been done, and replacing it with an interned version doesn't matter, since GCing now vs. GCing when the read finishes a ms later is going to be part of the same GC generation
  
> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: 1651_v1.txt


[jira] Issue Comment Edited: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924064#action_12924064 ] 

T Jake Luciani edited comment on CASSANDRA-1651 at 10/22/10 8:53 PM:
---------------------------------------------------------------------

Right, this was simply an attempt to get this started. I'm happy to keep going.

Also, the patch does work with FBUtilities.readByteArray() which is used for column deserialization.

      was (Author: tjake):
    Right, this was simply an attempt to get this started. I'm happy to keep going.

Also, the patch also works with FBUtilities.readByteArray() which is used for column deserialization.
  
> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>            Priority: Minor
>             Fix For: 0.7.1
>
>         Attachments: 1651_v1.txt


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926364#action_12926364 ] 

Jonathan Ellis commented on CASSANDRA-1651:
-------------------------------------------

Maybe we should just disable it where the row cache is enabled.  If you have a high hit rate then allocation doesn't matter.

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: 1651_v1.txt


[jira] Updated: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1651:
--------------------------------

    Comment: was deleted

(was: I'm not sure how I feel about slab allocation... we're setting ourselves up to deal with fragmentation in slabs, etc. Pools of ByteBuffers (similar to Lucene) would be my preference.)

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Priority: Minor
>         Attachments: 1651_v1.txt


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924058#action_12924058 ] 

Jonathan Ellis commented on CASSANDRA-1651:
-------------------------------------------

bq. the "right" way is to wrap segments of the buffer the reader is filling

Admittedly this is a much bigger change and should probably be split into another ticket.

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Priority: Minor
>         Attachments: 1651_v1.txt


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925223#action_12925223 ] 

Jonathan Ellis commented on CASSANDRA-1651:
-------------------------------------------

Can we improve FileDataInput to have a getBytes() method that in the mmap case just performs a wrap?  DirectByteBuffer ftw :)
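
One way to picture such a method is the sketch below. It is only an illustration: MappedInputSketch is a hypothetical class, not the real FileDataInput, and it assumes the input holds a ByteBuffer such as the MappedByteBuffer returned by FileChannel.map. getBytes returns a view that shares the underlying memory and advances the read position, so no bytes are copied.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of a zero-copy getBytes() over a mapped buffer.
final class MappedInputSketch {
    private final ByteBuffer buffer; // e.g. a MappedByteBuffer from FileChannel.map

    MappedInputSketch(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    // Returns the next length bytes as a view over the same memory, then advances.
    ByteBuffer getBytes(int length) {
        ByteBuffer view = buffer.duplicate();
        view.limit(view.position() + length); // throws if fewer than length bytes remain
        buffer.position(buffer.position() + length);
        return view.slice();
    }

    public static void main(String[] args) {
        // A direct buffer stands in for a memory-mapped file in this demo.
        ByteBuffer direct = ByteBuffer.allocateDirect(6);
        direct.put(new byte[] {1, 2, 3, 4, 5, 6});
        direct.flip();
        MappedInputSketch in = new MappedInputSketch(direct);
        ByteBuffer first = in.getBytes(2);
        ByteBuffer next = in.getBytes(3);
        System.out.println(first.remaining() + " then " + next.remaining() + " bytes");
    }
}
```

A view returned this way keeps the mapping reachable for as long as the view itself is, which is the same retention trade-off discussed for slabs.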

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>            Priority: Minor
>             Fix For: 0.7.1
>
>         Attachments: 1651_v1.txt


[jira] Commented: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928763#action_12928763 ] 

Jonathan Ellis commented on CASSANDRA-1651:
-------------------------------------------

Created CASSANDRA-1714, which is a more complete approach to the problem this is trying to address, but also a more involved one.

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>            Priority: Minor
>             Fix For: 0.7.1
>
>         Attachments: 1651_v1.txt


[jira] Updated: (CASSANDRA-1651) Improve read performance by using byte array slabs

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

T Jake Luciani updated CASSANDRA-1651:
--------------------------------------

    Attachment: 1651_v1.patch

> Improve read performance by using byte array slabs
> --------------------------------------------------
>
>                 Key: CASSANDRA-1651
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1651
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>            Priority: Minor
>         Attachments: 1651_v1.patch