Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2009/09/01 20:52:32 UTC

[jira] Created: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Pool BufferedRandomAccessFile objects used by sstable reads
-----------------------------------------------------------

                 Key: CASSANDRA-408
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
             Project: Cassandra
          Issue Type: New Feature
            Reporter: Jonathan Ellis
            Assignee: Jonathan Ellis
             Fix For: 0.5
         Attachments: 408.patch, commons-pool-1.5.2.jar

Not only does opening a BRAF (BufferedRandomAccessFile) per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up, since on the JVM all primitive arrays are initialized to zero.

This adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput, which is worth 200 LOC imo.
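
The pooling idea described above can be sketched roughly as follows. This is not the code in 408.patch (which relies on commons-pool); it is a minimal JDK-only illustration. The class name ReaderPool and its methods are hypothetical, and a plain RandomAccessFile stands in for BufferedRandomAccessFile. A real BRAF also carries its read buffer, so reusing the object would additionally avoid re-allocating and re-zeroing that array.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the code in 408.patch: reuse open readers for one
// file so that each read does not pay for a fresh open.
class ReaderPool {
    private final File file;
    private final ConcurrentLinkedQueue<RandomAccessFile> idle = new ConcurrentLinkedQueue<>();
    private final AtomicInteger opened = new AtomicInteger(); // number of real opens

    ReaderPool(File file) {
        this.file = file;
    }

    // Hand out an idle reader if one exists; otherwise open a new one.
    RandomAccessFile borrow() throws IOException {
        RandomAccessFile reader = idle.poll();
        if (reader == null) {
            reader = new RandomAccessFile(file, "r");
            opened.incrementAndGet();
        }
        return reader;
    }

    // Return a reader so a later borrow() can reuse it.
    void release(RandomAccessFile reader) {
        idle.offer(reader);
    }

    int openedCount() {
        return opened.get();
    }

    // Close every idle reader; borrowed readers remain the caller's responsibility.
    void close() throws IOException {
        RandomAccessFile reader;
        while ((reader = idle.poll()) != null) {
            reader.close();
        }
    }
}
```

A caller would borrow() a reader, seek and read, then release() it in a finally block. Because this pool is per file, every sstable would need its own pool.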

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-408:
-------------------------------------

    Attachment:     (was: 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt)

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Updated: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-408:
-------------------------------------

    Attachment:     (was: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt)

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Updated: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-408:
-------------------------------------

    Fix Version/s:     (was: 0.5)
                   0.9

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Commented: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795749#action_12795749 ] 

Hudson commented on CASSANDRA-408:
----------------------------------

Integrated in Cassandra #310 (See [http://hudson.zones.apache.org/hudson/job/Cassandra/310/])
    productize mmap approach: handle files > 2GB by chunking w/ fallback to BRAF
when a row crosses chunk boundaries (you don't want to have to check for crossing
boundary in each read() call, or you'll almost certainly waste more time than
the BRAF approach); add retrying-delete to wait for mmapped files to be unmapped
by finalizer after compaction

patch by jbellis; reviewed by Brandon Williams and goffinet for 

Implement FileDataInput with MappedFileDataInput, backed by a mmap'd ByteBuffer.
patch by jbellis; reviewed by Brandon Williams and goffinet for 

add FileDataInput, implemented by BufferedRandomAccessFile
patch by jbellis; reviewed by Brandon Williams and goffinet for 
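
The "retrying-delete" mentioned in the first commit message could look something like the sketch below. It is an illustration only, not the committed code: the class and method names, the attempt count, and the sleep interval are all assumptions. The idea is that on some platforms a file cannot be deleted while it is still mapped, so the delete is retried until the mapping has been released.

```java
import java.io.File;

// Hypothetical sketch of a retrying delete; names and parameters are
// illustrative and do not come from the Cassandra patch.
final class RetryingDelete {
    private RetryingDelete() {}

    // Try to delete the file up to maxAttempts times, sleeping between tries.
    // Returns true once the file is gone, false if every attempt failed.
    static boolean deleteWithRetry(File file, int maxAttempts, long sleepMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (file.delete() || !file.exists()) {
                return true;
            }
            if (attempt < maxAttempts) {
                Thread.sleep(sleepMillis); // give the mapping time to be released
            }
        }
        return false;
    }
}
```

A caller that gets false back would probably want to log the failure and try again later rather than treat it as fatal.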


> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Updated: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-408:
-------------------------------------

    Attachment:     (was: 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt)

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Assigned: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-408:
----------------------------------------

    Assignee: Chris Goffinet  (was: Jonathan Ellis)

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.5
>
>         Attachments: 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Commented: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794825#action_12794825 ] 

Jonathan Ellis commented on CASSANDRA-408:
------------------------------------------

Using mmapped files instead should be both simpler and higher-performance.  (_Should_.  Testing required.)

We already have most of the plumbing needed w/ our phantomreferences on sstablereaders to be able to wait for GC of the MappedByteBuffer before deleting a compacted sstable, which is the major problem w/ the mmap approach.
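
A rough sketch of the phantom-reference plumbing described above, assuming a hypothetical DeferredDeleter class (none of these names come from Cassandra). The owner object stands in for an sstable reader. Note that collection of the owner is only a signal that nothing can read through it any more; the sketch does not itself prove that the underlying MappedByteBuffer has already been unmapped.

```java
import java.io.File;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: delete a file only after the object that was reading
// it has been garbage collected.
class DeferredDeleter {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Holds the phantom references strongly (so they can be enqueued) and
    // remembers which file each one guards.
    private final Map<Reference<?>, File> pending = new ConcurrentHashMap<>();

    // Register a file for deletion once owner becomes unreachable.
    void deleteWhenCollected(Object owner, File file) {
        pending.put(new PhantomReference<>(owner, queue), file);
    }

    // Delete the files whose owners the GC has already reclaimed.
    // Returns the number of files actually deleted by this call.
    int deleteReclaimed() {
        int deleted = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            File file = pending.remove(ref);
            if (file != null && file.delete()) {
                deleted++;
            }
        }
        return deleted;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Something would need to call deleteReclaimed() periodically, for example from a background thread; when the owner is actually collected is up to the garbage collector.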

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Updated: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-408:
-------------------------------------

    Attachment: 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt
                0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt
                0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Commented: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750499#action_12750499 ] 

Jonathan Ellis commented on CASSANDRA-408:
------------------------------------------

This results in not one pooled reader per thread but one pooled reader per sstable per thread.  That would be bad in pathological cases like Digg's 1200 sstables after a bulk load and before compaction.  Need to rethink the approach here.

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Updated: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-408:
-------------------------------------

    Attachment:     (was: 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt)

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Updated: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-408:
-------------------------------------

    Attachment:     (was: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt)

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Updated: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-408:
-------------------------------------

    Attachment: commons-pool-1.5.2.jar
                408.patch

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Commented: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795324#action_12795324 ] 

Jonathan Ellis commented on CASSANDRA-408:
------------------------------------------

Brandon reports 40% speed gains in a real test scenario (quad core server + clients on another machine)

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Resolved: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-408.
--------------------------------------

    Resolution: Fixed
      Assignee: Jonathan Ellis  (was: Chris Goffinet)

committed

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Commented: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795229#action_12795229 ] 

Jonathan Ellis commented on CASSANDRA-408:
------------------------------------------

Patches attached to perform mmap-backed reads.  Crappy testing on my laptop shows about a 15% speed increase w/ stress.py reads.

Old read path is still around, primarily for use on 32bit systems.  Old path is also used for compactions and for rows that cross a boundary between the 2GB chunks, which is the most the JVM lets us map at a single time (boo!).
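
The chunk arithmetic behind that fallback might be sketched like this. The ChunkMath class and its method names are hypothetical, and the actual patch may organize this differently. What is certain is that a single FileChannel.map call cannot map more than Integer.MAX_VALUE bytes, so a larger file has to be split into several mappings, and a row that straddles two of them cannot be read from a single ByteBuffer.

```java
// Hypothetical sketch of chunk arithmetic for mapping a large file in pieces.
final class ChunkMath {
    private final long chunkSize;

    // chunkSize must fit in an int because one mapping is limited to
    // Integer.MAX_VALUE bytes.
    ChunkMath(long chunkSize) {
        if (chunkSize <= 0 || chunkSize > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("chunkSize out of range: " + chunkSize);
        }
        this.chunkSize = chunkSize;
    }

    // Number of mappings needed to cover the whole file (rounded up).
    long chunkCount(long fileLength) {
        return (fileLength + chunkSize - 1) / chunkSize;
    }

    // Which chunk a file position falls into.
    int chunkIndex(long position) {
        return (int) (position / chunkSize);
    }

    // Offset of a file position within its chunk.
    int offsetInChunk(long position) {
        return (int) (position % chunkSize);
    }

    // True if the row's first and last byte lie in the same chunk, so the
    // mmap path can serve it; false means fall back to the old read path.
    boolean fitsInOneChunk(long rowStart, long rowLength) {
        if (rowLength <= 0) {
            return true;
        }
        return chunkIndex(rowStart) == chunkIndex(rowStart + rowLength - 1);
    }
}
```

Doing the boundary check once per row, rather than on every read() call, keeps the hot path free of extra branching.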

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Updated: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-408:
-------------------------------------

    Attachment: 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt
                0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt
                0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Updated: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-408:
-------------------------------------

    Attachment:     (was: 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt)

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Updated: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-408:
-------------------------------------

    Attachment: 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt
                0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt
                0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Commented: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795425#action_12795425 ] 

Jonathan Ellis commented on CASSANDRA-408:
------------------------------------------

updated to close temporary RandomAccessFile objects

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Chris Goffinet
>             Fix For: 0.9
>
>         Attachments: 0001-add-FileDataInput-implemented-by-BufferedRandomAccessF.txt, 0002-Implement-FileDataInput-with-MappedFileDataInput-backe.txt, 0003-productize-mmap-approach-handle-files-2GB-by-chunking-.txt, 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Updated: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-408:
-------------------------------------

    Component/s: Core

> Pool BufferedRandomAccessFile objects used by sstable reads
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-408
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-408
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: 408.patch, commons-pool-1.5.2.jar
>
>
> not only does BRAF per op do a whole lot of extra fopens, but the buffering actually makes it _more_ expensive to set up since on the jvm all primitive arrays are initialized to zero.
> this adds a simple read test to stress.py; I'm seeing about a 10% increase in throughput which is worth 200loc imo.



[jira] Commented: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795451#action_12795451 ] 

Jonathan Ellis commented on CASSANDRA-408:
------------------------------------------

fixed excessively slow assert statement




[jira] Commented: (CASSANDRA-408) Pool BufferedRandomAccessFile objects used by sstable reads

Posted by "Chris Goffinet (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795679#action_12795679 ] 

Chris Goffinet commented on CASSANDRA-408:
------------------------------------------

+1

