Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2010/02/16 20:15:28 UTC

[jira] Created: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

memtable sort is the bottleneck for range query performance
-----------------------------------------------------------

                 Key: CASSANDRA-799
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-799
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Jonathan Ellis
            Assignee: Jonathan Ellis
            Priority: Minor
             Fix For: 0.6


The obvious remedy is to use a sorted map.  Unfortunately, keeping the map sorted constantly w/ TreeMap was about 30% slower than HashMap + sort back when we were doing manual locking.  Let's see what the overhead is for ConcurrentSkipListMap vs NBHM (NonBlockingHashMap).
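
To make the trade-off concrete, here is a minimal sketch of the two approaches to a range query. The class and method names are hypothetical and plain String keys stand in for real row keys; this is not the code from the attached patch. The hash map variant has to copy and sort every key on each query, while the skip list variant pays for ordering at insert time and serves a range as a view.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentNavigableMap;

// Hypothetical sketch: names are illustrative, not Cassandra's.
class RangeScanSketch {
    // Hash map approach: each range query copies and sorts all keys first.
    static List<String> rangeFromHashMap(Map<String, String> rows, String start, String end) {
        List<String> keys = new ArrayList<>(rows.keySet());
        Collections.sort(keys);
        List<String> result = new ArrayList<>();
        for (String key : keys) {
            if (key.compareTo(start) >= 0 && key.compareTo(end) <= 0) {
                result.add(key);
            }
        }
        return result;
    }

    // Sorted map approach: keys are already ordered, so the range is a sub-map view.
    static List<String> rangeFromSkipList(ConcurrentNavigableMap<String, String> rows,
                                          String start, String end) {
        // Both bounds inclusive; requires start <= end.
        return new ArrayList<>(rows.subMap(start, true, end, true).keySet());
    }
}
```

Both methods return the same keys for the same data; the difference is where the ordering cost is paid (per query versus per insert).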

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834462#action_12834462 ] 

Stu Hood commented on CASSANDRA-799:
------------------------------------

> hence the TODO there.
Missed that... apologies.



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834916#action_12834916 ] 

Stu Hood commented on CASSANDRA-799:
------------------------------------

It seems weird to me that a Memtable removes itself from the list of memtables in the CFS, which is a symptom of some very tight coupling.

Shouldn't a CFS own a Memtable, give it a handle to an open SSTableWriter to flush itself, close the writer when it is done, and remove the Memtable? The easiest way to accomplish this would be to move flushAndSignal to CFS, and change writeSortedContents to a void on Flushable that takes an open SSTableWriter. Then you could probably remove the Memtable reference to the CFS entirely.
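
A rough sketch of the ownership model described above, under stated assumptions: every type here is a simplified stand-in with a hypothetical name (SketchWriter, SketchFlushable, OwnedMemtable, OwningStore, RecordingWriter), not the real SSTableWriter, Flushable, Memtable, or ColumnFamilyStore. The point is only that the store opens and closes the writer and manages the pending list, so the memtable needs no reference back to it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for an open SSTable writer.
interface SketchWriter {
    void append(String key, String value);
    void close();
}

// The flushable only writes its contents; it never opens or closes the writer.
interface SketchFlushable {
    void writeSortedContents(SketchWriter writer);
}

class OwnedMemtable implements SketchFlushable {
    private final ConcurrentSkipListMap<String, String> rows = new ConcurrentSkipListMap<>();

    void put(String key, String value) { rows.put(key, value); }

    @Override
    public void writeSortedContents(SketchWriter writer) {
        for (Map.Entry<String, String> e : rows.entrySet()) {
            writer.append(e.getKey(), e.getValue());
        }
    }
}

class OwningStore {
    private final List<SketchFlushable> pendingFlush = new CopyOnWriteArrayList<>();

    void submitFlush(SketchFlushable memtable) { pendingFlush.add(memtable); }

    // The store drives the flush: hand over the writer, close it, then drop the memtable.
    void flushAndSignal(SketchFlushable memtable, SketchWriter writer) {
        try {
            memtable.writeSortedContents(writer);
        } finally {
            writer.close();
        }
        pendingFlush.remove(memtable);
    }

    int pendingFlushCount() { return pendingFlush.size(); }
}

// Test double that records what was written.
class RecordingWriter implements SketchWriter {
    final List<String> keys = new ArrayList<>();
    boolean closed = false;

    @Override public void append(String key, String value) { keys.add(key); }
    @Override public void close() { closed = true; }
}
```

In this sketch the memtable holds no reference to the store, which is the decoupling the comment is after.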



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834934#action_12834934 ] 

Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------

Stu, there is tension here between the goals of "decouple" and "encapsulate": either CFS cheats and does part of the work that should be in MT/BMT, special-casing as necessary, the way we had it before, or we let MT remove itself from the pending list.

Since MT is already reaching into CFS in places (this is far from the only place, check again; the way it was before, it was looking up CFS by table/cf name, which isn't really decoupled, it's just sloppy), I think I prefer the latter, since then at least we are being consistent.



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834525#action_12834525 ] 

Brandon Williams commented on CASSANDRA-799:
--------------------------------------------

FLUSH-SORTER-POOL and MEMTABLE-POST-FLUSHER both seem to be backing up equally.



[jira] Updated: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-799:
-------------------------------

    Attachment: 799-example.diff

This is what I had in mind, although I managed to break a testcase.

Honestly, I want to get your fix in so we can branch 0.6... I'm not really concerned about which solution we go with.



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834430#action_12834430 ] 

Brandon Williams commented on CASSANDRA-799:
--------------------------------------------

Performance was great with OPP: I saw no penalties with inserts/reads, and get_range_slice was as fast as it is without this patch when all tables are flushed.  However, RP began giving me timeouts with stress.py as memtables were backing up for flushing.



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834449#action_12834449 ] 

Stu Hood commented on CASSANDRA-799:
------------------------------------

Using getSortedKeys followed by writeSortedContents is O(n*log(n)) rather than O(n)... I know that isn't the bottleneck, but we might as well do this right.
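
One reading of this point, sketched with hypothetical names (this is not the patch code): if the memtable contents already live in a sorted map, listing the keys and then looking each one up again costs a O(log n) get() per key, while a single pass over the entries visits each row once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch of two ways to emit sorted memtable contents.
class FlushCostSketch {
    // Two steps: walk the sorted keys, then look each one up again.
    // Each get() on a skip list is O(log n), so the total is O(n log n).
    static List<String> flushByKeyLookup(ConcurrentSkipListMap<String, String> rows) {
        List<String> out = new ArrayList<>();
        for (String key : rows.keySet()) {
            out.add(key + "=" + rows.get(key));
        }
        return out;
    }

    // One step: iterate the entries in key order, O(n) overall.
    static List<String> flushByEntryIteration(ConcurrentSkipListMap<String, String> rows) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : rows.entrySet()) {
            out.add(e.getKey() + "=" + e.getValue());
        }
        return out;
    }
}
```

Both methods produce the same output when nothing else modifies the map during the flush; only the amount of work differs.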



[jira] Updated: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-799:
-------------------------------------

    Attachment: 0003-refactor-to-make-memtablesPendingFlush-a-member-variab.txt
                0002-use-a-sorted-map-for-memtable-contents-to-make-range-q.txt
                0001-refactor-IFlushable-contract-to-push-differences-b-t-M.txt



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834503#action_12834503 ] 

Brandon Williams commented on CASSANDRA-799:
--------------------------------------------

Unfortunately I'm not seeing any difference with the revised patch.



[jira] Resolved: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-799.
--------------------------------------

    Resolution: Fixed

committed polymorphic version -- having MT only use the write executor, but BMT needing both, is too much if/elsing for my taste in CFS.  thanks for taking a stab at that though, stu.
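
For readers following along, a minimal sketch of what "polymorphic" could look like here. The interface name, the method signature, and the use of java.util.concurrent.Executor are assumptions made for illustration, not the committed code: each flushable decides for itself which executors it needs, so the caller has no if/else on the memtable type.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;

// Hypothetical contract: the caller passes both executors and lets the
// implementation choose which ones to use.
interface FlushableSketch {
    void flushAndSignal(CountDownLatch done, Executor sorter, Executor writer);
}

// Contents are kept sorted already, so only the write executor is used.
class SortedMemtableSketch implements FlushableSketch {
    final List<String> log = Collections.synchronizedList(new ArrayList<String>());

    @Override
    public void flushAndSignal(CountDownLatch done, Executor sorter, Executor writer) {
        writer.execute(() -> {
            log.add("write");
            done.countDown();
        });
    }
}

// Unsorted binary contents: sort first, then hand off to the write executor.
class BinaryMemtableSketch implements FlushableSketch {
    final List<String> log = Collections.synchronizedList(new ArrayList<String>());

    @Override
    public void flushAndSignal(CountDownLatch done, Executor sorter, Executor writer) {
        sorter.execute(() -> {
            log.add("sort");
            writer.execute(() -> {
                log.add("write");
                done.countDown();
            });
        });
    }
}
```

The caller only ever invokes flushAndSignal on the interface; the MT/BMT difference stays inside the two classes.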



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834451#action_12834451 ] 

Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------

hence the TODO there.



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834534#action_12834534 ] 

Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------

something is weird, sort should be a no-op with this patch



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834920#action_12834920 ] 

Brandon Williams commented on CASSANDRA-799:
--------------------------------------------

Performance is still great with the new patchset.



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834849#action_12834849 ] 

Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------

todo-free patchset attached.



[jira] Updated: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-799:
-------------------------------------

    Attachment: 799.txt



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834456#action_12834456 ] 

Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------

Brandon, can you try reverting the change I sneaked in to CFS.flushWriter_ (capping the flush write queue) and see if that makes the timeouts go away under RP?
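
As a rough illustration of what capping the flush write queue means, here is a sketch with plain java.util.concurrent classes. The class name, the pool size of one thread, and the CallerRunsPolicy rejection handler are all assumptions; the thread does not say how the actual change handles a full queue.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a capped versus uncapped flush writer executor.
class FlushWriterSketch {
    // Capped queue: once `capacity` flushes are waiting, further submissions run
    // on the submitting thread (assumed policy), which slows producers down.
    static ThreadPoolExecutor bounded(int capacity) {
        return new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(capacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Uncapped queue: submissions never block, but pending flushes can pile up in memory.
    static ThreadPoolExecutor unbounded() {
        return new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }
}
```

With a capped queue, slow flushes push back on writers, which could surface as client timeouts under a heavy insert load; the uncapped variant trades that for unbounded queue growth.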



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834508#action_12834508 ] 

Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------

which executor specifically is backing up?



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834936#action_12834936 ] 

Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------

happy to take a look at your alternate patch tho to see how it works.



[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835180#action_12835180 ] 

Hudson commented on CASSANDRA-799:
----------------------------------

Integrated in Cassandra #360 (See [http://hudson.zones.apache.org/hudson/job/Cassandra/360/])
    refactor to make memtablesPendingFlush a member variable instead of a static, and Memtable to have a reference to CFS instead of table/cfname pair.
patch by jbellis; reviewed by Stu Hood for 
use a sorted map for memtable contents to make range queries not have to sort every time
patch by jbellis; reviewed by Stu Hood for 
refactor IFlushable contract to push differences b/t Mt and BMT into their respective classes
patch by jbellis; reviewed by Stu Hood for 




[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834547#action_12834547 ] 

Brandon Williams commented on CASSANDRA-799:
--------------------------------------------

Turns out I haven't done single node benchmarks recently enough; r906627 is to blame for making things 'too fast' and thus causing too much backpressure.  This patch is fine.  +1



[jira] Updated: (CASSANDRA-799) memtable sort is the bottleneck for range query performance

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-799:
-------------------------------------

    Attachment: 799-unbounded-flushwriter.txt
