Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2010/02/16 20:15:28 UTC
[jira] Created: (CASSANDRA-799) memtable sort is the bottleneck for
range query performance
memtable sort is the bottleneck for range query performance
-----------------------------------------------------------
Key: CASSANDRA-799
URL: https://issues.apache.org/jira/browse/CASSANDRA-799
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
Fix For: 0.6
The obvious remedy is to use a sorted map. Unfortunately, keeping the map sorted constantly w/ TreeMap was about 30% slower than HashMap + sort back when we were doing manual locking. Let's see what the overhead is for ConcurrentSkipListMap vs NBHM.
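To make the trade-off concrete, here is a minimal sketch of the two approaches to a range query. It is not the Cassandra code: the class and method names are hypothetical, String keys stand in for row keys, and java.util.concurrent.ConcurrentHashMap stands in for NBHM. The unsorted map has to sort its keys on every range query, while the skip list pays for ordering at insert time and can hand back a range as a view.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical illustration only; names are not taken from the Cassandra source.
final class RangeScanSketch {
    // Unsorted map: every range query copies and sorts the full key set first.
    static List<String> rangeWithSort(Map<String, String> rows, String start, String end) {
        List<String> keys = new ArrayList<>(rows.keySet());
        Collections.sort(keys); // O(n log n) on each query
        List<String> result = new ArrayList<>();
        for (String key : keys) {
            if (key.compareTo(start) >= 0 && key.compareTo(end) <= 0) {
                result.add(key);
            }
        }
        return result;
    }

    // Sorted map: keys are ordered on insert, so the range is an inclusive sub-map view.
    static List<String> rangeSorted(ConcurrentSkipListMap<String, String> rows,
                                    String start, String end) {
        return new ArrayList<>(rows.subMap(start, true, end, true).keySet());
    }

    public static void main(String[] args) {
        Map<String, String> unsorted = new ConcurrentHashMap<>();
        ConcurrentSkipListMap<String, String> sorted = new ConcurrentSkipListMap<>();
        for (String key : new String[] {"k3", "k1", "k4", "k2"}) {
            unsorted.put(key, "v");
            sorted.put(key, "v");
        }
        System.out.println(rangeWithSort(unsorted, "k2", "k3")); // prints [k2, k3]
        System.out.println(rangeSorted(sorted, "k2", "k3"));     // prints [k2, k3]
    }
}
```

The sketch says nothing about the relative insert cost, which is the open question in the description above.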
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834462#action_12834462 ]
Stu Hood commented on CASSANDRA-799:
------------------------------------
> hence the TODO there.
Missed that... apologies.
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834916#action_12834916 ]
Stu Hood commented on CASSANDRA-799:
------------------------------------
It seems weird to me that a Memtable removes itself from the list of memtables in the CFS, which is a symptom of some very tight coupling.
Shouldn't a CFS own a Memtable, give it a handle to an open SSTableWriter to flush itself, close the writer when it is done, and remove the Memtable? The easiest way to accomplish this would be to move flushAndSignal to CFS, and change writeSortedContents to a void on Flushable that takes an open SSTableWriter. Then you could probably remove the Memtable reference to the CFS entirely.
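A rough sketch of the ownership model proposed in this comment, under heavy assumptions: RowWriter, Flushable, Memtable and Store below are simplified, hypothetical stand-ins, not the real SSTableWriter, IFlushable, Memtable or ColumnFamilyStore classes, and the flush runs synchronously rather than on an executor. The point is only that the store opens the writer, hands it to the memtable, closes it and drops the memtable from its pending list, so the memtable needs no reference back to the store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical illustration only; none of these types mirror the real Cassandra classes.
final class FlushOwnershipSketch {
    // Stand-in for an open SSTableWriter: records rows in the order they are appended.
    static final class RowWriter {
        final List<String> written = new ArrayList<>();
        boolean closed;
        void append(String key, String value) { written.add(key + "=" + value); }
        void close() { closed = true; }
    }

    // Flush contract: a void method that writes into a writer supplied by the caller.
    interface Flushable {
        void writeSortedContents(RowWriter writer);
    }

    // The memtable holds no reference to the store that owns it.
    static final class Memtable implements Flushable {
        private final ConcurrentSkipListMap<String, String> rows = new ConcurrentSkipListMap<>();
        void put(String key, String value) { rows.put(key, value); }
        @Override
        public void writeSortedContents(RowWriter writer) {
            for (Map.Entry<String, String> e : rows.entrySet()) {
                writer.append(e.getKey(), e.getValue());
            }
        }
    }

    // The store tracks pending flushes, owns the writer lifecycle, and removes the memtable.
    static final class Store {
        final List<Flushable> pendingFlush = new CopyOnWriteArrayList<>();
        RowWriter flush(Flushable memtable) {
            pendingFlush.add(memtable);
            RowWriter writer = new RowWriter();
            try {
                memtable.writeSortedContents(writer);
            } finally {
                writer.close();
                pendingFlush.remove(memtable);
            }
            return writer;
        }
    }

    public static void main(String[] args) {
        Memtable memtable = new Memtable();
        memtable.put("b", "2");
        memtable.put("a", "1");
        System.out.println(new Store().flush(memtable).written); // prints [a=1, b=2]
    }
}
```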
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834934#action_12834934 ]
Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------
stu, there is tension here between the goals of "decouple" and "encapsulate" -- either CFS cheats and does part of the work that should be in MT/BMT, special casing as necessary, the way we had it before, or we let MT r/m itself from the pending list.
since MT is already reaching in to CFS in places (this is far from the only place, check again -- the way it was before it was looking up CFS by table/cf name, which isn't really decoupled, it's just sloppy) I think I prefer the latter since then at least we are being consistent.
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834525#action_12834525 ]
Brandon Williams commented on CASSANDRA-799:
--------------------------------------------
FLUSH-SORTER-POOL and MEMTABLE-POST-FLUSHER both seem to stack equally.
[jira] Updated: (CASSANDRA-799) memtable sort is the bottleneck for
range query performance
Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stu Hood updated CASSANDRA-799:
-------------------------------
Attachment: 799-example.diff
This is what I had in mind, although I managed to break a testcase.
Honestly, I want to get your fix in so we can branch 0.6... I'm not really concerned about which solution we go with.
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834430#action_12834430 ]
Brandon Williams commented on CASSANDRA-799:
--------------------------------------------
Performance was great with OPP, I saw no penalties with inserts/reads and get_range_slice was as fast as it is without this patch and all tables flushed. However the RP began giving me timeouts with stress.py as memtables were backing up for flushing.
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834449#action_12834449 ]
Stu Hood commented on CASSANDRA-799:
------------------------------------
Using getSortedKeys followed by writeSortedContents is n*log(n) rather than n... I know that isn't the bottleneck, but we might as well do this right.
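One reading of this complexity point, sketched with hypothetical names (this is not the patch under review): once the contents live in a sorted map, walking the key set and looking each key up again costs an O(log n) search per key, O(n log n) in total, whereas walking the entries visits each node once, O(n). Both produce the same output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical illustration only; method names are not taken from the patch.
final class FlushIterationSketch {
    // Keys first, then one skip-list lookup per key: n lookups at O(log n) each.
    static List<String> writeByKeyLookup(ConcurrentSkipListMap<String, String> rows) {
        List<String> out = new ArrayList<>();
        for (String key : rows.keySet()) {
            out.add(key + "=" + rows.get(key));
        }
        return out;
    }

    // Single pass over the entries in key order: no per-key lookup.
    static List<String> writeByEntryScan(ConcurrentSkipListMap<String, String> rows) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> entry : rows.entrySet()) {
            out.add(entry.getKey() + "=" + entry.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<String, String> rows = new ConcurrentSkipListMap<>();
        rows.put("k2", "b");
        rows.put("k1", "a");
        System.out.println(writeByKeyLookup(rows)); // prints [k1=a, k2=b]
        System.out.println(writeByEntryScan(rows)); // prints [k1=a, k2=b]
    }
}
```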
[jira] Updated: (CASSANDRA-799) memtable sort is the bottleneck for
range query performance
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-799:
-------------------------------------
Attachment: 0003-refactor-to-make-memtablesPendingFlush-a-member-variab.txt
0002-use-a-sorted-map-for-memtable-contents-to-make-range-q.txt
0001-refactor-IFlushable-contract-to-push-differences-b-t-M.txt
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834503#action_12834503 ]
Brandon Williams commented on CASSANDRA-799:
--------------------------------------------
Unfortunately I'm not seeing any difference with the revised patch.
[jira] Resolved: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis resolved CASSANDRA-799.
--------------------------------------
Resolution: Fixed
committed polymorphic version -- having MT only use the write executor, but BMT needing both, is too much if/elsing for my taste in CFS. thanks for taking a stab at that though, stu.
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834451#action_12834451 ]
Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------
hence the TODO there.
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834534#action_12834534 ]
Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------
something is weird, sort should be a no-op with this patch
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834920#action_12834920 ]
Brandon Williams commented on CASSANDRA-799:
--------------------------------------------
Performance is still great with the new patchset.
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834849#action_12834849 ]
Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------
todo-free patchset attached.
[jira] Updated: (CASSANDRA-799) memtable sort is the bottleneck for
range query performance
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-799:
-------------------------------------
Attachment: 799.txt
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834456#action_12834456 ]
Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------
Brandon, can you try reverting the change I sneaked in to CFS.flushWriter_ (capping the flush write queue) and see if that makes the timeouts go away under RP?
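For readers unfamiliar with what capping an executor's queue does, here is a small sketch. It assumes "capping the flush write queue" means giving the executor a bounded work queue; the class name, pool size and queue capacity are hypothetical and not taken from CFS.flushWriter_. The sketch uses ThreadPoolExecutor's default rejection policy purely to make the limit visible. Whether the real executor rejects, blocks the submitter, or does something else when full is not stated in this thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; sizes and names are assumptions.
final class FlushQueueSketch {
    // Submits slow "flush" tasks to a single-threaded executor and counts how many are accepted.
    static int countAccepted(BlockingQueue<Runnable> queue, int submissions) {
        CountDownLatch release = new CountDownLatch(1);
        // One worker thread; the default handler throws when the queue is full.
        ThreadPoolExecutor flushWriter =
                new ThreadPoolExecutor(1, 1, 0L, TimeUnit.SECONDS, queue);
        int accepted = 0;
        try {
            for (int i = 0; i < submissions; i++) {
                try {
                    flushWriter.execute(() -> {
                        try {
                            release.await(); // simulate a flush that has not finished yet
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                    accepted++;
                } catch (RejectedExecutionException e) {
                    // Worker busy and queue full: the cap has been reached.
                }
            }
        } finally {
            release.countDown();
            flushWriter.shutdown();
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Capped queue: one task running plus one queued, the third is turned away.
        System.out.println(countAccepted(new ArrayBlockingQueue<>(1), 3)); // prints 2
        // Unbounded queue: every task is accepted and waits its turn.
        System.out.println(countAccepted(new LinkedBlockingQueue<>(), 3)); // prints 3
    }
}
```

With a cap, pressure from slow flushes reaches the submitter as soon as the queue fills; with an unbounded queue the work piles up in memory instead.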
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834508#action_12834508 ]
Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------
which executor specifically is backing up?
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834936#action_12834936 ]
Jonathan Ellis commented on CASSANDRA-799:
------------------------------------------
happy to take a look at your alternate patch tho to see how it works.
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835180#action_12835180 ]
Hudson commented on CASSANDRA-799:
----------------------------------
Integrated in Cassandra #360 (See [http://hudson.zones.apache.org/hudson/job/Cassandra/360/])
refactor to make memtablesPendingFlush a member variable instead of a static, and Memtable to have a reference to CFS instead of table/cfname pair.
patch by jbellis; reviewed by Stu Hood for
use a sorted map for memtable contents to make range queries not have to sort every time
patch by jbellis; reviewed by Stu Hood for
refactor IFlushable contract to push differences b/t Mt and BMT into their respective classes
patch by jbellis; reviewed by Stu Hood for
[jira] Commented: (CASSANDRA-799) memtable sort is the bottleneck
for range query performance
Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834547#action_12834547 ]
Brandon Williams commented on CASSANDRA-799:
--------------------------------------------
Turns out I haven't done single node benchmarks recently enough; r906627 is to blame for making things 'too fast' and thus causing too much backpressure. This patch is fine. +1
[jira] Updated: (CASSANDRA-799) memtable sort is the bottleneck for
range query performance
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-799:
-------------------------------------
Attachment: 799-unbounded-flushwriter.txt