Posted to commits@cassandra.apache.org by "Johan Oskarsson (JIRA)" <ji...@apache.org> on 2009/09/03 11:04:32 UTC

[jira] Created: (CASSANDRA-420) Improve performance of BinaryMemtable sort phase

Improve performance of BinaryMemtable sort phase
------------------------------------------------

                 Key: CASSANDRA-420
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-420
             Project: Cassandra
          Issue Type: Improvement
    Affects Versions: 0.5
            Reporter: Johan Oskarsson
            Priority: Minor
             Fix For: 0.5


The BinaryMemtable sorts an array of decorated keys. The comparator performs many string operations and object allocations that could be avoided to improve performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-420) Improve performance of BinaryMemtable sort phase

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751475#action_12751475 ] 

Jonathan Ellis commented on CASSANDRA-420:
------------------------------------------

Can you extend this to support the other partitioners as well?

Probably the easiest way is to make DK (DecoratedKey) a (Token decoration, String key) tuple, since the different Token subclasses already wrap the different decorations (BigInteger, byte[]). It may or may not be worth special-casing OPP (the OrderPreservingPartitioner, whose StringToken is just a wrapper around the key in question) to have a null decoration.
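
For readers following along, here is a minimal sketch of what such a (Token, key) tuple could look like. It is not taken from the patch or from the Cassandra source tree: the Token base class, its subclasses, and the null-decoration handling are simplified, hypothetical stand-ins, and it assumes all keys being compared come from the same partitioner.

```java
import java.math.BigInteger;

final class TokenKeySketch {
    // Hypothetical, simplified stand-in for a Token hierarchy.
    abstract static class Token<T> implements Comparable<Token<T>> {
        final T value;
        Token(T value) { this.value = value; }
    }

    // Decoration used by a hash-based partitioner (assumed).
    static final class BigIntegerToken extends Token<BigInteger> {
        BigIntegerToken(BigInteger value) { super(value); }
        @Override
        public int compareTo(Token<BigInteger> other) {
            return value.compareTo(other.value);
        }
    }

    // The tuple: a decoration plus the raw key. A null token models the
    // suggested special case where the key itself defines the order.
    static final class DecoratedKey<T> implements Comparable<DecoratedKey<T>> {
        final Token<T> token;
        final String key;

        DecoratedKey(Token<T> token, String key) {
            this.token = token;
            this.key = key;
        }

        @Override
        public int compareTo(DecoratedKey<T> other) {
            // Assumption: either all keys carry a token or none do.
            if (token == null || other.token == null) {
                return key.compareTo(other.key);
            }
            int c = token.compareTo(other.token);
            return c != 0 ? c : key.compareTo(other.key);
        }
    }

    public static void main(String[] args) {
        DecoratedKey<BigInteger> a =
            new DecoratedKey<>(new BigIntegerToken(BigInteger.valueOf(5)), "z");
        DecoratedKey<BigInteger> b =
            new DecoratedKey<>(new BigIntegerToken(BigInteger.valueOf(10)), "a");
        System.out.println(a.compareTo(b) < 0); // token order wins over key order

        DecoratedKey<String> x = new DecoratedKey<>(null, "apple");
        DecoratedKey<String> y = new DecoratedKey<>(null, "banana");
        System.out.println(x.compareTo(y) < 0); // null decoration: key order
    }
}
```

Whether the null-decoration case is worth its extra branch is exactly the open question raised above; the sketch only shows that it fits into the same comparison method.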

> Improve performance of BinaryMemtable sort phase
> ------------------------------------------------
>
>                 Key: CASSANDRA-420
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-420
>             Project: Cassandra
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: CASSANDRA-420.patch
>
>
> The BinaryMemtable sorts an array of decorated keys. The comparator performs many string operations and object allocations that could be avoided to improve performance.



[jira] Commented: (CASSANDRA-420) Improve performance of BinaryMemtable sort phase

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12755620#action_12755620 ] 

Jonathan Ellis commented on CASSANDRA-420:
------------------------------------------

Created CASSANDRA-446 to use DK even more.

> Improve performance of BinaryMemtable sort phase
> ------------------------------------------------
>
>                 Key: CASSANDRA-420
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-420
>             Project: Cassandra
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: CASSANDRA-420.patch, CASSANDRA-420.patch
>
>
> The BinaryMemtable sorts an array of decorated keys. The comparator performs many string operations and object allocations that could be avoided to improve performance.



[jira] Updated: (CASSANDRA-420) Improve performance of BinaryMemtable sort phase

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated CASSANDRA-420:
--------------------------------------

    Attachment: CASSANDRA-420.patch

Updated patch with the suggestions. Have not had a chance to do a full benchmark on this one yet as I'm having some unrelated issues. Uploading anyway so people can have a go if they so wish.

> Improve performance of BinaryMemtable sort phase
> ------------------------------------------------
>
>                 Key: CASSANDRA-420
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-420
>             Project: Cassandra
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Johan Oskarsson
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: CASSANDRA-420.patch, CASSANDRA-420.patch
>
>
> The BinaryMemtable sorts an array of decorated keys. The comparator performs many string operations and object allocations that could be avoided to improve performance.



[jira] Commented: (CASSANDRA-420) Improve performance of BinaryMemtable sort phase

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756009#action_12756009 ] 

Hudson commented on CASSANDRA-420:
----------------------------------

Integrated in Cassandra #199 (See [http://hudson.zones.apache.org/hudson/job/Cassandra/199/])
    Use DecoratedKey objects in BMt to avoid lots of expensive String parsing operations when sorting. patch by johano; reviewed by jbellis for 




[jira] Updated: (CASSANDRA-420) Improve performance of BinaryMemtable sort phase

Posted by "Johan Oskarsson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated CASSANDRA-420:
--------------------------------------

    Attachment: CASSANDRA-420.patch

This patch changes the decorated keys to be stored as an object with a BigInteger and a String as member variables, instead of packing both values into a single String. This means we can avoid a lot of the heavy lifting in the comparator.
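
To illustrate the idea (this is a sketch, not the code in the attached patch), the example below contrasts a comparator that re-parses a packed String on every comparison with a key object that parses the token once up front. The "token:key" layout, the ':' delimiter, and all class and field names here are assumptions made for the example.

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.Comparator;

final class DecoratedKeySketch {
    // Assumed old layout: "<token>:<key>" packed into one String.
    // Each comparison splits both strings and allocates two BigIntegers.
    static final Comparator<String> PACKED_STRING_ORDER = (a, b) -> {
        String[] pa = a.split(":", 2);
        String[] pb = b.split(":", 2);
        int c = new BigInteger(pa[0]).compareTo(new BigInteger(pb[0]));
        return c != 0 ? c : pa[1].compareTo(pb[1]);
    };

    // Hypothetical key object: the token is parsed once and stored.
    static final class DecoratedKey implements Comparable<DecoratedKey> {
        final BigInteger token;
        final String key;

        DecoratedKey(BigInteger token, String key) {
            this.token = token;
            this.key = key;
        }

        @Override
        public int compareTo(DecoratedKey other) {
            // No parsing or allocation here, only field comparisons.
            int c = token.compareTo(other.token);
            return c != 0 ? c : key.compareTo(other.key);
        }
    }

    // Parse the packed form once, before sorting.
    static DecoratedKey parse(String packed) {
        String[] parts = packed.split(":", 2);
        return new DecoratedKey(new BigInteger(parts[0]), parts[1]);
    }

    public static void main(String[] args) {
        String[] packed = {"30:c", "4:b", "100:a"};
        DecoratedKey[] keys = new DecoratedKey[packed.length];
        for (int i = 0; i < packed.length; i++) {
            keys[i] = parse(packed[i]);
        }

        Arrays.sort(packed, PACKED_STRING_ORDER); // parses on every comparison
        Arrays.sort(keys);                        // compares stored fields

        System.out.println(Arrays.toString(packed)); // [4:b, 30:c, 100:a]
        for (DecoratedKey k : keys) {
            System.out.println(k.token + " -> " + k.key);
        }
    }
}
```

Both sorts produce the same order; the difference is that the object version does its parsing n times in total rather than on each of the O(n log n) comparisons.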

In an unscientific mini benchmark, the sorting phase takes an order of magnitude less time with the patch applied. I also see double the throughput per node when loading data from Hadoop.

The patch still needs a bit more work (comments, etc.), but as per the IRC discussion I am putting it up so others can weigh in. Should we start using the DecoratedKey class, or a version of it, more extensively in place of the String we use now?



[jira] Commented: (CASSANDRA-420) Improve performance of BinaryMemtable sort phase

Posted by "Chris Goffinet (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751070#action_12751070 ] 

Chris Goffinet commented on CASSANDRA-420:
------------------------------------------

I can test this on our cluster today.
