Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2009/10/23 00:15:59 UTC

[jira] Created: (CASSANDRA-510) reading from large supercolumns is excessively slow

reading from large supercolumns is excessively slow
---------------------------------------------------

                 Key: CASSANDRA-510
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-510
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Jonathan Ellis
            Assignee: Jonathan Ellis
             Fix For: 0.5




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-510) reading from large supercolumns is excessively slow

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773235#action_12773235 ] 

Jonathan Ellis commented on CASSANDRA-510:
------------------------------------------

> Would it be useful to include a test that verifies that an unintended sub column is *not* removed? 

sure.  could you write one? :)

> Why does cloneMe() call markForDeleteAt? 

to set the local and nonlocal deletion times.  (which have "sentinel" values indicating "not really deleted" rather than a separate boolean flag.  so if there has been no delete it's a no-op.)
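As an illustration of the sentinel approach described in this reply, here is a minimal Java sketch. The class name DeletionMarker, the sentinel constants, and the method signatures are assumptions made for illustration and are not taken from the Cassandra source. Only the idea comes from the comment above: the deletion times hold "not really deleted" sentinels, so copying them during cloneMe() changes nothing when there has been no delete.

```java
// Hypothetical sketch; names and signatures are illustrative, not Cassandra's.
final class DeletionMarker {
    // Assumed sentinel values meaning "not really deleted".
    static final int NOT_DELETED_LOCAL = Integer.MIN_VALUE;
    static final long NOT_DELETED_AT = Long.MIN_VALUE;

    private int localDeletionTime = NOT_DELETED_LOCAL;
    private long markedForDeleteAt = NOT_DELETED_AT;

    // Records both the local deletion time and the (nonlocal) delete timestamp.
    void markForDeleteAt(int localDeletionTime, long timestamp) {
        this.localDeletionTime = localDeletionTime;
        this.markedForDeleteAt = timestamp;
    }

    // No separate boolean flag: the sentinel itself encodes "live".
    boolean isMarkedForDelete() {
        return markedForDeleteAt > NOT_DELETED_AT;
    }

    // Copying the two times is a no-op for a live object, because the
    // copy already starts out with the same sentinel values.
    DeletionMarker cloneMe() {
        DeletionMarker copy = new DeletionMarker();
        copy.markForDeleteAt(localDeletionTime, markedForDeleteAt);
        return copy;
    }
}
```

With this shape, a clone of a never-deleted object still reports itself as live, while a clone of a deleted object keeps the original deletion times.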

> reading from large supercolumns is excessively slow
> ---------------------------------------------------
>
>                 Key: CASSANDRA-510
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-510
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: 0001-CASSANDRA-510-don-t-removeDeleted-on-the-whole-CF-befo.txt, 0002-convert-removeDeleted-on-SC-to-remove-oriented-instead.txt
>
>




[jira] Commented: (CASSANDRA-510) reading from large supercolumns is excessively slow

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773534#action_12773534 ] 

Jonathan Ellis commented on CASSANDRA-510:
------------------------------------------

what's happening is that it is being suppressed by the

        rm.delete(new QueryPath("Super1", "SC1".getBytes()), 1);

in another test method.

different test classes are run in different jvms w/ the junit fork option (this is the only sane way to clean out all the stuff that isn't meant to be cleaned out in a running Cassandra), but methods w/in the same class are not.  so you have to be a little extra careful.

I created a new columnfamily Super3 in the test config and had this test use that, and it passes as expected now.

I'll commit like that.
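The interaction described above can be sketched roughly as follows. This is a toy model, not Cassandra code: the class SharedStoreSketch, its methods, and the rule that a write is hidden unless its timestamp is strictly greater than the tombstone's are all assumptions for illustration. It also assumes the new test wrote with a timestamp no higher than the 1 used by the delete in the other test method.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for state shared by all test methods in one JVM.
final class SharedStoreSketch {
    private static final Map<String, Long> writes = new HashMap<>();
    private static final Map<String, Long> tombstones = new HashMap<>();

    private static String key(String columnFamily, String superColumn) {
        return columnFamily + "/" + superColumn;
    }

    // Remember the newest write timestamp for a supercolumn.
    static void insert(String columnFamily, String superColumn, long timestamp) {
        writes.merge(key(columnFamily, superColumn), timestamp, Math::max);
    }

    // Remember the newest delete timestamp for a supercolumn.
    static void delete(String columnFamily, String superColumn, long timestamp) {
        tombstones.merge(key(columnFamily, superColumn), timestamp, Math::max);
    }

    // Assumed rule: data is visible only if written after the latest delete.
    static boolean isVisible(String columnFamily, String superColumn) {
        Long written = writes.get(key(columnFamily, superColumn));
        if (written == null) {
            return false;
        }
        Long deleted = tombstones.get(key(columnFamily, superColumn));
        return deleted == null || written > deleted;
    }
}
```

In this model, a delete of Super1/SC1 at timestamp 1 left behind by one test method hides a later insert at timestamp 0 from another method in the same class, while the same insert into a separate column family such as Super3 is unaffected.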



[jira] Updated: (CASSANDRA-510) reading from large supercolumns is excessively slow

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-510:
-------------------------------------

    Attachment: 510.patch

don't removeDeleted on the whole CF before filtering what the request was for; it's expensive.  also, fixes subcolumn queries being counted twice in readStats
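A rough sketch of why the ordering matters, under simplifying assumptions: the Column class, the boolean deleted flag, and the method names below are hypothetical and much simpler than Cassandra's real column model. The counter only exists to show how many columns each ordering has to examine.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not Cassandra's actual read path.
final class FilterFirstSketch {
    static final class Column {
        final String name;
        final boolean deleted;

        Column(String name, boolean deleted) {
            this.name = name;
            this.deleted = deleted;
        }
    }

    // Counts how many columns were examined for deletion.
    static int tombstoneChecks = 0;

    static List<Column> removeDeleted(Collection<Column> columns) {
        List<Column> live = new ArrayList<>();
        for (Column c : columns) {
            tombstoneChecks++;
            if (!c.deleted) {
                live.add(c);
            }
        }
        return live;
    }

    // Expensive ordering: purge the whole container, then pick the requested names.
    static List<Column> purgeThenFilter(Map<String, Column> all, Set<String> wanted) {
        List<Column> out = new ArrayList<>();
        for (Column c : removeDeleted(all.values())) {
            if (wanted.contains(c.name)) {
                out.add(c);
            }
        }
        return out;
    }

    // Cheaper ordering: pick the requested names first, then purge only those.
    static List<Column> filterThenPurge(Map<String, Column> all, Set<String> wanted) {
        List<Column> picked = new ArrayList<>();
        for (String name : wanted) {
            Column c = all.get(name);
            if (c != null) {
                picked.add(c);
            }
        }
        return removeDeleted(picked);
    }
}
```

Both orderings return the same live columns for a by-name request, but the second one only checks the columns that were asked for, which is the point of the change for large supercolumns.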



[jira] Updated: (CASSANDRA-510) reading from large supercolumns is excessively slow

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-510:
-------------------------------------

    Attachment:     (was: 510.patch)



[jira] Issue Comment Edited: (CASSANDRA-510) reading from large supercolumns is excessively slow

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768942#action_12768942 ] 

Jonathan Ellis edited comment on CASSANDRA-510 at 10/23/09 3:05 PM:
--------------------------------------------------------------------

02
    convert removeDeleted on SC to remove-oriented instead of clone-then-add-back

01
    don't removeDeleted on the whole CF before filtering what the request was for; it's expensive.
    also, fixes subcolumn queries being counted twice in readStats
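The difference between the two strategies named in 02 can be sketched as follows. This is an illustrative toy, assuming a supercolumn is modeled as a map from subcolumn name to a "deleted" flag; the class and method names are hypothetical and do not come from the patch.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; subcolumn name -> true if that subcolumn is deleted.
final class RemoveDeletedSketch {
    // Clone-then-add-back: build a new container and re-insert every live
    // subcolumn, which costs an insert per survivor even if nothing was deleted.
    static Map<String, Boolean> cloneThenAddBack(Map<String, Boolean> subColumns) {
        Map<String, Boolean> result = new TreeMap<>();
        for (Map.Entry<String, Boolean> e : subColumns.entrySet()) {
            if (!e.getValue()) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    // Remove-oriented: walk the existing container once and drop only the
    // deleted subcolumns. This mutates its argument, so it is only safe on
    // an object the caller owns (for example, a copy).
    static void removeInPlace(Map<String, Boolean> subColumns) {
        subColumns.values().removeIf(deleted -> deleted);
    }
}
```

The in-place variant does less work when most subcolumns are live, at the price of mutating the object it is given.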


      was (Author: jbellis):
    don't removeDeleted on the whole CF before filtering what the request was for; it's expensive.  also, fixes subcolumn queries being counted twice in readStats
  


[jira] Commented: (CASSANDRA-510) reading from large supercolumns is excessively slow

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773559#action_12773559 ] 

Jonathan Ellis commented on CASSANDRA-510:
------------------------------------------

... so it turns out that the test added in 01 fails w/o 02, even when just the test case is run against trunk.  Applying just the part from 02 that clones the SC returned to callers fixes it.

So, applying 01 and 02 after all since we've apparently empirically demonstrated that "returning mutable objects from the memtable that caller is supposed to remember not to mutate" doesn't work in practice. :)
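A minimal sketch of the hazard and the fix, assuming a memtable can be modeled as a map of supercolumn name to a sorted map of subcolumns. MemtableSketch and its methods are hypothetical names for illustration, not the actual Memtable API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch; not Cassandra's Memtable.
final class MemtableSketch {
    private final Map<String, TreeMap<String, String>> superColumns = new HashMap<>();

    void put(String superColumn, String subColumn, String value) {
        superColumns.computeIfAbsent(superColumn, k -> new TreeMap<>()).put(subColumn, value);
    }

    // Risky: hands out the internal object, so any caller that mutates the
    // result (for example, to strip deleted subcolumns) changes stored data.
    SortedMap<String, String> getShared(String superColumn) {
        return superColumns.get(superColumn);
    }

    // Safer: hands out a copy, so caller-side mutation cannot reach the store.
    SortedMap<String, String> getClone(String superColumn) {
        TreeMap<String, String> stored = superColumns.get(superColumn);
        if (stored == null) {
            return null;
        }
        return new TreeMap<>(stored);
    }
}
```

Removing a subcolumn from the result of getClone leaves the stored supercolumn intact, whereas doing the same to the result of getShared silently drops data for every later reader.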



[jira] Commented: (CASSANDRA-510) reading from large supercolumns is excessively slow

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773952#action_12773952 ] 

Hudson commented on CASSANDRA-510:
----------------------------------

Integrated in Cassandra #249 (See [http://hudson.zones.apache.org/hudson/job/Cassandra/249/])
    return clones of supercolumns from memtable so caller can't accidentally mutate them, fixing the failing test.

convert removeDeleted on SC to remove-oriented instead of clone-then-add-back to make this hurt performance less.

patch by jbellis; reviewed by gdusbabek for 
don't removeDeleted on the whole CF before filtering what the request was for; it's expensive.  also, fixes subcolumn queries being counted twice in readStats
patch by jbellis; reviewed by gdusbabek for 
add failing test for removing a single subcolumn.
patch by jbellis; reviewed by gdusbabek for 




[jira] Commented: (CASSANDRA-510) reading from large supercolumns is excessively slow

Posted by "Gary Dusbabek (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773257#action_12773257 ] 

Gary Dusbabek commented on CASSANDRA-510:
-----------------------------------------

Now I'm confused.  I added:

assertNotNull(store.getColumnFamily(new NamesQueryFilter("key1", new QueryPath("Super1", "SC1".getBytes()), Util.getBytes(2)), Integer.MAX_VALUE));

to verify that the other subcolumn stuck around, and the test failed.  I didn't expect that.

So then I commented out the rm.delete and rm.apply calls to see if the delete was working.  The assertNull in validateRemoveSubColumn still passed.  I didn't expect that either.

So it appears to me that the addMutation calls didn't take.  What am I missing?



[jira] Commented: (CASSANDRA-510) reading from large supercolumns is excessively slow

Posted by "Gary Dusbabek (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773232#action_12773232 ] 

Gary Dusbabek commented on CASSANDRA-510:
-----------------------------------------

Just a few noob questions...

Would it be useful to include a test that verifies that an unintended sub column is *not* removed?
Why does cloneMe() call markForDeleteAt?





[jira] Updated: (CASSANDRA-510) reading from large supercolumns is excessively slow

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-510:
-------------------------------------

    Attachment: 0002-convert-removeDeleted-on-SC-to-remove-oriented-instead.txt
                0001-CASSANDRA-510-don-t-removeDeleted-on-the-whole-CF-befo.txt
