Posted to commits@cassandra.apache.org by "Stu Hood (JIRA)" <ji...@apache.org> on 2011/03/07 19:14:59 UTC

[jira] Created: (CASSANDRA-2280) Request specific column families using StreamIn

Request specific column families using StreamIn
-----------------------------------------------

                 Key: CASSANDRA-2280
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Stu Hood
             Fix For: 0.8


StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017944#comment-13017944 ] 

Stu Hood commented on CASSANDRA-2280:
-------------------------------------

> Speaking of which, don't we need some code in StreamOutSession.create to force the CF list to all?
No, because the default is the empty list, which sends all. We either have to generate the list of CFs on the source or on the destination, so moving it from one side to the other doesn't save us any code, and requires special casing for backwards compatibility.
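The "empty list sends all" convention described above can be illustrated with a minimal sketch. The class and method names here are hypothetical and are not taken from the patch; the sketch only shows how an empty request list can stand for every column family in the keyspace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names are Cassandra's actual API.
final class CfSelectionSketch {
    /**
     * Resolves which column families to stream for a request.
     * An empty request list is read as "all column families in the keyspace",
     * so a peer that never sends a list keeps the old behaviour.
     */
    static List<String> resolve(List<String> requested, List<String> allInKeyspace) {
        if (requested.isEmpty()) {
            return new ArrayList<>(allInKeyspace);
        }
        List<String> selected = new ArrayList<>();
        for (String cf : allInKeyspace) {
            // Keep only the column families that were explicitly asked for.
            if (requested.contains(cf)) {
                selected.add(cf);
            }
        }
        return selected;
    }
}
```

With this shape, only the side that resolves the list needs to know about the empty-list default, which is the trade-off the comment above is weighing.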

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105462#comment-13105462 ] 

Jonathan Ellis commented on CASSANDRA-2280:
-------------------------------------------

As mentioned above, we reverted it from 0.8.1 because of CASSANDRA-2818.  We've committed to maintaining drop-in-ability between minor releases, which means we can't release protocol changes unless protocol backwards- and forwards-compatibility actually works.  In this case, 2818 was a forwards-compatibility bug in 0.8.0 and 0.8.1 which means we'd have to say "to upgrade to 0.8.x where x >= 6, you must first upgrade to 0.8.y where 2 <= y < 6."  Which is super confusing to people and honestly, my experience is that 99% of our users don't read NEWS before upgrading anyway so it's totally going to bite a lot of them.

0.7 was not affected by 2818 but it's past the point where we should be making protocol changes.  Any change is risky and the bar is pretty high to make changes to "oldstable," which is what 0.7 is now.
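The upgrade constraint quoted above ("to upgrade to 0.8.x where x >= 6, you must first upgrade to 0.8.y where 2 <= y < 6") can be written as a small rule. This is only an illustration of that sentence: the class, method, and constant names are hypothetical, and the numbers 2 and 6 are copied from the comment rather than from any release plan.

```java
// Hypothetical sketch of the upgrade rule quoted in the comment; not part of Cassandra.
final class UpgradePathSketch {
    // 0.8.0 and 0.8.1 carry the CASSANDRA-2818 forwards-compatibility bug.
    static final int FIRST_MINOR_WITHOUT_BUG = 2;
    // The "x >= 6" in the comment: the first minor that would ship the protocol change.
    static final int FIRST_MINOR_WITH_CHANGE = 6;

    /** Returns true if a node on 0.8.from could move straight to 0.8.to under that rule. */
    static boolean canUpgradeDirectly(int from, int to) {
        if (to < from) {
            throw new IllegalArgumentException("downgrades are not modelled here");
        }
        boolean targetHasChange = to >= FIRST_MINOR_WITH_CHANGE;
        boolean sourceHasBug = from < FIRST_MINOR_WITHOUT_BUG;
        // Only the combination of a buggy source and a changed target needs an intermediate step.
        return !(targetHasChange && sourceHasBug);
    }
}
```

Under this rule a jump from 0.8.1 to 0.8.6 would need an intermediate stop, while 0.8.3 to 0.8.6 would not, which is the two-step path the comment calls confusing.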

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 1.0.0
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


        

[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2280:
----------------------------------------

    Fix Version/s:     (was: 0.8.1)
                   0.8.2

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.2
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


        

[jira] Updated: (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-2280:
--------------------------------

    Attachment: 0001-Allow-specific-column-families-to-be-requested-for-str.txt

Attaching a patch that modifies a few streaming messages to (optionally) specify the CFs to repair. Only AES actually uses this feature.

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2280:
--------------------------------------

    Reviewer: slebresne  (was: amorton)
    Assignee: Jonathan Ellis  (was: Stu Hood)

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.1
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052050#comment-13052050 ] 

Jonathan Ellis commented on CASSANDRA-2280:
-------------------------------------------

(Never mind: the way it should work is that the newer node sends the message to the old node in the old version's format.)
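A minimal sketch of that idea, assuming a made-up wire layout and made-up version constants (the real StreamRequestMessage format is not shown in this thread): the sender checks the peer's version and leaves out the fields an older peer would not understand.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

// Hypothetical sketch: the layout and version numbers are assumptions for illustration.
final class VersionedRequestSketch {
    static final int VERSION_OLD = 1;       // assumed: peers that do not know about a CF list
    static final int VERSION_WITH_CFS = 2;  // assumed: peers that expect a CF list

    /** Serializes a request in the format the receiving peer's version understands. */
    static byte[] serialize(String keyspace, List<String> cfNames, int peerVersion) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(keyspace);
            if (peerVersion >= VERSION_WITH_CFS) {
                // Newer peers get the CF list; older peers get the old layout unchanged.
                out.writeInt(cfNames.size());
                for (String cf : cfNames) {
                    out.writeUTF(cf);
                }
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            // Writing to an in-memory buffer should not fail; rethrow unchecked if it does.
            throw new UncheckedIOException(e);
        }
    }
}
```

An old peer then falls back to its existing behaviour of streaming every column family, so the requester still has to tolerate receiving more than it asked for.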

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.1
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


        

[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-2280:
--------------------------------

    Attachment:     (was: 0001-Allow-specific-column-families-to-be-requested-for-str.txt)

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Huy Le (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087077#comment-13087077 ] 

Huy Le commented on CASSANDRA-2280:
-----------------------------------

Thanks Jonathan.

Is the tentative release date for 1.0 still Oct 8th? We are currently running 0.6.11 and would like to upgrade to 0.8.x, but this issue is very much a show stopper.  We run repair every weekend on our 0.6.11 production environment, but the kind of load that I saw while testing repair on 0.8.4 in our testing environment is not suitable for a production environment.



> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 1.0
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


        

[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2280:
--------------------------------------

    Fix Version/s:     (was: 0.8.0)
                   0.8.1

This is a big enough change that I don't want to sneak it into 0.8 RC. Tagging 0.8.1, meaning, commit to trunk for now and backport after 0.8.0.

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8.1
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2280:
--------------------------------------

    Attachment: 2280-v3.txt

bq. StreamInSession (the session created by the requesting node) doesn't record any of the information about the request

That's what I was missing. Thanks for clarifying.

v3 rebases and removes special casing of empty CF list. As expected, this improves encapsulation of special cases (only streaming code needs to care, instead of leaking to anything that might touch a list of CF names).  Also converted to passing CFS objects around instead of strings, resulting in a minor improvement on the amount of manual looping that gets done.

Note that passing table is redundant (CFS objects know their table) and I think the entire "single file" stream mode is unused too, but I've left these alone for now.

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8.0
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017987#comment-13017987 ] 

Jonathan Ellis commented on CASSANDRA-2280:
-------------------------------------------

bq. It will reply with "all of the data"

Right, which is not what is expected.  I'm saying we should make what is expected match what we'll receive, or we're asking for regressions later (even if it happens to work now by accident).

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087090#comment-13087090 ] 

Jonathan Ellis commented on CASSANDRA-2280:
-------------------------------------------

(Yes, still shooting for Oct 8 on 1.0.)

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 1.0
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


        

[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-2280:
--------------------------------

    Attachment: 0001-Allow-specific-column-families-to-be-requested-for-str.txt

Rebased for trunk.

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Reopened] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reopened CASSANDRA-2280:
---------------------------------------


> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2280:
--------------------------------------

    Attachment:     (was: 2280-versioning.txt)

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.1
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


        

[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2280:
--------------------------------------

    Comment: was deleted

(was: you're right.  added a default for when RS=SimpleStrategy in r1092435)

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Huy Le (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087055#comment-13087055 ] 

Huy Le commented on CASSANDRA-2280:
-----------------------------------

Is there any plan to merge this fix into 0.8.x soon?  I checked the latest stable build (https://builds.apache.org/job/Cassandra-0.8/lastStableBuild/artifact/cassandra/build/apache-cassandra-2011-08-17_23-00-56-bin.tar.gz) but I don't see the fix listed in CHANGES.txt.  I also checked the latest build (https://builds.apache.org/job/Cassandra-0.8/284/artifact/cassandra/build/apache-cassandra-2011-08-17_23-00-56-bin.tar.gz) and I don't see it listed in CHANGES.txt there either.

I tried https://builds.apache.org/job/Cassandra-0.8/214/, but got HTTP 404.

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 1.0
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


        

[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064119#comment-13064119 ] 

Hudson commented on CASSANDRA-2280:
-----------------------------------

Integrated in Cassandra-0.8 #214 (See [https://builds.apache.org/job/Cassandra-0.8/214/])
    add example of commitlog_sync_batch_window_in_ms to .yaml
patch by Wojciech Meler for CASSANDRA-2280

jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1145728
Files : 
* /cassandra/branches/cassandra-0.8/conf/cassandra.yaml


> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 1.0
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


        

[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2280:
--------------------------------------

    Attachment: 2280-v4.txt

bq. We must bump the version for 0.8

Done in v4.

bq. In StreamHeader and StreamRequestMessage, Iterables.size() is used

Pretty sure we are passing in an Iterables.concat result, which is not a Collection.  (If not, no reason not to leave that as an option.  Yes, Iterables.size does call .size() on Collection objects.)

bq. Why are we sending the cfs in StreamHeader at all?

Removed in v4.
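On the Iterables.size point above: the behaviour described (use .size() when the argument is a Collection, otherwise count by iterating) can be sketched with the standard library alone. This is an illustrative stand-in with a hypothetical class name, not Guava's source.

```java
import java.util.Collection;

// Hypothetical stand-in that mirrors the behaviour described for Guava's Iterables.size.
final class IterableSizeSketch {
    /** Returns the number of elements, taking the cheap path when one is available. */
    static int size(Iterable<?> iterable) {
        if (iterable instanceof Collection) {
            // Collections already know their size.
            return ((Collection<?>) iterable).size();
        }
        // Lazy views, such as a concatenation of iterables, have to be walked.
        int count = 0;
        for (Object ignored : iterable) {
            count++;
        }
        return count;
    }
}
```

Accepting a plain Iterable keeps both kinds of caller working: a list is sized directly, and a lazily concatenated view is counted by traversal.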

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.1
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018012#comment-13018012 ] 

Stu Hood commented on CASSANDRA-2280:
-------------------------------------

As discussed in IRC: StreamInSession (the session created by the requesting node) doesn't record any of the information about the request, so there isn't a place to add the list of expected CFs at the moment. We could fill out StreamInSession some more so that we could add this information, but I feel that that is speculative.

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045704#comment-13045704 ] 

Hudson commented on CASSANDRA-2280:
-----------------------------------

Integrated in Cassandra-0.8 #157 (See [https://builds.apache.org/job/Cassandra-0.8/157/])
    restrict repair streaming to specific columnfamilies
patch by stuhood and jbellis; reviewed by slebresne for CASSANDRA-2280

jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1133167
Files : 
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/streaming/StreamIn.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AntiEntropyService.java
* /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/streaming/SerializationsTest.java
* /cassandra/branches/cassandra-0.8/CHANGES.txt
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/db/Table.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/streaming/StreamRequestVerbHandler.java
* /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/streaming/StreamingTransferTest.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/streaming/StreamOut.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/streaming/StreamRequestMessage.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageService.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java


> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.1
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017634#comment-13017634 ] 

Jonathan Ellis commented on CASSANDRA-2280:
-------------------------------------------

bq. otherwise we'd have to perform a lookup during deserialization to build the list for localhost

Right. That's a cleaner approach, since the semantics of a columnFamilies list don't have to be special-cased outside the deserializer, which is the right place to deal with this.

Speaking of which, don't we need some code in StreamOutSession.create to force the CF list to all if the target is an old-version node?  Otherwise what we send will not be what the target expects.
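
The version check suggested here could look roughly like the sketch below. It is only an illustration under stated assumptions: `PeerAwareColumnFamilies`, `toSend`, and `FIRST_VERSION_WITH_CF_LIST` are hypothetical names, the version number is a placeholder, and CFs are modeled as plain strings rather than the types the patch uses.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper; none of these names exist in Cassandra.
final class PeerAwareColumnFamilies
{
    // Placeholder: the first messaging version whose peers understand a CF list.
    static final int FIRST_VERSION_WITH_CF_LIST = 2;

    private PeerAwareColumnFamilies() {}

    /**
     * Returns the CFs the local node should plan to send to a peer.
     * An old-version peer cannot be told about a subset, so it expects everything.
     */
    static List<String> toSend(int peerVersion, Collection<String> requested, Collection<String> allInKeyspace)
    {
        if (peerVersion < FIRST_VERSION_WITH_CF_LIST)
            return new ArrayList<String>(allInKeyspace);
        return new ArrayList<String>(requested);
    }
}
```

With a peer at version 1 the helper returns every CF in the keyspace; at version 2 or later it returns only the requested subset.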

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-2280:
--------------------------------

    Attachment:     (was: 0002-Only-flush-matching-CFS.txt)

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-2280:
--------------------------------

    Attachment: 0001-Allow-specific-column-families-to-be-requested-for-str.txt

The reason for special casing the empty list is that it is backwards compatible with older MessagingService versions: otherwise we'd have to perform a lookup during deserialization to build the list for localhost. I think the code delta is about equal either way.

Renamed cfs -> columnFamilies and squashed to one patch.
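
The empty-list convention described in this comment can be sketched as follows. This is a minimal illustration, not the code from the attached patch: `ColumnFamilySelection` and `resolve` are hypothetical names, and CFs are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not part of Cassandra: shows the "empty list means all" convention.
final class ColumnFamilySelection
{
    private ColumnFamilySelection() {}

    /**
     * @param requested CF names carried by the request; empty when the sender is an
     *                  older node that cannot name CFs, or when it wants all of them
     * @param available every CF name defined in the keyspace on the serving node
     * @return the CFs to stream, in the order they appear in available
     */
    static List<String> resolve(Collection<String> requested, Collection<String> available)
    {
        if (requested.isEmpty())
            return new ArrayList<String>(available); // previous behaviour: stream everything

        List<String> selected = new ArrayList<String>();
        for (String cf : available)
            if (requested.contains(cf))
                selected.add(cf);
        return selected;
    }
}
```

Because a message from an older node simply carries no list, it deserializes to the empty case and is served exactly as before, with no schema lookup in the deserializer.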

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010287#comment-13010287 ] 

Jonathan Ellis commented on CASSANDRA-2280:
-------------------------------------------

I'd prefer to avoid special-casing empty list and just pass all CF names when that is what we want.  A constructor or factory method that does not take a columnFamilies parameter could default to that, too.  (This is easy w/ CFS.all, but that would require using CFS objects instead of Strings. Which may be better anyway but it is hard to tell w/o trying it.)

Nit: we usually use "cfs" to abbreviate ColumnFamilyStore, suggest expanding to columnFamilies.
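
A rough sketch of the alternative proposed in this comment, where the list is always explicit and a factory method without a columnFamilies parameter fills in every CF. All names here are hypothetical (`RangeRequestSketch`, both factory methods), and the `schema` map merely stands in for whatever schema lookup the real code would use.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical request type; the real StreamRequestMessage carries more fields.
final class RangeRequestSketch
{
    final String keyspace;
    final List<String> columnFamilies; // always explicit; an empty list means "nothing"

    private RangeRequestSketch(String keyspace, Collection<String> columnFamilies)
    {
        this.keyspace = keyspace;
        this.columnFamilies = Collections.unmodifiableList(new ArrayList<String>(columnFamilies));
    }

    /** Request only the named CFs. */
    static RangeRequestSketch forColumnFamilies(String keyspace, Collection<String> columnFamilies)
    {
        return new RangeRequestSketch(keyspace, columnFamilies);
    }

    /**
     * No columnFamilies parameter: default to every CF of the keyspace.
     * The schema map (keyspace name to CF names) stands in for a real schema lookup.
     */
    static RangeRequestSketch forAllColumnFamilies(String keyspace, Map<String, List<String>> schema)
    {
        List<String> all = schema.get(keyspace);
        if (all == null)
            throw new IllegalArgumentException("Unknown keyspace: " + keyspace);
        return new RangeRequestSketch(keyspace, all);
    }
}
```

The trade-off against the empty-list convention is that the full list has to be built somewhere, which is the lookup cost debated elsewhere in this thread.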

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0002-Only-flush-matching-CFS.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "anand somani (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105493#comment-13105493 ] 

anand somani commented on CASSANDRA-2280:
-----------------------------------------

1.0 might turn out to be late. Is it possible to have this patch on a branch, so that folks who need it can apply it to versions > 0.8.2? Also, what kind of tests (besides repair) should be done if I apply the patch to a custom build?

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 1.0.0
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2280:
--------------------------------------

    Attachment: 2280-v5.txt

bq. what about having the list of CFs only in StreamRequestMessage and add the list of cfs to use as an argument to StreamOut.transferRanges()

Good idea. Done in v5. Also some refactoring so the different transferRanges methods share the same code. (The one for SRVH wasn't actually ever calling session.close which apparently we don't rely on yet, but it was a bug waiting to happen.)

bq. In StreamRequestMessage, we should write the operation type even if version is VERSION_080

Ah... Now I understand what you meant last time. Fixed.

bq. Nitpick: and couldn't we use the cf ids instead of the names ?

Done.

bq. In StreamRequestMessage, the field is a Collection but we're still using Iterables.size() inside

Fixed.

bq. I suppose the bump of MessagingService from 2 to 81 was on purpose ? (I don't mind, just pointing out to make sure)

My thought was that way we'll have VERSION_081=81 next, but I don't care a great deal either.
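
The refactoring mentioned in the first reply of this comment (one shared code path for the transferRanges variants, so that none of them can skip session.close) might be sketched like this. It is a deliberately synchronous simplification with hypothetical types; the real streaming code is asynchronous and its session lifecycle is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch; the names only loosely mirror the classes discussed in this thread.
final class TransferSketch
{
    /** Stand-in for a streaming session that must be closed when the transfer ends. */
    static final class Session
    {
        final List<String> sent = new ArrayList<String>();
        boolean closed = false;

        void send(String columnFamily) { sent.add(columnFamily); }
        void close() { closed = true; }
    }

    private TransferSketch() {}

    /** Entry point for locally initiated transfers. */
    static void transferRanges(Session session, Collection<String> columnFamilies)
    {
        transfer(session, columnFamilies);
    }

    /** Entry point for transfers triggered by a remote stream request. */
    static void transferRangesForRequest(Session session, Collection<String> columnFamilies)
    {
        transfer(session, columnFamilies);
    }

    // Single shared code path, so no entry point can forget the close.
    private static void transfer(Session session, Collection<String> columnFamilies)
    {
        try
        {
            for (String cf : columnFamilies)
                session.send(cf);
        }
        finally
        {
            session.close();
        }
    }
}
```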

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.1
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Aaron Morton (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009005#comment-13009005 ] 

Aaron Morton commented on CASSANDRA-2280:
-----------------------------------------

Cannot see any problems, good to go.

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0002-Only-flush-matching-CFS.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045323#comment-13045323 ] 

Sylvain Lebresne commented on CASSANDRA-2280:
---------------------------------------------

In the StreamRequestMessage deserializer, in the version > VERSION_080 part, the
type is deserialized a second time; that second read should be removed.

It needs rebasing (at least for 0.8 branch) so I didn't run the tests with it, but looks good otherwise.

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.1
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087058#comment-13087058 ] 

Jonathan Ellis commented on CASSANDRA-2280:
-------------------------------------------

bq. Is there any plan to merge this fix into 0.8.x soon?

No.  This will probably stay 1.0-only.

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 1.0
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2280:
--------------------------------------

    Fix Version/s:     (was: 0.8.2)
                   1.0

Reverted from 0.8 b/c of CASSANDRA-2818.

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 1.0
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Updated] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2280:
--------------------------------------

    Attachment: 2280-versioning.txt

patch to remove "thou shalt not stream across version changes" special case in IncomingTcpConnection

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.1
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt, 2280-versioning.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037826#comment-13037826 ] 

Sylvain Lebresne commented on CASSANDRA-2280:
---------------------------------------------

* If we're going to put that in 0.8.1 (which we should), we cannot rely on MessagingService.VERSION_07. We must bump the version for 0.8.0. It turns out CASSANDRA-2433 already has this problem, so I suggest we introduce a MS.VERSION_080 and stick to that (as a side note, when that's done, we should be careful with StreamRequestMessage as it will have a 0.7 part and a 0.8.0 part, i.e., we shouldn't blindly s/VERSION_07/VERSION_080/ in there).
* In StreamHeader and StreamRequestMessage, Iterables.size() is used. Is there a reason for that? Though Google Collections is probably smart enough not to do a full iteration to compute the size when possible, in theory we can't really be sure, so I don't see why we shouldn't use .size() (and use a Collection<> instead of an Iterable in StreamHeader, although see the next point).
* Why are we sending the cfs in StreamHeader at all? It's never used and I don't see why it should be (StreamInSession will know what it receives with each file; there is no reason why it should know upfront what request initiated the streaming).

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.1
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017960#comment-13017960 ] 

Jonathan Ellis commented on CASSANDRA-2280:
-------------------------------------------

bq. the default is the empty list, which sends all

But AES does NOT send the empty list, so an old-version target will reply with the wrong data.

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Assigned] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood reassigned CASSANDRA-2280:
-----------------------------------

    Assignee: Stu Hood

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0002-Only-flush-matching-CFS.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038600#comment-13038600 ] 

Sylvain Lebresne commented on CASSANDRA-2280:
---------------------------------------------

* In SSTableLoader, calling Table.open() isn't really neat: in the case of the 'external' bulk loader, it's a fat client, so that will imply creating directories, etc. for no good reason (I haven't tested, but I would be surprised if it actually threw an exception). We'd better give an empty list. Or even better (in my opinion), see my next point.
* I don't find it very logical for StreamOutSession to take a collection of cfs. The coupling seems unnecessary. The problem we're solving is to ask another node to transfer us some range for some CF. So what about having the list of CFs only in StreamRequestMessage and adding the list of cfs to use as an argument to StreamOut.transferRanges()? We don't need it anywhere else.
* In StreamRequestMessage, we should write the operation type even if the version is VERSION_080 (same for deserialization). Nitpick: couldn't we use the cf ids instead of the names?
* In StreamRequestMessage, the field is a Collection but we're still using Iterables.size() inside. Pretty sure that doesn't leave much option :) I mean, my remark was more about saying "why add something that may make people wonder for no reason" since that's not something that is widespread in the code. Anyway, just saying, I don't care.
* I suppose the bump of MessagingService from 2 to 81 was on purpose? (I don't mind, just pointing it out to make sure)
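
The serialization points in this review (write the operation type for 0.8.0 peers as well, send the CF ids only to newer peers, and keep the reads symmetric with the writes) can be illustrated with the sketch below. It is an assumption-laden stand-in, not the real StreamRequestMessage: the class name, the field set, and the three version numbers are all placeholders, and only the ordering of the versions matters.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical, simplified stand-in for StreamRequestMessage and its serializer.
final class StreamRequestSketch
{
    // Placeholder version numbers; only their ordering matters in this sketch.
    static final int VERSION_07 = 1;
    static final int VERSION_080 = 2;
    static final int VERSION_NEXT = 3;

    final String keyspace;
    final int operationType;             // stand-in for the operation type
    final List<Integer> columnFamilyIds; // ids rather than names

    StreamRequestSketch(String keyspace, int operationType, Collection<Integer> columnFamilyIds)
    {
        this.keyspace = keyspace;
        this.operationType = operationType;
        this.columnFamilyIds = new ArrayList<Integer>(columnFamilyIds);
    }

    static byte[] serialize(StreamRequestSketch msg, int version)
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try
        {
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(msg.keyspace);
            if (version >= VERSION_080)
                out.writeInt(msg.operationType);          // written for 0.8.0 peers too
            if (version > VERSION_080)
            {
                out.writeInt(msg.columnFamilyIds.size()); // plain Collection.size()
                for (int id : msg.columnFamilyIds)
                    out.writeInt(id);
            }
            out.flush();
        }
        catch (IOException e)
        {
            throw new IllegalStateException(e);           // in-memory streams should not fail
        }
        return bytes.toByteArray();
    }

    static StreamRequestSketch deserialize(byte[] data, int version)
    {
        try
        {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            String keyspace = in.readUTF();
            int type = 0;                                 // default when the peer sent no type
            if (version >= VERSION_080)
                type = in.readInt();                      // read exactly once
            List<Integer> ids = new ArrayList<Integer>(); // stays empty for older peers
            if (version > VERSION_080)
            {
                int count = in.readInt();
                for (int i = 0; i < count; i++)
                    ids.add(in.readInt());
            }
            return new StreamRequestSketch(keyspace, type, ids);
        }
        catch (IOException e)
        {
            throw new IllegalStateException(e);
        }
    }
}
```

Round-tripping a message at the newest placeholder version preserves the type and the ids; at VERSION_080 the type survives but the id list comes back empty; at VERSION_07 neither is on the wire.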


> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.1
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017984#comment-13017984 ] 

Stu Hood commented on CASSANDRA-2280:
-------------------------------------

> But AES does NOT send the empty list, so an old-version target will reply with the wrong data.
It will reply with "all of the data". The only alternative would be for it to not send anything at all since we have no way for an old-version and new-version to communicate which CFs they want to send or receive. I think that "do what we've always done" is a reasonable backwards compatibility strategy.

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] Updated: (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-2280:
--------------------------------

    Attachment: 0002-Only-flush-matching-CFS.txt

Adding 0002 to only flush matching CFS.
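
The idea behind flushing only the matching CFS can be sketched as below. This is an illustration rather than the contents of the 0002 patch: `SelectiveFlushSketch` and its nested `Store` are hypothetical stand-ins for ColumnFamilyStore handling, and treating an empty list as "all CFs" is an assumption carried over from the convention discussed in this thread.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch of flushing only the CFs that are about to be streamed.
final class SelectiveFlushSketch
{
    /** Stand-in for a ColumnFamilyStore; it only counts how often it was flushed. */
    static final class Store
    {
        final String name;
        int flushCount = 0;

        Store(String name) { this.name = name; }
        void forceFlush() { flushCount++; }
    }

    private SelectiveFlushSketch() {}

    /**
     * Flushes the stores named in columnFamilies, or every store when the list is
     * empty (assumed "empty means all" convention). Returns the flushed names.
     */
    static List<String> flushMatching(Collection<Store> stores, Collection<String> columnFamilies)
    {
        List<String> flushed = new ArrayList<String>();
        for (Store store : stores)
        {
            if (columnFamilies.isEmpty() || columnFamilies.contains(store.name))
            {
                store.forceFlush();
                flushed.add(store.name);
            }
        }
        return flushed;
    }
}
```

For a keyspace with many CFs, a request naming a single CF then triggers one flush instead of one per CF.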

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0002-Only-flush-matching-CFS.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087089#comment-13087089 ] 

Jonathan Ellis commented on CASSANDRA-2280:
-------------------------------------------

This is present in 0.6 too, so not sure how to explain that. :)

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 1.0
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] [Commented] (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "anand somani (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105408#comment-13105408 ] 

anand somani commented on CASSANDRA-2280:
-----------------------------------------

I noticed there was some attempt to port this to 0.8, but then a comment by Jonathan says that this will not be ported to 0.8.x.
I suppose that means it will not be ported to 0.7.x either?
Is this because it is not possible (too many unresolvable conflicts), or is there another reason?

> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Jonathan Ellis
>             Fix For: 1.0.0
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 0001-Allow-specific-column-families-to-be-requested-for-str.txt, 2280-v3.txt, 2280-v4.txt, 2280-v5.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.


[jira] Commented: (CASSANDRA-2280) Request specific column families using StreamIn

Posted by "Aaron Morton (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006164#comment-13006164 ] 

Aaron Morton commented on CASSANDRA-2280:
-----------------------------------------

StreamOut.transferRangesForRequest() flushes all SSTables for the keyspace even if we know the CFs. Can it flush just the CFs it is sending?

CompactionManager.doValidation() also forces the CF to flush, though, so the flush may not be necessary when streaming for repair. It may still be necessary for StreamOut.transferRanges(), as it is used during move and decommission.

Otherwise no problems.
 
Jonathan has moved CASSANDRA-2088 to 0.8 because the counters make it difficult to share compaction code with 0.7. I'll now do that ticket on top of this one.  


> Request specific column families using StreamIn
> -----------------------------------------------
>
>                 Key: CASSANDRA-2280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2280
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>             Fix For: 0.8
>
>         Attachments: 0001-Allow-specific-column-families-to-be-requested-for-str.txt
>
>
> StreamIn.requestRanges only specifies a keyspace, meaning that requesting a range will request it for all column families: if you have a large number of CFs, this can cause quite a headache.
