Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2010/07/27 23:49:24 UTC

[jira] Created: (CASSANDRA-1329) make multiget take a set of keys instead of a list

make multiget take a set of keys instead of a list
--------------------------------------------------

                 Key: CASSANDRA-1329
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
             Project: Cassandra
          Issue Type: Task
          Components: Core
            Reporter: Jonathan Ellis
            Priority: Minor
             Fix For: 0.7.0


This more correctly sets the expectation that the order of the keys doesn't matter and that duplicates don't make sense.
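
A minimal Python sketch of the semantics the change is meant to signal, using made-up string keys (no Cassandra or Thrift call is involved): two requests that differ only in key order or in repeated keys describe the same set.

```python
# Hypothetical row keys, for illustration only.
request_a = ["user1", "user2", "user3"]
request_b = ["user3", "user1", "user2", "user1"]  # reordered, one duplicate

# As lists the two requests differ; as sets they are the same request.
same_as_lists = request_a == request_b
same_as_sets = set(request_a) == set(request_b)

print(same_as_lists, same_as_sets)  # False True
```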

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904432#action_12904432 ] 

Stu Hood commented on CASSANDRA-1329:
-------------------------------------

Is this really worth it? Set maintenance is more expensive in pretty much any language.

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Hermes updated CASSANDRA-1329:
----------------------------------

    Attachment:     (was: 1329-stresspy-multiget.txt)

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Reopened: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reopened CASSANDRA-1329:
---------------------------------------


(reopening until this question is resolved)

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Hermes updated CASSANDRA-1329:
----------------------------------

    Attachment: 1329-stresspy-multiget.txt

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Issue Comment Edited: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904432#action_12904432 ] 

Stu Hood edited comment on CASSANDRA-1329 at 8/30/10 8:07 PM:
--------------------------------------------------------------

Is this really worth it? Set maintenance is more expensive in pretty much any language.

EDIT: Make that "in all languages".
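
One way to check the cost claim on a given machine is a quick micro-benchmark. The sketch below is an illustration under stated assumptions, not a measurement from this ticket: it times building a list versus a set from the same made-up keys with Python's timeit module, and the key count and repeat count are arbitrary.

```python
import timeit

# Arbitrary, made-up workload: 1000 distinct string keys.
keys = ["key%d" % i for i in range(1000)]

# Time 200 constructions of each container from the same keys.
list_seconds = timeit.timeit(lambda: list(keys), number=200)
set_seconds = timeit.timeit(lambda: set(keys), number=200)

print("list: %.6fs  set: %.6fs" % (list_seconds, set_seconds))
```

The absolute numbers depend on the interpreter and hardware, so only the ratio on a single machine is meaningful.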

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Bob T. Terminal (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908014#action_12908014 ] 

Bob T. Terminal commented on CASSANDRA-1329:
--------------------------------------------

See https://issues.apache.org/jira/browse/THRIFT-342. Can we revert to lists until Thrift resolves this issue?

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1329:
--------------------------------------

    Assignee: Jon Hermes

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Hermes updated CASSANDRA-1329:
----------------------------------

    Attachment: 1329-stresspy-multiget.txt

stress.py now has multiget.
Benchmark data soon to come.

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329-stresspy-multiget.txt, 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904287#action_12904287 ] 

Jon Hermes commented on CASSANDRA-1329:
---------------------------------------

Only from Python code.
It takes the keys in as a list "keys" (despite cassandra.thrift asking for a set), then passes "set(keys)" up, which removes duplicates without warning or complaining.
This isn't specific to this bug or to Python; if it is seen as an error, it would be a Thrift defect.
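
The silent de-duplication described above can be reproduced in plain Python. This is a sketch with made-up keys and does not use the generated Thrift client:

```python
# Made-up keys with one duplicate, standing in for a caller's list.
keys = ["a", "b", "a", "c"]

# Converting to a set succeeds without any warning...
unique_keys = set(keys)

# ...so the only visible trace of the dropped duplicate is the size change.
dropped = len(keys) - len(unique_keys)
print(dropped)  # 1
```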

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12911010#action_12911010 ] 

Hudson commented on CASSANDRA-1329:
-----------------------------------

Integrated in Cassandra #539 (See [https://hudson.apache.org/hudson/job/Cassandra/539/])
    change multiget back to list, for now.  patch by jbellis for CASSANDRA-1329


> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904422#action_12904422 ] 

Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------

can you rebase?

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1329:
--------------------------------------

    Fix Version/s: 0.7.0
                       (was: 0.7 beta 2)

reverted for beta2, moving to 0.7.0.

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jake Farrell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909386#action_12909386 ] 

Jake Farrell commented on CASSANDRA-1329:
-----------------------------------------

THRIFT-342: attached a patch that allows sets to be sent to and from PHP via Thrift, so Cassandra can use sets without breaking any generated client drivers.

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1329:
--------------------------------------

    Fix Version/s:     (was: 0.7.0)
                   0.8

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.8
>
>         Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Issue Comment Edited: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905936#action_12905936 ] 

Jon Hermes edited comment on CASSANDRA-1329 at 9/3/10 11:27 AM:
----------------------------------------------------------------

Process was:
 - stress.py insert 1M rows
 - loop stress.py multiget 100k rows until values stabilized

Note: The first runs have a cold cache (I left the default 200k keys in), and 100k reads is just enough to occasionally throw me into GC. Also, I'm only randomizing over the first 100k block out of the 1M written, so everything should be key-cached, and there are bound to be more duplicates in the set than I planned for.
{noformat}
==PRE-PATCH1==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.19286876202,7                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.85971493721,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.95732964993,5                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.65282338619,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.60082161903,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.15557718277,5                                           <-- GC

==POST-PATCH1==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.42788017273,7                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.03476555347,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,4.36921528816,6                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.69064975262,3
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.66355334282,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.75016436577,5
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.59240380764,4

==PRE-PATCH2==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.46314402103,7                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.97970569611,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.58520019531,5                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.65835041046,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.71839766502,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.54171346188,5                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.54564589024,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.70630379677,4

==POST-PATCH2==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,6.26670286655,8                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.88729266167,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.75670327663,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.99453821182,6                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.5942284441,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.67040606022,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.8302997303,5                                            <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.57685221672,3
{noformat}
Regardless of the vagaries, the numbers are still comparable, and it looks like there is no significant difference in time to process a set versus a list.
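
To compare the series numerically, the avg_latency column (fourth field) can be averaged over the runs that were not flagged as cold-cache or GC. The sketch below copies the PRE-PATCH1 and POST-PATCH1 rows from the output above; the helper name steady_latencies and the choice to drop flagged rows are assumptions made for illustration.

```python
# Rows copied from the PRE-PATCH1 and POST-PATCH1 output; "<--" marks
# runs annotated as cold key cache or GC.
PRE_PATCH1 = """\
50,5,10000,5.19286876202,7   <-- Cold Keycache
50,5,10000,2.85971493721,4
50,5,10000,3.95732964993,5   <-- GC
50,5,10000,2.65282338619,4
50,5,10000,2.60082161903,4
50,5,10000,3.15557718277,5   <-- GC
"""
POST_PATCH1 = """\
50,5,10000,5.42788017273,7   <-- Cold Keycache
50,5,10000,3.03476555347,4
50,5,10000,4.36921528816,6   <-- GC
50,5,10000,2.69064975262,3
50,5,10000,2.66355334282,4
50,5,10000,2.75016436577,5
50,5,10000,2.59240380764,4
"""


def steady_latencies(output):
    """Return avg_latency values from rows without a "<--" annotation."""
    values = []
    for line in output.splitlines():
        if not line.strip() or "<--" in line:
            continue
        fields = line.split(",")
        values.append(float(fields[3]))  # avg_latency is the fourth column
    return values


def mean(values):
    return sum(values) / len(values)


pre = mean(steady_latencies(PRE_PATCH1))
post = mean(steady_latencies(POST_PATCH1))
print("pre-patch: %.3f  post-patch: %.3f" % (pre, post))
```

With only a handful of unflagged runs per series, the two means are a rough sanity check rather than a statistical test.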

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904405#action_12904405 ] 

Jon Hermes commented on CASSANDRA-1329:
---------------------------------------

Yes, there's no (good) way to tell Python to complain about set(list) (which should always succeed).
This *will* affect Java/statically typed code that creates duplicates and tries to convert them to a set to match the API.
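
A caller that does want duplicates reported has to check for them explicitly before converting. A minimal sketch, where strict_key_set is a hypothetical helper name and not part of Cassandra, Thrift, or stress.py:

```python
from collections import Counter


def strict_key_set(keys):
    """Return set(keys), raising ValueError if any key occurs more than once.

    Hypothetical helper for illustration only.
    """
    counts = Counter(keys)
    duplicates = sorted(k for k, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError("duplicate keys: %r" % (duplicates,))
    return set(counts)


print(strict_key_set(["a", "b", "c"]) == {"a", "b", "c"})  # True
```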

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904941#action_12904941 ] 

Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------

100-1000 keys per multiget would be way more reasonable than 100k.
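
For a client holding a large key list, that suggests splitting it into smaller requests. A minimal sketch, assuming a hypothetical helper named batches and an arbitrary batch size of 1000; the actual multiget call is left out because it depends on the client library:

```python
def batches(keys, size=1000):
    """Yield consecutive slices of keys, each with at most `size` items.

    Hypothetical helper; `size` is an arbitrary choice within the
    100-1000 range mentioned above.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


# 2500 made-up keys split into batches of 1000, 1000, and 500.
keys = ["key%d" % i for i in range(2500)]
print([len(batch) for batch in batches(keys)])  # [1000, 1000, 500]
```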

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905103#action_12905103 ] 

Jon Hermes commented on CASSANDRA-1329:
---------------------------------------

Whatever gives me reasonable timing data. If I can detect any difference at 1k, then I'll stop there.



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909047#action_12909047 ] 

Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------

What is the status of set support in other, somewhat less-used languages? (Perl and Erlang are the main ones that people have used with Cassandra that may not support it.)

If PHP is the only one that doesn't support sets, then I think we should just add it to the fairly long list of "stuff that's broken in PHP Thrift."

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904779#action_12904779 ] 

Jon Hermes commented on CASSANDRA-1329:
---------------------------------------

@Rebase: Of course.
@Benchmark: Yes. I'll do a py system comparison. The basic outline for the test: run on a single node without row caching or key caching, insert 1M rows in sequential key order, then multiget out 100 random blocks of 100k sequential keys and average the timings, pre-patch and post-patch.
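
For illustration only, the test outline above could be sketched roughly as follows in Python. This is not the stress.py change attached to the ticket: the helper names (sequential_block, average_multiget_time, fake_multiget), the plain decimal key format, and the in-memory dict standing in for a node are all assumptions made for the sketch.

```python
import random
import time


def sequential_block(start, size):
    """Return `size` sequential string keys beginning at `start` (key format is assumed)."""
    return [str(i) for i in range(start, start + size)]


def average_multiget_time(multiget, total_rows, block_size, rounds, seed=0):
    """Time `rounds` multigets over random blocks of sequential keys; return the mean in seconds."""
    rng = random.Random(seed)
    timings = []
    for _ in range(rounds):
        # Pick a random block start so the whole block stays inside the inserted range.
        start = rng.randrange(0, total_rows - block_size + 1)
        keys = sequential_block(start, block_size)
        began = time.perf_counter()
        multiget(keys)
        timings.append(time.perf_counter() - began)
    return sum(timings) / len(timings)


# Hypothetical stand-in for a real client call: rows live in a local dict.
_rows = {str(i): i for i in range(10_000)}


def fake_multiget(keys):
    return {key: _rows[key] for key in keys}


mean_seconds = average_multiget_time(
    fake_multiget, total_rows=10_000, block_size=1_000, rounds=5
)
```

Against a real node, the outline would call for total_rows=1_000_000, block_size=100_000 and rounds=100, run once pre-patch and once post-patch.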



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908545#action_12908545 ] 

Hudson commented on CASSANDRA-1329:
-----------------------------------

Integrated in Cassandra #533 (See [https://hudson.apache.org/hudson/job/Cassandra/533/])
    



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903703#action_12903703 ] 

Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------

how can you pass in duplicate keys if it's a Set?



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904646#action_12904646 ] 

Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------

Jon, can you benchmark (add multiget option to stress.py?) before/after to see if the performance hit Stu mentions is a problem?



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904303#action_12904303 ] 

Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------

IOW, passing in duplicate keys gets you the same result as passing the non-duplicates?

I'm fine calling that a "feature."
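
A small sketch of why that holds, assuming the multiget result is a map keyed by row key. The multiget function and the store dict here are hypothetical stand-ins, not the Cassandra API.

```python
def multiget(store, keys):
    """Return {key: value} for each requested key present in `store`."""
    # A map can hold each key once, so repeated keys in the request collapse.
    return {key: store[key] for key in keys if key in store}


store = {"a": 1, "b": 2, "c": 3}

with_duplicates = multiget(store, ["a", "b", "a", "b"])  # list, duplicates allowed
without_duplicates = multiget(store, {"a", "b"})         # set, duplicates impossible
```

Both calls produce the same mapping, so a caller who passes duplicates gains nothing from them.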



[jira] Issue Comment Edited: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905936#action_12905936 ] 

Jon Hermes edited comment on CASSANDRA-1329 at 9/3/10 11:50 AM:
----------------------------------------------------------------

Process was:
 - stress.py insert 1m rows
 - loop stress.py multiget 100k rows until values stabilized

Note: The first runs have a cold cache (I left the default 200k keys in), and 100k reads is just enough to occasionally throw me into GC. Also, I'm only randomizing over the first 100k block out of the 1M written, so everything should be key-cached and there are bound to be more duplicates in the set than I planned for.

// See attached multiget.test and multigetsmall.test

Regardless of the vagaries, the numbers are still comparable, and it looks like there is no significant difference in time to process a set versus a list.
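
As a rough way to gauge the duplicate effect mentioned in the note, here is a sketch under the assumption that keys are drawn uniformly at random, with replacement, from the pool. That sampling model and both function names are assumptions, not a description of what stress.py does.

```python
import random


def expected_distinct(pool_size, draws):
    """Expected number of distinct keys after `draws` uniform draws with replacement."""
    return pool_size * (1.0 - (1.0 - 1.0 / pool_size) ** draws)


def simulated_distinct(pool_size, draws, seed=0):
    """Count distinct keys in one simulated run, as a sanity check on the formula."""
    rng = random.Random(seed)
    return len({rng.randrange(pool_size) for _ in range(draws)})


estimate = expected_distinct(100_000, 100_000)
```

Under that model, drawing 100k keys from a 100k-key pool yields only about 63% distinct keys, so a set-based request would carry noticeably fewer keys than a list-based one.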



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905936#action_12905936 ] 

Jon Hermes commented on CASSANDRA-1329:
---------------------------------------

Process was:
 - stress.py insert 1m rows
 - loop stress.py multiget 100k rows until values stabilized

Note: The first runs have a cold cache (I left the default 200k keys in), and 100k reads is just enough to occasionally throw me into GC. Also, I'm only randomizing over the first 100k block out of the 1M written, so everything should be key-cached and there are bound to be more duplicates in the set than I planned for.
{noformat}
==PRE-PATCH1==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.19286876202,7                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.85971493721,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.95732964993,5                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.65282338619,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.60082161903,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.15557718277,5                                           <-- GC

==POST-PATCH1==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.42788017273,7                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.03476555347,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,4.36921528816,6                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.69064975262,3
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.66355334282,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.75016436577,5
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.59240380764,4

==PRE-PATCH2==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.46314402103,7                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.97970569611,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.58520019531,5                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.65835041046,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.71839766502,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.54171346188,5                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.54564589024,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.70630379677,4

==POST-PATCH2==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,6.26670286655,8                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.88729266167,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.75670327663,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.99453821182,6                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.5942284441,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.67040606022,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.8302997303,5                                            <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.57685221672,3
{noformat}
Regardless of the vagaries, the numbers are still comparable, and it looks like there is no significant difference in time to process a set versus a list.
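
One way to put the four runs side by side is to average the avg_latency column per run. The sketch below copies the values from the output above and, as a choice made for this sketch, drops the first (cold key cache) row of each run while keeping the rows marked GC. It makes no claim about statistical significance.

```python
from statistics import mean

# avg_latency values copied from the runs above, cold-cache row omitted.
warm_latencies = {
    "PRE-PATCH1": [2.85971493721, 3.95732964993, 2.65282338619,
                   2.60082161903, 3.15557718277],
    "POST-PATCH1": [3.03476555347, 4.36921528816, 2.69064975262,
                    2.66355334282, 2.75016436577, 2.59240380764],
    "PRE-PATCH2": [2.97970569611, 3.58520019531, 2.65835041046,
                   2.71839766502, 3.54171346188, 2.54564589024,
                   2.70630379677],
    "POST-PATCH2": [2.88729266167, 2.75670327663, 2.99453821182,
                    2.5942284441, 2.67040606022, 3.8302997303,
                    2.57685221672],
}

# Mean of the warm iterations for each run.
run_means = {name: mean(values) for name, values in warm_latencies.items()}

for name, value in run_means.items():
    print(f"{name}: mean avg_latency {value:.3f} over {len(warm_latencies[name])} warm iterations")
```

The per-run means land within a narrow band of each other, which is consistent with the conclusion stated above.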



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Bob T. Terminal (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907800#action_12907800 ] 

Bob T. Terminal commented on CASSANDRA-1329:
--------------------------------------------

This breaks PHP, since PHP does not support a set type; see http://wiki.apache.org/thrift/ThriftTypes#Containers



[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Hermes updated CASSANDRA-1329:
----------------------------------

    Attachment: 1329-rebase.txt



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907824#action_12907824 ] 

Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------

is there a thrift ticket open to fix that?



[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Hermes updated CASSANDRA-1329:
----------------------------------

    Attachment: multiget.test
                multigetsmall.test

I ran another quick test using 1k-key multigets instead of 100k, and the result is the same.



[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910863#action_12910863 ] 

Jon Hermes commented on CASSANDRA-1329:
---------------------------------------

2010-09-17 (05:15:37 PM) mbruce: hermes: multiget_slice works totally fine in perl from current trunk, with keys defined as a set in thrift.


Still no takers on erlang, and I still haven't gotten erlangthrift up and running myself (90% language barrier).



[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys instead of a list

Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Hermes updated CASSANDRA-1329:
----------------------------------

    Attachment: 1329.txt

Thrift changed, thrift/CassandraServer changed.
All of the logic just iterates with for (byte[] key : keys), so changing the list to a set leaves the behavior unchanged.
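
To illustrate the point in Python rather than Java, here is a loop that only iterates over its keys and so accepts a list or a set without any change. collect_rows and the sample store are hypothetical and do not mirror the CassandraServer code.

```python
from typing import Dict, Iterable


def collect_rows(store: Dict[bytes, str], keys: Iterable[bytes]) -> Dict[bytes, str]:
    """Gather rows for `keys`; the only thing required of `keys` is iteration."""
    rows = {}
    for key in keys:  # no indexing and no reliance on order
        if key in store:
            rows[key] = store[key]
    return rows


store = {b"k1": "row1", b"k2": "row2", b"k3": "row3"}

from_list = collect_rows(store, [b"k1", b"k2"])
from_set = collect_rows(store, {b"k1", b"k2"})
```

Because the function never indexes into keys or depends on their order, swapping the container type is invisible to the rest of the logic.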
