Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2010/07/27 23:49:24 UTC
[jira] Created: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
make multiget take a set of keys instead of a list
--------------------------------------------------
Key: CASSANDRA-1329
URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
Project: Cassandra
Issue Type: Task
Components: Core
Reporter: Jonathan Ellis
Priority: Minor
Fix For: 0.7.0
This more correctly sets the expectation that the order of keys in that list doesn't matter, and that duplicates don't make sense.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904432#action_12904432 ]
Stu Hood commented on CASSANDRA-1329:
-------------------------------------
Is this really worth it? Set maintenance is more expensive in pretty much any language.
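One rough way to put a number on this concern, at least for client-side Python, is to time how long it takes to build each container from the same keys. This is a minimal sketch: the function name `build_costs`, the key format and the sizes are assumptions made for illustration, and it measures only container construction, not Thrift serialization or server-side work.

```python
import timeit


def build_costs(n_keys=1000, repeats=200):
    """Return (list_seconds, set_seconds) for building each container.

    Hypothetical helper for illustration; not part of Cassandra or stress.py.
    """
    keys = ["key%d" % i for i in range(n_keys)]
    # Copy the keys into a new list, `repeats` times.
    list_time = timeit.timeit(lambda: list(keys), number=repeats)
    # Hash the same keys into a new set, `repeats` times.
    set_time = timeit.timeit(lambda: set(keys), number=repeats)
    return list_time, set_time


if __name__ == "__main__":
    list_time, set_time = build_costs()
    print("list: %.6fs  set: %.6fs" % (list_time, set_time))
```

The absolute numbers depend on the machine and interpreter, so only the ratio between the two is worth reading.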
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jon Hermes updated CASSANDRA-1329:
----------------------------------
Attachment: (was: 1329-stresspy-multiget.txt)
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Reopened: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis reopened CASSANDRA-1329:
---------------------------------------
(reopening until this question is resolved)
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jon Hermes updated CASSANDRA-1329:
----------------------------------
Attachment: 1329-stresspy-multiget.txt
[jira] Issue Comment Edited: (CASSANDRA-1329) make multiget take a
set of keys instead of a list
Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904432#action_12904432 ]
Stu Hood edited comment on CASSANDRA-1329 at 8/30/10 8:07 PM:
--------------------------------------------------------------
Is this really worth it? Set maintenance is more expensive in pretty much any language.
EDIT: Make that "in all languages".
was (Author: stuhood):
Is this really worth it? Set maintenance is more expensive in pretty much any language.
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Bob T. Terminal (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908014#action_12908014 ]
Bob T. Terminal commented on CASSANDRA-1329:
--------------------------------------------
See https://issues.apache.org/jira/browse/THRIFT-342. Can we revert to lists until Thrift resolves this issue?
[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-1329:
--------------------------------------
Assignee: Jon Hermes
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jon Hermes updated CASSANDRA-1329:
----------------------------------
Attachment: 1329-stresspy-multiget.txt
stress.py now has multiget.
Benchmark data soon to come.
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329-stresspy-multiget.txt, 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904287#action_12904287 ]
Jon Hermes commented on CASSANDRA-1329:
---------------------------------------
Only from py code.
It takes the keys in as a list "keys" (despite cassandra.thrift asking for a set), then passes "set(keys)" up, which removes duplicates without warning or complaining.
This isn't specific to this bug or to python, and if seen as an error, would be a thrift defect.
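The silent de-duplication described here is ordinary Python behavior and can be reproduced without Thrift at all. A minimal sketch, with key values made up for illustration:

```python
# Hypothetical keys; "key1" appears twice on purpose.
keys = ["key1", "key2", "key1", "key3"]

# set() drops the repeated key without raising or warning.
unique_keys = set(keys)

print(len(keys))         # how many keys the caller passed in
print(len(unique_keys))  # how many distinct keys remain after conversion
```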
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12911010#action_12911010 ]
Hudson commented on CASSANDRA-1329:
-----------------------------------
Integrated in Cassandra #539 (See [https://hudson.apache.org/hudson/job/Cassandra/539/])
change multiget back to list, for now. patch by jbellis for CASSANDRA-1329
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7.0
>
> Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904422#action_12904422 ]
Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------
can you rebase?
[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-1329:
--------------------------------------
Fix Version/s: 0.7.0
(was: 0.7 beta 2)
reverted for beta2, moving to 0.7.0.
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jake Farrell (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909386#action_12909386 ]
Jake Farrell commented on CASSANDRA-1329:
-----------------------------------------
THRIFT-342: attached patch which will allow for sets to be used from/to php via thrift allowing for cassandra to use sets without breaking any client generated drivers
[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-1329:
--------------------------------------
Fix Version/s: (was: 0.7.0)
0.8
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.8
>
> Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Issue Comment Edited: (CASSANDRA-1329) make multiget take a
set of keys instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905936#action_12905936 ]
Jon Hermes edited comment on CASSANDRA-1329 at 9/3/10 11:27 AM:
----------------------------------------------------------------
Process was:
- stress.py insert 1m rows
- loop stress.py multiget 100k rows until values stabilized
Note: The first runs have a cold cache (I left the default 200k keys in), and 100k reads is just enough to occasionally throw me into GC. Also, I'm only randomizing over the first 100k block out of the 1M written, so everything should be key-cached and there are bound to be more duplicates in the set than I planned for.
{noformat}
==PRE-PATCH1==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.19286876202,7 <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.85971493721,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.95732964993,5 <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.65282338619,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.60082161903,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.15557718277,5 <-- GC
==POST-PATCH1==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.42788017273,7 <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.03476555347,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,4.36921528816,6 <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.69064975262,3
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.66355334282,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.75016436577,5
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.59240380764,4
==PRE-PATCH2==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.46314402103,7 <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.97970569611,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.58520019531,5 <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.65835041046,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.71839766502,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.54171346188,5 <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.54564589024,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.70630379677,4
==POST-PATCH2==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,6.26670286655,8 <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.88729266167,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.75670327663,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.99453821182,6 <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.5942284441,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.67040606022,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.8302997303,5 <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.57685221672,3
{noformat}
Regardless of the vagaries, the numbers are still comparable, and it looks like there is no significant difference in time to process a set versus a list.
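The comparison can be made a little more concrete by averaging the avg_latency column. The sketch below copies the figures from the runs above, leaves out the first (cold key cache) run of each series, and keeps the runs marked GC. The series labels and the choice of a plain mean are assumptions of this example, not something stress.py produces.

```python
# avg_latency values copied from the runs above, cold-cache run excluded.
runs = {
    "pre-patch 1": [2.85971493721, 3.95732964993, 2.65282338619,
                    2.60082161903, 3.15557718277],
    "post-patch 1": [3.03476555347, 4.36921528816, 2.69064975262,
                     2.66355334282, 2.75016436577, 2.59240380764],
    "pre-patch 2": [2.97970569611, 3.58520019531, 2.65835041046,
                    2.71839766502, 3.54171346188, 2.54564589024,
                    2.70630379677],
    "post-patch 2": [2.88729266167, 2.75670327663, 2.99453821182,
                     2.5942284441, 2.67040606022, 3.8302997303,
                     2.57685221672],
}


def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


means = {name: mean(values) for name, values in runs.items()}
for name in runs:
    print("%-13s %.3f" % (name, means[name]))
```

With so few samples per series and GC pauses mixed in, the means are only a rough summary and say nothing about statistical significance.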
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904405#action_12904405 ]
Jon Hermes commented on CASSANDRA-1329:
---------------------------------------
Yes, there's no (good) way to tell python to complain about set(list) (which should always succeed).
This *will* affect java/statically-typed code that creates duplicates and tries to convert them to a set to match the API.
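If a caller does want a complaint rather than silent de-duplication, the check has to be written by hand before the conversion. A minimal sketch, assuming a hypothetical helper named `to_key_set` that is not part of stress.py, Thrift or Cassandra:

```python
from collections import Counter


def to_key_set(keys):
    """Convert keys to a set, raising ValueError if any key repeats.

    Hypothetical helper for illustration only.
    """
    counts = Counter(keys)
    duplicates = sorted(k for k, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError("duplicate keys: %r" % (duplicates,))
    # Iterating a Counter yields its distinct keys.
    return set(counts)
```

Calling `to_key_set(["a", "b"])` returns a set of both keys, while `to_key_set(["a", "a"])` raises instead of quietly shrinking the request.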
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904941#action_12904941 ]
Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------
100-1000 keys per multiget would be way more reasonable than 100k.
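A client holding a larger key collection could stay within that range by splitting it into batches before issuing each multiget. The sketch below shows only the splitting step. The name `chunked` and the default batch size are assumptions for illustration, and the actual client call is deliberately left out.

```python
def chunked(keys, batch_size=1000):
    """Yield successive lists of at most batch_size keys.

    Hypothetical helper for illustration; not part of any Cassandra client.
    """
    keys = list(keys)
    for start in range(0, len(keys), batch_size):
        yield keys[start:start + batch_size]
```

Each yielded batch would then be passed to one multiget call, so 2,500 keys at the default size become three requests.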
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905103#action_12905103 ]
Jon Hermes commented on CASSANDRA-1329:
---------------------------------------
Whatever gives me reasonable timing data. If I can detect any difference at 1k, then I'll stop there.
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909047#action_12909047 ]
Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------
What is the status of set support in other somewhat-less-used languages? (perl and erl are the main ones that people have used w/ cassandra that may not support it.)
If php is the only one that doesn't support sets, then I think we should just add it to the fairly long list of "stuff that's broken in php thrift."
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904779#action_12904779 ]
Jon Hermes commented on CASSANDRA-1329:
---------------------------------------
@Rebase: Of course.
@Benchmark: Yes. I'll do a py system comparison. Basic outline for the test is to run it on a single node without row caching or key caching. Insert 1M rows in sequential key order, then multiget out 100 random blocks of 100k sequential keys and average, pre-patch and post-patch.
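The key-selection part of that outline (random blocks of sequential keys) might look like the sketch below. The function name `random_key_blocks`, the decimal-string key format and the `seed` parameter are assumptions for illustration. The real stress.py key format may differ, and no Cassandra calls are made here.

```python
import random


def random_key_blocks(total_rows, block_size, num_blocks, seed=None):
    """Yield num_blocks lists of block_size sequential keys.

    Each block starts at a random offset within [0, total_rows - block_size].
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    for _ in range(num_blocks):
        start = rng.randrange(0, total_rows - block_size + 1)
        yield [str(i) for i in range(start, start + block_size)]
```

For the outline above this would be called as `random_key_blocks(1000000, 100000, 100)`, with each block timed as one multiget and the timings averaged.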
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908545#action_12908545 ]
Hudson commented on CASSANDRA-1329:
-----------------------------------
Integrated in Cassandra #533 (See [https://hudson.apache.org/hudson/job/Cassandra/533/])
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903703#action_12903703 ]
Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------
how can you pass in duplicate keys if it's a Set?
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904646#action_12904646 ]
Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------
Jon, can you benchmark (add multiget option to stress.py?) before/after to see if the performance hit Stu mentions is a problem?
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904303#action_12904303 ]
Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------
iow passing in duplicate keys gets you the same result as passing the non-duplicates?
i'm fine calling that a "feature."
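To illustrate the point about duplicates, here is a toy sketch. It assumes the result is a map keyed by row key; the `multiget` function and the dict-backed `store` are hypothetical stand-ins, not the Cassandra implementation.

```python
def multiget(store, keys):
    """Return {key: row} for every requested key present in store."""
    # Iterating is all that is needed, so a list or a set both work.
    return {key: store[key] for key in keys if key in store}


store = {"k1": "row1", "k2": "row2", "k3": "row3"}

# A list with repeated keys and the equivalent set of keys.
with_duplicates = multiget(store, ["k1", "k2", "k1", "k3", "k2"])
without_duplicates = multiget(store, {"k1", "k2", "k3"})

# Repeated keys collapse into the same map entries.
print(with_duplicates == without_duplicates)
```

Because the result is keyed by row key, a repeated key can only overwrite its own entry, which is why the duplicates make no observable difference here.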
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Issue Comment Edited: (CASSANDRA-1329) make multiget take a
set of keys instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905936#action_12905936 ]
Jon Hermes edited comment on CASSANDRA-1329 at 9/3/10 11:50 AM:
----------------------------------------------------------------
Process was:
- stress.py insert 1M rows
- loop stress.py multiget 100k rows until values stabilized
Note: The first runs have a cold cache (I left the default 200k keys in), and 100k reads are just enough to occasionally throw me into GC. Also, I'm only randomizing over the first 100k block out of the 1M written, so everything should be key-cached and there are bound to be more duplicates in the set than I planned for.
// See attached multiget.test and multigetsmall.test
Regardless of the vagaries, the numbers are still comparable, and it looks like there is no significant difference in time to process a set versus a list.
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905936#action_12905936 ]
Jon Hermes commented on CASSANDRA-1329:
---------------------------------------
Process was:
- stress.py insert 1M rows
- loop stress.py multiget 100k rows until values stabilized
Note: The first runs have a cold cache (I left the default 200k keys in), and 100k reads are just enough to occasionally throw me into GC. Also, I'm only randomizing over the first 100k block out of the 1M written, so everything should be key-cached and there are bound to be more duplicates in the set than I planned for.
{{noformat}}
==PRE-PATCH1==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.19286876202,7 <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.85971493721,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.95732964993,5 <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.65282338619,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.60082161903,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.15557718277,5 <-- GC
==POST-PATCH1==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.42788017273,7 <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.03476555347,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,4.36921528816,6 <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.69064975262,3
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.66355334282,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.75016436577,5
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.59240380764,4
==PRE-PATCH2==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.46314402103,7 <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.97970569611,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.58520019531,5 <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.65835041046,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.71839766502,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.54171346188,5 <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.54564589024,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.70630379677,4
==POST-PATCH2==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,6.26670286655,8 <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.88729266167,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.75670327663,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.99453821182,6 <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.5942284441,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.67040606022,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.8302997303,5 <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.57685221672,3
{{noformat}}
Regardless of the vagaries, the numbers are still comparable, and it looks like there is no significant difference in time to process a set versus a list.
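As a rough check on that conclusion, the avg_latency values from the runs above can be averaged. This sketch leaves out the rows annotated "Cold Keycache" or "GC"; treating only the remaining rows as warm runs is an assumption made here, not something stated in the comment.

```python
# avg_latency column copied from the PRE-PATCH1 and PRE-PATCH2 runs above,
# skipping rows annotated "Cold Keycache" or "GC" (assumed to be outliers).
pre_patch = [
    2.85971493721, 2.65282338619, 2.60082161903,
    2.97970569611, 2.65835041046, 2.71839766502,
    2.54564589024, 2.70630379677,
]

# Same selection from the POST-PATCH1 and POST-PATCH2 runs.
post_patch = [
    3.03476555347, 2.69064975262, 2.66355334282,
    2.75016436577, 2.59240380764,
    2.88729266167, 2.75670327663, 2.5942284441,
    2.67040606022, 2.57685221672,
]


def mean(values):
    return sum(values) / len(values)


pre_mean = mean(pre_patch)
post_mean = mean(post_patch)
relative_diff = (post_mean - pre_mean) / pre_mean

print("pre-patch mean avg_latency:  %.4f" % pre_mean)
print("post-patch mean avg_latency: %.4f" % post_mean)
print("relative difference:         %.2f%%" % (relative_diff * 100))
```

With this selection the two means differ by well under one percent, which is consistent with the "no significant difference" reading, though the sample is small.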
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Bob T. Terminal (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907800#action_12907800 ]
Bob T. Terminal commented on CASSANDRA-1329:
--------------------------------------------
This breaks PHP, since PHP does not support a set type; see http://wiki.apache.org/thrift/ThriftTypes#Containers
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jon Hermes updated CASSANDRA-1329:
----------------------------------
Attachment: 1329-rebase.txt
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907824#action_12907824 ]
Jonathan Ellis commented on CASSANDRA-1329:
-------------------------------------------
is there a thrift ticket open to fix that?
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jon Hermes updated CASSANDRA-1329:
----------------------------------
Attachment: multiget.test, multigetsmall.test
Another quick test using 1k multigets instead of 100k, and the result is the same.
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Commented: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910863#action_12910863 ]
Jon Hermes commented on CASSANDRA-1329:
---------------------------------------
2010-09-17 (05:15:37 PM) mbruce: hermes: multiget_slice works totally fine in perl from current trunk, with keys defined as a set in thrift.
Still no takers on erlang, and I still haven't gotten erlangthrift up and running myself (90% language barrier).
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense
[jira] Updated: (CASSANDRA-1329) make multiget take a set of keys
instead of a list
Posted by "Jon Hermes (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jon Hermes updated CASSANDRA-1329:
----------------------------------
Attachment: 1329.txt
Thrift changed, thrift/CassandraServer changed.
All of the logic just iterates with for (byte[] key : keys), so changing the list to a set leaves the behavior unchanged.
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
> Key: CASSANDRA-1329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
> Project: Cassandra
> Issue Type: Task
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jon Hermes
> Priority: Minor
> Fix For: 0.7 beta 2
>
> Attachments: 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense