Posted to commits@cassandra.apache.org by "Robert Coli (JIRA)" <ji...@apache.org> on 2013/12/17 01:53:07 UTC
[jira] [Comment Edited] (CASSANDRA-6465) DES scores fluctuate too much for cache pinning
[ https://issues.apache.org/jira/browse/CASSANDRA-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849949#comment-13849949 ]
Robert Coli edited comment on CASSANDRA-6465 at 12/17/13 12:52 AM:
-------------------------------------------------------------------
Are we sure that this mechanism of producing cache pinning is worth the complexity here, especially given speculative execution?
was (Author: rcoli):
Are we sure that this mechanism of producing cache pinning is worth the complexity here, especially given speculative retry?
> DES scores fluctuate too much for cache pinning
> -----------------------------------------------
>
> Key: CASSANDRA-6465
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6465
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: 1.2.11, 2 DC cluster
> Reporter: Chris Burroughs
> Assignee: Tyler Hobbs
> Priority: Minor
> Labels: gossip
> Fix For: 2.0.4
>
> Attachments: des-score-graph.png, des.sample.15min.csv, get-scores.py
>
>
> To quote the conf:
> {noformat}
> # if set greater than zero and read_repair_chance is < 1.0, this will allow
> # 'pinning' of replicas to hosts in order to increase cache capacity.
> # The badness threshold will control how much worse the pinned host has to be
> # before the dynamic snitch will prefer other replicas over it. This is
> # expressed as a double which represents a percentage. Thus, a value of
> # 0.2 means Cassandra would continue to prefer the static snitch values
> # until the pinned host was 20% worse than the fastest.
> dynamic_snitch_badness_threshold: 0.1
> {noformat}
> An assumption of this feature is that scores will vary by less than dynamic_snitch_badness_threshold during normal operations. Attached is the result of polling a node for the scores of 6 different endpoints at 1 Hz for 15 minutes. The endpoints to sample were chosen with `nodetool getendpoints` for a row that is known to get reads. The node was acting as a coordinator for a few hundred req/second, so it should have sufficient data to work with. Other traces on a second cluster have produced similar results.
> * The scores vary by far more than I would expect, as shown by the difficulty of seeing anything useful in that graph.
> * The difference between the best and next-best score is usually > 10% (default dynamic_snitch_badness_threshold).
> Neither ClientRequest nor ColumnFamily metrics showed wild changes during the data gathering period.
> Attachments:
> * Jython script cobbled together to gather the data (based on work on the mailing list from Maki Watanabe a while back)
> * CSV of DES scores for 6 endpoints, polled about once a second
> * Attempt at making a graph
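
The comparison described in the report (how often the best and next-best scores differ by more than dynamic_snitch_badness_threshold) can be sketched roughly as follows. This is only an illustration under stated assumptions: it treats a lower score as better, measures the gap relative to the best score, and uses made-up sample values. The function names and the data layout are hypothetical and do not reflect the format of the attached CSV or the snitch's actual comparison code.

```python
def relative_gap(scores):
    """Relative gap between the best and next-best score in one sample.

    Assumption: a lower score means a better endpoint.
    """
    best, second = sorted(scores)[:2]
    if best == 0:
        # Avoid dividing by zero; any positive next-best score is an unbounded gap.
        return float("inf") if second > 0 else 0.0
    return (second - best) / best


def fraction_exceeding(samples, threshold=0.1):
    """Fraction of samples whose best/next-best gap exceeds the threshold."""
    gaps = [relative_gap(sample) for sample in samples]
    return sum(gap > threshold for gap in gaps) / len(gaps)


# Hypothetical data: each inner list holds the scores of the sampled
# endpoints at one polling instant. These are NOT values from the attachment.
samples = [
    [1.00, 1.05, 1.30],
    [0.80, 1.00, 1.10],
    [1.00, 1.20, 1.50],
    [0.50, 0.52, 0.90],
]

print(fraction_exceeding(samples, threshold=0.1))
```

A result close to 1.0 on real polling data would be consistent with the observation that the gap usually exceeds the default threshold.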
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)