Posted to commits@cassandra.apache.org by "Simon Zhou (JIRA)" <ji...@apache.org> on 2017/02/23 23:26:44 UTC

[jira] [Created] (CASSANDRA-13261) Improve speculative retry to avoid being overloaded

Simon Zhou created CASSANDRA-13261:
--------------------------------------

             Summary: Improve speculative retry to avoid being overloaded
                 Key: CASSANDRA-13261
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13261
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Simon Zhou
            Assignee: Simon Zhou


In CASSANDRA-13009, it was suggested that I split out the second part of my patch as a separate improvement.

The goal is to keep Cassandra from being overloaded when the CUSTOM speculative retry parameter is used. Steps to reason about/reproduce this with 3.0.10:
1. Set a custom speculative retry threshold, for example:
cqlsh> alter TABLE to_repair1.users0 with speculative_retry='10ms';

2. SpeculatingReadExecutor will be used, according to this piece of code in AbstractReadExecutor:
{code}
        if (retry.equals(SpeculativeRetryParam.ALWAYS))
            return new AlwaysSpeculatingReadExecutor(keyspace, cfs, command, consistencyLevel, targetReplicas);
        else // PERCENTILE or CUSTOM.
            return new SpeculatingReadExecutor(keyspace, cfs, command, consistencyLevel, targetReplicas);
{code}

3. When RF=3 and LOCAL_QUORUM is used, the code below (from SpeculatingReadExecutor#maybeTryAdditionalReplicas) does not protect Cassandra from being overloaded, even though the inline comment suggests that this is its intent:

{code}
            // no latency information, or we're overloaded
            if (cfs.sampleLatencyNanos > TimeUnit.MILLISECONDS.toNanos(command.getTimeout()))
                return;
{code}

The reason is that cfs.sampleLatencyNanos is assigned retryPolicy.threshold() (at line 405 of ColumnFamilyStore), which is the 10ms configured in step #1 above. The timeout, however, is usually the default of 5000ms, so with such a setting the condition is effectively never true.
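
To make the arithmetic concrete, here is a minimal standalone sketch of the same comparison. The class and method names are hypothetical and exist only for illustration; the values are the 10ms threshold from step #1 and the 5000ms timeout mentioned above.

```java
import java.util.concurrent.TimeUnit;

public class OverloadCheckSketch {
    // Hypothetical helper mirroring the guard quoted above:
    // returns true when speculation would be skipped as "overloaded".
    static boolean skipsSpeculation(long sampleLatencyNanos, long timeoutMillis) {
        return sampleLatencyNanos > TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    public static void main(String[] args) {
        // Value that ends up in sampleLatencyNanos with speculative_retry='10ms'.
        long sampleLatencyNanos = TimeUnit.MILLISECONDS.toNanos(10);
        // Timeout of 5000ms, as described in the text.
        long timeoutMillis = 5000;
        // 10,000,000 ns is never greater than 5,000,000,000 ns.
        System.out.println(skipsSpeculation(sampleLatencyNanos, timeoutMillis)); // prints "false"
    }
}
```

Because the left-hand side is a fixed configured value rather than a measurement, the result does not change no matter how slow the replicas actually become.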

As the name suggests, sampleLatencyNanos should hold a sampled latency, not a statically configured value. My proposal:
a. Introduce the option -Dcassandra.overload.threshold to allow customizing the overload threshold. The default threshold would be DatabaseDescriptor.getRangeRpcTimeout().
b. Assign the sampled P99 latency to cfs.sampleLatencyNanos. For overload detection, simply compare cfs.sampleLatencyNanos with the customizable threshold above.
c. Use retryDelayNanos (instead of cfs.sampleLatencyNanos) as the wait time before retrying (see line 282 of AbstractReadExecutor). This is the value from the table setting (PERCENTILE or CUSTOM).
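
A rough sketch of how points a to c could fit together is shown below. It is not the actual patch: the class and method names are hypothetical, the unit of the system property is assumed to be milliseconds (the proposal does not specify it), and the default is passed in as a plain number standing in for DatabaseDescriptor.getRangeRpcTimeout().

```java
import java.util.concurrent.TimeUnit;

public class SpeculativeRetrySketch {
    // (a) Hypothetical: read -Dcassandra.overload.threshold, ASSUMED to be in
    // milliseconds, falling back to a caller-supplied default that stands in
    // for DatabaseDescriptor.getRangeRpcTimeout().
    static long overloadThresholdNanos(long defaultTimeoutMillis) {
        long millis = Long.getLong("cassandra.overload.threshold", defaultTimeoutMillis);
        return TimeUnit.MILLISECONDS.toNanos(millis);
    }

    // (b) Overload detection compares the *sampled* P99 latency with the threshold.
    static boolean isOverloaded(long sampledP99Nanos, long thresholdNanos) {
        return sampledP99Nanos > thresholdNanos;
    }

    // (c) The wait before a speculative retry comes from the table setting
    // (retryDelayNanos), not from the sampled latency.
    // Returns -1 (an illustrative sentinel) when speculation should be skipped.
    static long speculationDelayNanos(long sampledP99Nanos, long thresholdNanos, long retryDelayNanos) {
        return isOverloaded(sampledP99Nanos, thresholdNanos) ? -1L : retryDelayNanos;
    }

    public static void main(String[] args) {
        long threshold = overloadThresholdNanos(10_000);          // illustrative default, in ms
        long retryDelay = TimeUnit.MILLISECONDS.toNanos(10);      // speculative_retry='10ms'
        long healthyP99 = TimeUnit.MILLISECONDS.toNanos(50);
        long overloadedP99 = TimeUnit.SECONDS.toNanos(12);
        System.out.println(speculationDelayNanos(healthyP99, threshold, retryDelay));
        System.out.println(speculationDelayNanos(overloadedP99, threshold, retryDelay));
    }
}
```

Separating the two values means the configured delay only controls when a retry is sent, while the measured latency alone decides whether the node is too loaded to speculate.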


