Posted to commits@cassandra.apache.org by "Peter Schuller (Created) (JIRA)" <ji...@apache.org> on 2012/02/05 10:57:53 UTC

[jira] [Created] (CASSANDRA-3852) use LIFO queueing policy when queue size exceeds thresholds

use LIFO queueing policy when queue size exceeds thresholds
-----------------------------------------------------------

                 Key: CASSANDRA-3852
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3852
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Peter Schuller
            Assignee: Peter Schuller


A strict FIFO policy for queueing (between stages) is detrimental to latency and forward progress. Whenever a node is saturated because the incoming request rate exceeds what it can process, *all* requests become slow. If it is consistently saturated, you start effectively timing out on *all* requests.

A much better strategy from the point of view of latency is to serve a subset of requests quickly and let some time out, rather than letting all of them either time out or be slow.

Care must be taken such that:

* We still guarantee that requests are processed in a reasonably timely fashion (we couldn't go strictly LIFO, for example, as that would result in requests potentially getting stuck forever on a loaded node).
* Maybe, depending on the previous point's solution, ensure that some requests bypass the policy and get prioritized (e.g., schema migrations, or anything "internal" to a node).

A possible implementation is to go LIFO whenever there are requests in the queue that are older than N milliseconds (or whenever the queue exceeds a certain size, etc.).
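
For concreteness, here is a minimal sketch of that threshold-based switch. This is hypothetical code written for this discussion, not anything in the Cassandra tree; the class name, the single-lock structure, and the method signatures are purely illustrative.

{code}
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: FIFO by default, switch to LIFO once the head of the
// queue has waited longer than a configured threshold.
public class AgeTriggeredLifoQueue<T>
{
    private static final class Entry<T>
    {
        final T task;
        final long enqueuedAtNanos;

        Entry(T task, long enqueuedAtNanos)
        {
            this.task = task;
            this.enqueuedAtNanos = enqueuedAtNanos;
        }
    }

    private final Deque<Entry<T>> deque = new ArrayDeque<Entry<T>>();
    private final long maxHeadAgeNanos;

    public AgeTriggeredLifoQueue(long maxHeadAgeMillis)
    {
        this.maxHeadAgeNanos = maxHeadAgeMillis * 1000000L;
    }

    public synchronized void offer(T task)
    {
        deque.addLast(new Entry<T>(task, System.nanoTime()));
    }

    public synchronized T poll()
    {
        Entry<T> head = deque.peekFirst();
        if (head == null)
            return null;

        long headAgeNanos = System.nanoTime() - head.enqueuedAtNanos;
        // Normal case: FIFO (take the oldest). Once the oldest request has
        // waited longer than the threshold, take the newest instead, so that
        // fresh requests can still complete within their timeout while the
        // backlog drains (or times out).
        Entry<T> picked = headAgeNanos > maxHeadAgeNanos ? deque.pollLast() : deque.pollFirst();
        return picked.task;
    }
}
{code}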

Benefits:

* In all cases where the client is, directly or indirectly through other layers, a system with limited concurrency (e.g., a thread pool of size X serving some incoming request rate), it is *much* better for a few requests to time out while most are serviced quickly than for all requests to become slow, because concurrency doesn't explode. Think any typical non-super-advanced PHP app, Ruby web app, Java servlet based app, etc. Essentially, it optimizes very heavily for improved average latencies.

* Systems with strict p95/p99/p999 requirements on latencies should greatly benefit from such a policy. For example, suppose you have a system at 85% of capacity, and it takes a write spike (or has a hiccup like a GC pause, blocking on a commit log write, etc). Suppose the hiccup racks up 500 ms worth of requests. With a 15% margin at steady state, that takes 500 ms * 100/15 ≈ 3.3 seconds to recover (a small worked calculation follows this list). Instead of *all* requests for an entire 3.3 second window being slow, we'd serve requests quickly for roughly 2.8 of those seconds, with the incoming requests during that 500 ms interval being the ones primarily affected. The flip side, though, is that once you're at the point where more than N percent of requests end up having to wait for others to take LIFO priority, the p(100-N) latencies will actually be *worse* than without this change (but at that point you have to consider what the root reason for those pXX requirements is).

* In the case of complete saturation, it allows forward progress. Suppose you're taking 25% more traffic than you are able to handle. Instead of getting backed up and ending up essentially timing out *every single request*, you will succeed in processing up to 80% of them (I say "up to" because it depends; for example, on a {{QUORUM}} request at least two of the replica requests sent by the co-ordinator need to succeed, so the percentage is brought down), allowing clients to make forward progress and get work done rather than being stuck.
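
To make the recovery-time arithmetic in the percentile example above explicit, here is a small back-of-the-envelope calculation. It is illustrative only; the constants are just the numbers used in that example.

{code}
public class RecoveryWindow
{
    public static void main(String[] args)
    {
        double hiccupMs = 500;       // backlog accumulated during the hiccup
        double spareCapacity = 0.15; // 15% headroom at 85% utilization

        // Draining 500 ms of backlog with 15% spare capacity takes roughly
        // 500 ms * 100/15, i.e. about 3.3 seconds.
        double recoveryMs = hiccupMs * (1.0 / spareCapacity);

        // With a LIFO-when-backlogged policy, new arrivals keep being served
        // promptly while the backlog drains; roughly the requests queued
        // during the original 500 ms hiccup bear the extra latency.
        double fastWindowMs = recoveryMs - hiccupMs;

        System.out.printf("recovery: %.1f s, served-fast window: %.1f s%n",
                          recoveryMs / 1000, fastWindowMs / 1000);
        // Prints: recovery: 3.3 s, served-fast window: 2.8 s
    }
}
{code}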



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3852) use LIFO queueing policy when queue size exceeds thresholds

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200936#comment-13200936 ] 

Peter Schuller commented on CASSANDRA-3852:
-------------------------------------------

I feel exactly the opposite in almost all practical use-cases I have ever encountered in production. The impact of a service (such as Cassandra) becoming a tarpit to all requests tends to cause indirect effects on other things unless every component in the stack is perfectly written to be independent of e.g. concurrency spikes; having a select few requests be slightly slower is typically a much less impactful behavior.

More specifically, failure modes where you lack forward progress are a classic problem affecting systems with multiple components. One classic way to get into this position is to have timeouts incorrectly set such that the backend system is doing work whose results are thrown away because a system in front of it had a lower timeout. This is why the default should be to have the RPC timeout in Cassandra higher than the client timeout. In this case, any system not perfectly built to have latencies correctly configured will be much better served by falling back to LIFO.

Further, I strongly disagree that optimizing for the non-overload case is the right approach. In most cases, averages are given much too much credence. It's completely irrelevant to most serious production use-cases whether your average is 5 ms instead of 6 ms; what tends to be much more important is that you don't explode/fall over as soon as there is a hiccup somewhere. This is a similar observation to the discussion in CASSANDRA-2540. Slight differences in average (or median) latency are not what bring down a production system; sudden explosions and things falling over are.

Also, consider that an overload condition is not this once-in-a-year case where someone forgot to add capacity. This behavior characteristic applies whenever:

* A node goes into a GC pause.
* Node(s) are restarted and caches are cold.
* Node(s) are backed up by streaming/compaction (easily a problem if practical concerns caused a late capacity add, especially as long as CASSANDRA-3569 is in effect).
* The operating system decides to flush dirty buffers to disk and for some reason (streaming, compaction) there's a non-trivial impact (say, 500 ms) on I/O.
* Probably a slew of cases I'm not thinking of :)

Real production systems aren't very usefully judged by overall latencies over time without looking at outliers (not just in terms of pXX latencies, but also outliers in behavior, etc.); what matters is whether the system works reliably and consistently well across a longer period of time, rather than having hiccups that significantly affect other services every now and then as soon as there is some issue with a node. At least this has been all my experience so far, ever, with systems running at high throughput/capacity (rather than 90% idle) where high availability is a concern.

I think that if someone truly has a situation where surrounding systems are perfect and they really truly care about a millisecond here and there on averages, or care more about absolutely minimizing outliers in non-overload conditions, they can easily turn the feature I'm suggesting here off (or not turn it on, if it's not the default), and they should be worried about enabling data reads by default, submitting requests to multiple co-ordinators to avoid co-ordinator GC pauses (probabilistically), replacing the dynamic snitch with something that reacts faster (even at CPU expense), etc. I am not saying these people don't exist (we have potential use-cases here with very strict latency requirements which will probably take measures like these to make them feasible), but I'm saying that these situations are not really serviced right now *anyway*, and they are sufficiently special that it's okay if the default behavior is not optimized for this case.

The normal case, where clients are significantly imperfect systems (often web apps or other clients that have concurrency limits, aren't perfect asynchronous event loops with tight control over e.g. memory use, and are thus susceptible to the negative effects of concurrency spikes), is, I think, much better served by what I'm suggesting.

In addition, even assuming perfectly working systems, consider the user-visible impact of a single node being temporarily overloaded. Instead of every single user whose request hits this node experiencing poor latency, only a tiny subset would potentially have that experience.

I would argue that a better way to avoid "normal steady state" outliers is to do things like CASSANDRA-2540 data reads by default, ensuring the dynamic snitch is real-time and looks at current outstanding requests, etc. Probabilistic improvements that attempt to mitigate the effects of individual variances in the system, whether they be due to LIFO, GC pauses or other things. It feels strange to retain strict FIFO to eliminate outliers, while we still have a failure detector/dynamic snitch combination which utterly fails to handle the most trivial case you can imagine (again, the "tcp connection dies with a reset and I'm gonna keep queuing up hundreds of thousands of outstanding requests to this node blindly while other nodes are not even close to capacity and have < 10 outstanding requests"). I don't understand how the latter can be acceptable if we are so worried about outliers that moving away from strict FIFO is not acceptable.

In summary: while I certainly do not claim that the proposed LIFO mechanism is perfect (far from it), I think the disadvantages it incurs are tiny compared to some of the other behaviors that need to be fixed, while the advantages in common production scenarios are very significant.

In any case, given that at least a draft prototype is not very difficult to get done, I'll definitely go ahead and do something with this (not sure when). We might be able to come back with actual numbers from production clusters (not a promise; it depends on priorities and other things).


                

[jira] [Commented] (CASSANDRA-3852) use LIFO queueing policy when queue size exceeds thresholds

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200890#comment-13200890 ] 

Jonathan Ellis commented on CASSANDRA-3852:
-------------------------------------------

bq. FIFO is actually better as long as you have the throughput to process them all

This is my problem with LIFO: it's basically a way to get faster median latency under overload conditions, by dramatically increasing variance under normal, non-overload conditions.  That feels very much like optimizing for the wrong scenario to me.
                

[jira] [Commented] (CASSANDRA-3852) use LIFO queueing policy when queue size exceeds thresholds

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200707#comment-13200707 ] 

Peter Schuller commented on CASSANDRA-3852:
-------------------------------------------

Of course, the queueing policy would be tweakable. For example, if you have a workload where you don't care about latency at all except that it stays under a specific hard deadline in milliseconds, FIFO is actually better *as long as* you have the throughput to process all requests.
                

[jira] [Commented] (CASSANDRA-3852) use LIFO queueing policy when queue size exceeds thresholds

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200940#comment-13200940 ] 

Peter Schuller commented on CASSANDRA-3852:
-------------------------------------------

I should probably mention that it's not like this particular issue is the most super-duper important thing ever, and I hope my response doesn't come across as adversarial. I am just trying to paint an overall picture of what I think is more important in general and in practice; the LIFO business is one particular suggestion for an improvement along those lines, rather than The One True Solution That Solves Everything.
                

[jira] [Commented] (CASSANDRA-3852) use LIFO queueing policy when queue size exceeds thresholds

Posted by "Peter Schuller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200709#comment-13200709 ] 

Peter Schuller commented on CASSANDRA-3852:
-------------------------------------------

A simple way to do a draft version of this for evaluation is likely to just use a queue for the executors which has two queues internally: one FIFO and one LIFO. Pushes onto the queue always go to the FIFO first and overflow to the LIFO. Pops off the queue always try the LIFO first and then the FIFO.

This would implement the proposition with a probably suboptimal policy of basing the queue decision on the number of entries rather than on latency information, and it does not address custom prioritization. But it's enough to do benchmarking and some graphing.
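
A hedged sketch of what such a dual-queue could look like is below. The class and method names and the plain-synchronized structure are made up for illustration; a real version would need blocking semantics for the stage executors.

{code}
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: a bounded FIFO with a LIFO overflow, as described above.
public class FifoWithLifoOverflow<T>
{
    private final Deque<T> fifo = new ArrayDeque<T>();
    private final Deque<T> lifo = new ArrayDeque<T>();
    private final int fifoCapacity;

    public FifoWithLifoOverflow(int fifoCapacity)
    {
        this.fifoCapacity = fifoCapacity;
    }

    // Pushes go to the FIFO until it is full, then overflow onto the LIFO stack.
    public synchronized void offer(T task)
    {
        if (fifo.size() < fifoCapacity)
            fifo.addLast(task);
        else
            lifo.addFirst(task);
    }

    // Pops try the LIFO (the most recently overflowed work) first,
    // then fall back to the FIFO.
    public synchronized T poll()
    {
        T task = lifo.pollFirst();
        return task != null ? task : fifo.pollFirst();
    }
}
{code}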

                