Posted to commits@cassandra.apache.org by "Sergio Bossa (JIRA)" <ji...@apache.org> on 2016/07/11 16:06:11 UTC

[jira] [Comment Edited] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator

    [ https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371063#comment-15371063 ] 

Sergio Bossa edited comment on CASSANDRA-9318 at 7/11/16 4:05 PM:
------------------------------------------------------------------

bq. One thing that worries me is, how do you distinguish between “node X is slow because we are writing too fast and we need to throttle clients down” and “node X is slow because it is dying, we need to ignore it and accept writes based on other replicas?” I.e. this seems to implicitly push everyone to a kind of CL.ALL model once your threshold triggers, where if one replica is slow then we can't make progress.

This was already noted by [~slebresne], and you're both right: this initial implementation is heavily biased towards my specific use case :)
But the solution proposed above should fix it:

bq. I think what you rather need is a way to pre-emptively fail if the write consistency level is not met by enough "non-overloaded" replicas, i.e.: If CL.ONE, fail if all replicas are overloaded...
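
To make that more concrete, here is a minimal sketch of such a check (purely illustrative and not taken from the patch: the class name, the {{blockFor}} count and the {{isOverloaded}} predicate are assumed):

{code:java}
import java.net.InetAddress;
import java.util.List;
import java.util.function.Predicate;

// Minimal sketch only: names and signatures are illustrative, not the actual patch.
public final class OverloadCheck
{
    /**
     * Pre-emptively fail a write if fewer non-overloaded replicas are available
     * than the consistency level requires ('blockFor' acks, e.g. 1 for CL.ONE).
     */
    public static void assertEnoughNonOverloadedReplicas(List<InetAddress> replicas,
                                                         int blockFor,
                                                         Predicate<InetAddress> isOverloaded)
    {
        long usable = replicas.stream().filter(r -> !isOverloaded.test(r)).count();

        // With CL.ONE this only fires when *all* replicas are overloaded, so a single
        // slow or dying node cannot by itself block progress (no implicit CL.ALL).
        if (usable < blockFor)
            throw new RuntimeException( // OverloadedException in the real write path
                "Only " + usable + " non-overloaded replica(s) available, " + blockFor + " required");
    }
}
{code}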

Also, the exception would be sent to the client only when the low threshold is breached, and only for the back-pressure window (the write RPC timeout) in which it is first breached, e.g.:
* Threshold is 0.1, outgoing requests are 100, incoming responses are 10, ratio is 0.1.
* Exception is thrown by all write requests for the current back-pressure window.
* The outgoing rate limiter is set at 10, which means the next ratio calculation will approach the sustainable rate; even if replicas still lag behind, the ratio will not go down to 0.1 _unless_ the incoming rate itself dramatically drops to 1 (see the sketch below).
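
As a rough illustration of that window arithmetic (a sketch under assumed names, not the actual patch; the limiter is Guava's {{RateLimiter}}, which Cassandra already depends on), the per-replica back-pressure state could work along these lines:

{code:java}
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of a per-replica back-pressure window; all names are made up.
public final class BackPressureWindowSketch
{
    private final double lowThreshold;   // e.g. 0.1
    private final RateLimiter limiter = RateLimiter.create(Double.MAX_VALUE); // effectively unthrottled
    private final AtomicLong outgoing = new AtomicLong();
    private final AtomicLong incoming = new AtomicLong();
    private volatile boolean overloaded;

    public BackPressureWindowSketch(double lowThreshold)
    {
        this.lowThreshold = lowThreshold;
    }

    public void onRequestSent()      { outgoing.incrementAndGet(); }
    public void onResponseReceived() { incoming.incrementAndGet(); }

    /** Called once per back-pressure window (the write RPC timeout). */
    public void closeWindow()
    {
        long out = outgoing.getAndSet(0);
        long in = incoming.getAndSet(0);
        double ratio = out == 0 ? 1.0 : (double) in / out;

        // E.g. 100 outgoing, 10 incoming -> ratio 0.1: flag the window as overloaded and
        // throttle the outgoing rate down to the observed incoming rate (10), so the next
        // window converges towards the sustainable rate instead of spiralling down.
        overloaded = ratio <= lowThreshold;
        if (overloaded && in > 0)
            limiter.setRate(in);
    }

    /** Senders call this before writing; the coordinator can fail fast while overloaded(). */
    public void acquire()       { limiter.acquire(); }
    public boolean overloaded() { return overloaded; }
}
{code}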

That is to say, the chances of a meltdown due to "overloaded exceptions" are moderate (also, clients are supposed to adjust themselves when they get such an exception), and the above proposal should make things play nicely with the CL too.

If you all agree with that, I'll move forward and make that change.


> Bound the number of in-flight requests at the coordinator
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths, Streaming and Messaging
>            Reporter: Ariel Weisberg
>            Assignee: Sergio Bossa
>         Attachments: 9318-3.0-nits-trailing-spaces.patch, backpressure.png, limit.btm, no_backpressure.png
>
>
> It's possible to somewhat bound the amount of load accepted into the cluster by bounding the number of in-flight requests and request bytes.
> An implementation might do something like track the number of outstanding bytes and requests and if it reaches a high watermark disable read on client connections until it goes back below some low watermark.
> Need to make sure that disabling read on the client connection won't introduce other issues.
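
For illustration only (this is not from the ticket and assumes the client connections are Netty channels, as in Cassandra's native protocol; all names below are made up), the high/low watermark idea could look roughly like this:

{code:java}
import io.netty.channel.Channel;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of the watermark idea from the description above:
// stop reading from a client connection above the high watermark,
// resume once the outstanding bytes drop back below the low watermark.
public final class InflightTrackerSketch
{
    private final long highWatermarkBytes;
    private final long lowWatermarkBytes;
    private final AtomicLong inflightBytes = new AtomicLong();

    public InflightTrackerSketch(long highWatermarkBytes, long lowWatermarkBytes)
    {
        this.highWatermarkBytes = highWatermarkBytes;
        this.lowWatermarkBytes = lowWatermarkBytes;
    }

    /** Called when a request is admitted on the given client channel. */
    public void onRequest(Channel channel, long requestBytes)
    {
        if (inflightBytes.addAndGet(requestBytes) >= highWatermarkBytes)
            channel.config().setAutoRead(false);  // back-pressure: stop reading new requests
    }

    /** Called when a request completes (response sent or failed). */
    public void onComplete(Channel channel, long requestBytes)
    {
        if (inflightBytes.addAndGet(-requestBytes) <= lowWatermarkBytes)
            channel.config().setAutoRead(true);   // resume reading from this client
    }
}
{code}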



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)