Posted to commits@cassandra.apache.org by "Yifan Cai (Jira)" <ji...@apache.org> on 2020/01/27 21:32:00 UTC
[jira] [Comment Edited] (CASSANDRA-2848) Make the Client API support passing down timeouts

    [ https://issues.apache.org/jira/browse/CASSANDRA-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024659#comment-17024659 ] 

Yifan Cai edited comment on CASSANDRA-2848 at 1/27/20 9:31 PM:
---------------------------------------------------------------

[Code|https://github.com/yifan-c/cassandra/tree/CASSANDRA-2848-per-request-timeout], [PR|https://github.com/apache/cassandra/pull/432], [Test|https://app.circleci.com/jobs/github/yifan-c/cassandra/207]

At a high level, this patch allows clients to set an optional timeout for each query. If the timeout is present and valid, the coordinator adjusts both its own waiting time and the expiration time of the internode messages accordingly.

The optional timeout has the following specifications: 
 * New flag & option added in V5 native protocol to allow specifying the optional timeout. 
 * The timeout is in milliseconds and the valid range is (0, 2147483647), not inclusive. 
 * Data type is Integer. Not using short because the max value of short is 32767, which may not satisfy all cases. 
 * Default value is Integer.MAX_VALUE
 * The effective timeout cannot exceed the one configured in the database.
 ** Perform Math.min between the optional timeout (defaults to Integer.MAX_VALUE) and the configured one.
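
The capping rule above can be sketched roughly as follows. This is only an illustration, not code from the patch: the class name {{TimeoutResolver}}, the method {{effectiveTimeoutMillis}} and the decision to throw on a non-positive value are all assumptions made for the example.

```java
// Hypothetical helper, not part of the patch: shows the Math.min capping rule.
final class TimeoutResolver {
    // Value used when the client does not set the optional timeout.
    public static final int NO_CLIENT_TIMEOUT = Integer.MAX_VALUE;

    // Returns the timeout that actually applies, in milliseconds.
    // Rejecting non-positive values is an assumption of this sketch.
    public static long effectiveTimeoutMillis(int clientTimeoutMillis, long configuredTimeoutMillis) {
        if (clientTimeoutMillis <= 0)
            throw new IllegalArgumentException("timeout must be positive: " + clientTimeoutMillis);
        // An unset timeout (Integer.MAX_VALUE) always loses to the configured one.
        return Math.min((long) clientTimeoutMillis, configuredTimeoutMillis);
    }
}
```

With a configured timeout of 5000 ms, a client value of 10 yields 10, while an unset client value falls back to 5000.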

 *Handling the optional timeout*

The server (coordinator) receives the request and deserializes the message together with its QueryOptions. QueryState is then created with the computed timings, including the timeout for this query. The query is processed and the coordinator waits for the duration given by the computed timeout in QueryState. The expiration of the internode messages is adjusted according to the timeout in QueryState, so that a message can be dropped properly once it has timed out.

QueryOptions contains all the options set by the client. The values of the options remain immutable.

QueryState stores all the server-side computed values. For example, it has the resolved timeout for the query. QueryState is now truly one per query as an outcome of patch CASSANDRA-14677.

{{CQLStatement}} gains a new method {{void resolveTimeout(QueryOptions options, QueryState state);}} in order to compute the proper timeout for the query at the start of query processing. The resolved timeout is stored in QueryState, which is passed along to the code that 1) awaits the response and 2) generates the internode messages. The method is implemented in the concrete CQL statements and is called at `QueryHandler#process`.
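
To make the split between the two objects concrete, here is a minimal sketch of how a statement might resolve the timeout. The types {{SketchOptions}}, {{SketchState}}, {{SketchStatement}} and {{SketchSelect}} are hypothetical stand-ins, not the real Cassandra classes, and storing the resolved value in nanoseconds is an assumption of the example.

```java
// Hypothetical stand-in for the immutable, client-supplied options.
final class SketchOptions {
    private final int timeoutMillis;
    SketchOptions(int timeoutMillis) { this.timeoutMillis = timeoutMillis; }
    int timeoutMillis() { return timeoutMillis; }
}

// Hypothetical stand-in for the per-query, server-computed state.
final class SketchState {
    private long timeoutNanos = -1L; // -1 means "not resolved yet"
    void setTimeoutNanos(long timeoutNanos) { this.timeoutNanos = timeoutNanos; }
    long timeoutNanos() { return timeoutNanos; }
}

// Mirrors the shape of the new method described above.
interface SketchStatement {
    void resolveTimeout(SketchOptions options, SketchState state);
}

// A read statement caps the client timeout at the configured read timeout.
final class SketchSelect implements SketchStatement {
    private final long readRpcTimeoutMillis;
    SketchSelect(long readRpcTimeoutMillis) { this.readRpcTimeoutMillis = readRpcTimeoutMillis; }

    @Override
    public void resolveTimeout(SketchOptions options, SketchState state) {
        long millis = Math.min((long) options.timeoutMillis(), readRpcTimeoutMillis);
        // The options stay untouched; only the state receives the computed value.
        state.setTimeoutNanos(java.util.concurrent.TimeUnit.MILLISECONDS.toNanos(millis));
    }
}
```

The point of the design is that the options are never mutated: everything computed on the server side lands in the state object that travels with the query.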

*Read path*
 * Single read: adjust the timeout, capped at `ReadRpcTimeout`
 * Range read: adjust the timeout, capped at `RangeRpcTimeout`
 * Blocking read repair: adjust the timeout of the repair mutation phase to the remainder of the read timeout
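
The "remainder of the read timeout" for blocking read repair can be pictured as simple deadline arithmetic. The helper below is a hypothetical sketch, not the patch's code; it assumes timestamps taken from a monotonic clock such as {{System.nanoTime()}} and clamps the result at zero.

```java
// Hypothetical helper: how much of the read timeout is left for the repair mutation phase.
final class RemainingTimeout {
    // startNanos: when the read started; timeoutNanos: resolved read timeout;
    // nowNanos: current reading of the same monotonic clock.
    public static long remainingNanos(long startNanos, long timeoutNanos, long nowNanos) {
        long elapsed = nowNanos - startNanos;
        // Never return a negative budget once the deadline has passed.
        return Math.max(0L, timeoutNanos - elapsed);
    }
}
```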

*Write path*
 * Counter mutation: adjust the timeout, capped at `CounterWriteRpcTimeout`
 * Normal mutation: adjust the timeout, capped at `WriteRpcTimeout`
 * CAS: adjust the timeout, capped at `WriteRpcTimeout`
 ** A separate timeout (CasContentionTimeout) tracks the contention time during CAS. It remains unchanged.
 * Batch
 ** Counter batch: see counter mutation
 ** CAS or normal: adjust the timeout, capped at `WriteRpcTimeout`
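
The read and write rules above boil down to a mapping from request kind to the configured cap. The sketch below only illustrates that mapping: the class {{TimeoutCaps}}, the enum {{Kind}} and the field names are hypothetical, and the numbers used in the usage note are arbitrary.

```java
// Hypothetical sketch of "pick the cap by request kind, then apply Math.min".
final class TimeoutCaps {
    enum Kind { SINGLE_READ, RANGE_READ, COUNTER_WRITE, WRITE, CAS, COUNTER_BATCH, BATCH }

    // Configured timeouts in milliseconds (stand-ins for the database settings).
    private final long readRpc, rangeRpc, writeRpc, counterWriteRpc;

    TimeoutCaps(long readRpc, long rangeRpc, long writeRpc, long counterWriteRpc) {
        this.readRpc = readRpc;
        this.rangeRpc = rangeRpc;
        this.writeRpc = writeRpc;
        this.counterWriteRpc = counterWriteRpc;
    }

    long capFor(Kind kind) {
        switch (kind) {
            case SINGLE_READ:   return readRpc;
            case RANGE_READ:    return rangeRpc;
            case COUNTER_WRITE:
            case COUNTER_BATCH: return counterWriteRpc; // counter batch follows counter mutation
            case WRITE:
            case CAS:
            case BATCH:         return writeRpc;        // CAS and normal batches share the write cap
            default: throw new IllegalArgumentException("unknown kind: " + kind);
        }
    }

    // Effective timeout in milliseconds for a request of the given kind.
    long resolve(Kind kind, int clientTimeoutMillis) {
        return Math.min((long) clientTimeoutMillis, capFor(kind));
    }
}
```

For instance, with {{new TimeoutCaps(5000, 10000, 2000, 5000)}}, a CAS request carrying a 100 ms client timeout resolves to 100, while a range read with no client timeout resolves to 10000.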

*Mixed-version compatibility*

The feature is available only in protocol V5. 

*_Client-Server compatibility_* 
| |Server V4|Server V5|
|Client V4|Good.|Good. The V4 client will not set the optional timeout. The V5 server uses the default value, Integer.MAX_VALUE.|
|Client V5|Good. But the client acts just like a V4 one.|Good.|

_*Server-Server compatibility*_
| |Server V4 (Replica)|Server V5 (Replica)|
|Server V4 (Coordinator)|Good.|Good.|
|Server V5 (Coordinator)|Good. The V5 coordinator updates the expiration. The V4 replica nodes honor it. |Good.|

*Integration with other components*

Full query logging: the optional timeout exists in QueryOptions. It is already encoded into the log entry properly. 

Audit logging: right now, a timed-out query is already logged by the audit logger as `REQUEST_FAILURE` together with the exception message. Is it worth additionally including the timeout in nanoseconds?

_Added a diagram to visually describe the patch_
https://gist.github.com/yifan-c/11780706d3666034f940f3476d37c5d8#file-cassandra-2848-per-request-timeout-svg



> Make the Client API support passing down timeouts
> -------------------------------------------------
>
>                 Key: CASSANDRA-2848
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2848
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Messaging/Client, Messaging/Internode
>            Reporter: Chris Goffinet
>            Assignee: Yifan Cai
>            Priority: Low
>              Labels: protocolv5
>             Fix For: 4.0-beta
>
>         Attachments: 2848-trunk-v2.txt, 2848-trunk.txt
>
>
> Having a max server RPC timeout is good for worst case, but many applications that have middleware in front of Cassandra, might have higher timeout requirements. In a fail fast environment, if my application starting at say the front-end, only has 20ms to process a request, and it must connect to X services down the stack, by the time it hits Cassandra, we might only have 10ms. I propose we provide the ability to specify the timeout on each call we do optionally.


