You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@kudu.apache.org by "Michael Ho (JIRA)" <ji...@apache.org> on 2017/07/11 20:29:00 UTC

[jira] [Comment Edited] (KUDU-2065) Support cancellation for outbound client RPC

    [ https://issues.apache.org/jira/browse/KUDU-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16082902#comment-16082902 ] 

Michael Ho edited comment on KUDU-2065 at 7/11/17 8:28 PM:
-----------------------------------------------------------

bq. I'm skeptical that this complexity is warranted, have you gathered any statistics on how often outbound transfers are paused during writes and rescheduled? This isn't going to save any bandwidth or buffer-usage on the server side (only potentially client-side). My guess is that the TCP buffers involved are big enough that interrupting transfers happens rarely. Without support for cancellation of in-transfer calls, there's no need for changing the RPC protocol at all, it could be a completely client-side change.

I am pretty new to Kudu so I don't have a good intuition about how likely that will occur. I definitely agree that we should just implement something simpler for the first iteration. In which case, the modification to the proposal would be for interrupted outbound call in {{SENDING}} state to do similar things as in {{SENT}} state: invoke the callback right after all payload has been sent. Ignore any response from the server. Do you have any opinion on whether clearing {{sidecars_}} in {{OutboundCall::SetCancelled()}} is a bad idea ? The goal is by the time the callback happens after a client has called {{RpcController::Cancel()}}, the client code can assume that RPC subsystem should no longer hold references to the sidecar passed to it.

Note that Impala makes the assumption that cancellation of a query will make it to all Impalad nodes within approximately the same time so the remote end of an RPC call will also get a cancellation request too.


was (Author: kwho):
bq. I'm skeptical that this complexity is warranted, have you gathered any statistics on how often outbound transfers are paused during writes and rescheduled? This isn't going to save any bandwidth or buffer-usage on the server side (only potentially client-side). My guess is that the TCP buffers involved are big enough that interrupting transfers happens rarely. Without support for cancellation of in-transfer calls, there's no need for changing the RPC protocol at all, it could be a completely client-side change.

I am pretty new to Kudu so I don't have a good intuition about how likely that will occur. I definitely agree that we should just implement something simpler for the first iteration. In which case, the modification to the proposal would be for interrupted outbound call in {{SENDING}} state to do similar things as in {{SENT}} state: invoke the callback right after all payload has been sent. Ignore any response from the server. Do you have any opinion on whether clearing {{sidecars_}} in {{OutboundCall::SetCancelled()}} is a bad idea ? The goal is by the time the callback happens after a client has called {{RpcController::Cancel()}}, the client code can assume that RPC subsystem should no longer hold references to the sidecar passed to it.

Note that Impala makes the assumption that cancellation of a query will make it to all Impalad nodes within approximately the same time so it's okay for the remote end of an RPC call will also get a cancellation request too.

> Support cancellation for outbound client RPC
> --------------------------------------------
>
>                 Key: KUDU-2065
>                 URL: https://issues.apache.org/jira/browse/KUDU-2065
>             Project: Kudu
>          Issue Type: Improvement
>          Components: rpc
>            Reporter: Michael Ho
>            Priority: Minor
>
> Currently, there is no way to cancel an outbound client RPC call in Kudu. The following is a proposal to do so. Please feel free to comment on it.
> A new interface {{void RpcController::Cancel()}} will be introduced. It enqueues a reactor task which will eventually call {{void OutboundCall::Cancel()}}. A new RpcFeature flag {{CANCELLABLE}} will be added to indicate whether the server can handle in-flight RPC which is cancelled. More details below. A client can specify whether such functionality
> is needed by setting a bool flag in {{RpcController}} passed to the proxy.
> Depending on the state of an OutboundCall, cancellation will happen as follows:
> * READY
> ** it hasn't been scheduled yet. Set the cancellation flag in the OutboundCall object. When it's eventually scheduled by {{Connection::QueueOutboundCall()}}, it will check the cancellation flag before assigining a {{call_id}}. If it's set, call {{OutboundCall::SetCancelled()}} and return.
> * ON_OUTBOUND_QUEUE
> ** it's on the outbound transfer queue but transfer hasn't started yet.
> {{Connection::WriteHandler()}} will check the cancellation flag before initiating a transfer. If it's set, the transfer will be popped from the queue and deleted. Call {{OutboundCall::SetCancelled()}} and return.
> * SENDING
> ** some of the payload has already made its way to the other side. To make sure the outbound call doesn't hold on to {{sidecars_}} till the end of the transfer, the outbound call needs to clear {{sidecars_}} and sends the remaining bytes as 0. The entry in CAR map will be removed and {{OutboundCall::SetCancelled()}} will be invoked. Please see below on how the server will handle this incomplete RPC message.
> * SENT
> ** The payload has been sent. Waiting for a response. Call {{OutboundCall::SetCancelled()}}. Incoming response will be dropped on the floor.
> * NEGOTIATION_TIMED_OUT
> * TIMED_OUT
> * CANCELLED
> * FINISHED_NEGOTIATION_ERROR
> * FINISHED_ERROR
> * FINISHED_SUCCESS
> ** No-op. Callback has been invoked already.
> {{OutboundCall::SetCancelled()}} will mark {{status_}} to {{Cancelled}}. Set the state to {{CANCELLED}}. In addition, it will clear {{side_cars_}}, delete {{header_buf_}} and invoke the callback.
> When an OutboundCall already in the {{SENDING}} state is cancelled, {{sidecars_}} will be cleared. This provides guarantee to the RPC client that the RPC subsystem doesn't have any reference left to payload pointed by {{sidecars_}} so the RPC client can safely free them if it's passed in as a SliceSideCar. It's freed immediately if it's passed in as FastStringSideCar. An in-flight RPC will be cancelled by sending the remainder of the payload (encoded in total message length) as 0. An optional bool field is added to the request header. If it's set, it's expected that a single byte at the end which is 1 if the payload has been cancelled when it was in-flight. Potentially, we can do the same treatment for client RPC which times out already.
> {noformat}
> +------------------------------------------------+
> | Total message length (4 bytes)                 |
> +------------------------------------------------+
> | RPC Header protobuf length (variable encoding) |
> +------------------------------------------------+
> | RPC Header protobuf                            |
> +------------------------------------------------+
> | Main message length (variable encoding)        |
> +------------------------------------------------+ --- 0
> | Main message protobuf                          |
> +------------------------------------------------+ --- sidecar_offsets(0)
> | Sidecar 0                                      |
> +------------------------------------------------+ --- sidecar_offsets(1)
> | Sidecar 1                                      |
> +------------------------------------------------+ --- sidecar_offsets(2)
> | Sidecar 2                                      |
> +------------------------------------------------+ --- ...
> | ...                                            |
> +------------------------------------------------+
> | cancelled                                      | -- true if RPC has been cancelled mid-flight
> +------------------------------------------------+
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)