Posted to issues@kudu.apache.org by "Adar Dembo (Jira)" <ji...@apache.org> on 2019/10/09 00:33:00 UTC

[jira] [Commented] (KUDU-2966) Make client negotiation timeouts configurable

    [ https://issues.apache.org/jira/browse/KUDU-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947274#comment-16947274 ] 

Adar Dembo commented on KUDU-2966:
----------------------------------

I forgot to mention how the delay manifested. In the Java client's traces:
{noformat}
...
[10039ms] querying master
[10040ms] Sub rpc: ConnectToMaster sending RPC to server master-cdhmn002.mydomain.local:7051
[10040ms] Sub rpc: ConnectToMaster sending RPC to server master-cdhmn004.mydomain.local:7051
[10040ms] Sub rpc: ConnectToMaster sending RPC to server master-cdhmn005.mydomain.local:7051
[10050ms] Sub rpc: ConnectToMaster received from server master-cdhmn002.mydomain.local:7051 response OK
[10050ms] Sub rpc: ConnectToMaster received from server master-cdhmn005.mydomain.local:7051 response OK
[20060ms] Sub rpc: ConnectToMaster received from server master-cdhmn004.mydomain.local:7051 response Network error: [peer master-cdhmn004.mydomain.local:7051] encountered a read timeout; closing the channel
...
{noformat}

And in the C++ client:
{noformat}
W1005 08:37:49.847681 1969583 negotiation.cc:313] Failed RPC negotiation. Trace:
1005 08:37:46.846727 (+     0us) reactor.cc:583] Submitting negotiation task for client connection to 172.22.152.82:7050
1005 08:37:46.880187 (+ 33460us) negotiation.cc:98] Waiting for socket to connect
1005 08:37:46.880194 (+     7us) client_negotiation.cc:168] Beginning negotiation
1005 08:37:46.880212 (+    18us) client_negotiation.cc:245] Sending NEGOTIATE NegotiatePB request
1005 08:37:46.880378 (+   166us) client_negotiation.cc:262] Received NEGOTIATE NegotiatePB response
1005 08:37:46.880379 (+     1us) client_negotiation.cc:356] Received NEGOTIATE response from server
1005 08:37:46.880383 (+     4us) client_negotiation.cc:183] Negotiated authn=SASL
1005 08:37:46.880427 (+    44us) client_negotiation.cc:472] Sending TLS_HANDSHAKE message to server
1005 08:37:46.880428 (+     1us) client_negotiation.cc:245] Sending TLS_HANDSHAKE NegotiatePB request
1005 08:37:46.882796 (+  2368us) client_negotiation.cc:262] Received TLS_HANDSHAKE NegotiatePB response
1005 08:37:46.882797 (+     1us) client_negotiation.cc:485] Received TLS_HANDSHAKE response from server
1005 08:37:46.886664 (+  3867us) client_negotiation.cc:472] Sending TLS_HANDSHAKE message to server
1005 08:37:46.886666 (+     2us) client_negotiation.cc:245] Sending TLS_HANDSHAKE NegotiatePB request
1005 08:37:49.847411 (+2960745us) negotiation.cc:304] Negotiation complete: Network error: Client connection negotiation failed: client connection to 172.22.152.82:7050: BlockingRecv error: recv got EOF from 172.22.152.82:7050 (error 108)
Metrics: {"client-negotiator.queue_time_us":33440}
{noformat}
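
The waits in both traces line up with the hardcoded client negotiation timeouts cited in the issue description (10s for Java, 3s for C++). Below is a minimal sketch of that arithmetic, using only timestamps copied from the traces above. The helper name elapsed_us is hypothetical, and the comparison assumes each client starts its timer roughly when the request is submitted, which the traces do not confirm.

```python
from datetime import datetime, timedelta

# Hardcoded client negotiation timeouts, as stated in the issue description.
JAVA_TIMEOUT_MS = 10_000
CPP_TIMEOUT_US = 3_000_000


def elapsed_us(start: str, end: str) -> int:
    """Microseconds between two same-day HH:MM:SS.ffffff timestamps."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta // timedelta(microseconds=1)


# Java trace: ConnectToMaster sent to master-cdhmn004 at 10040ms,
# read timeout reported at 20060ms.
java_wait_ms = 20060 - 10040

# C++ trace: negotiation task submitted at 08:37:46.846727,
# "Negotiation complete" (with the EOF error) at 08:37:49.847411.
cpp_wait_us = elapsed_us("08:37:46.846727", "08:37:49.847411")

print(f"Java: waited {java_wait_ms}ms, "
      f"{java_wait_ms - JAVA_TIMEOUT_MS}ms over the 10s timeout")
print(f"C++:  waited {cpp_wait_us}us, "
      f"{cpp_wait_us - CPP_TIMEOUT_US}us over the 3s timeout")
```

In both cases the overshoot is tiny compared with the timeout itself, which is consistent with the client giving up at its fixed deadline rather than the server answering late.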

> Make client negotiation timeouts configurable
> ---------------------------------------------
>
>                 Key: KUDU-2966
>                 URL: https://issues.apache.org/jira/browse/KUDU-2966
>             Project: Kudu
>          Issue Type: Bug
>          Components: java, rpc
>    Affects Versions: 1.11.0
>            Reporter: Adar Dembo
>            Priority: Major
>
> We saw a cluster in the wild where some negotiation steps between endpoints were delayed by a few seconds. The existing {{\-\-rpc_negotiation_timeout_ms}} gflag can help work around this on servers, but there's no equivalent in clients, whose negotiation timeouts are hardcoded to 3s in the C++ client and 10s in the Java client.
> It would be nice to expose a simple API to reconfigure the negotiation timeout.


