Posted to reviews@impala.apache.org by "Michael Ho (Code Review)" <ge...@cloudera.org> on 2017/06/20 21:58:36 UTC

[Impala-ASF-CR] IMPALA-5537: Retry RPC on some exceptions with SSL connection

Michael Ho has uploaded a new patch set (#2).

Change subject: IMPALA-5537: Retry RPC on some exceptions with SSL connection
......................................................................

IMPALA-5537: Retry RPC on some exceptions with SSL connection

After the fix for IMPALA-5388, every TSSLException thrown is
treated as a fatal error and the query fails. This turns out to
be too strict: in a secure cluster under load, queries can easily
hit a timeout while waiting for an RPC response.

When running without SSL, we call RetryRpcRecv() to retry the recv
part of an RPC if the TSocket underlying the RPC gets an EAGAIN
during recv(). This change extends that logic to cover secure
connections. In particular, we pattern match against the exception
string "SSL_read: Resource temporarily unavailable", which corresponds
to the EAGAIN error code being thrown in the SSL_read() path.
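
For illustration only, the string match described above could look
roughly like the sketch below. The helper name IsSslRecvTimeoutMessage
is hypothetical and is not taken from the patch; only the matched
substring comes from this commit message.

```cpp
#include <string>

// Hypothetical helper (name is an assumption, not from the patch).
// Returns true if the exception text indicates that SSL_read() hit
// EAGAIN, i.e. the recv part of the RPC timed out and may be retried.
bool IsSslRecvTimeoutMessage(const std::string& what) {
  // Substring quoted in the commit message.
  static const std::string kSslEagain =
      "SSL_read: Resource temporarily unavailable";
  return what.find(kSslEagain) != std::string::npos;
}
```

A caller would pass the text of the caught exception (for example
e.what()) and fall back to treating the error as fatal when the helper
returns false.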

Similarly, we handle a closed connection in the send() path of a
secure connection by pattern matching against the exception string
"TTransportException: Transport not open". To verify that the exception
was thrown during the send part of an RPC call, the RPC client interface
has been augmented to take a bool* argument which is set to true after
the send part of the RPC has completed but before the recv part starts.
If DoRPC() catches an exception and the send part isn't done yet, the
entire RPC is retried if the exception string matches certain substrings
which are safe to retry.
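
A minimal sketch of that control flow, under stated assumptions:
DoRpcWithRetry, IsSafeToRetrySend and the std::function parameters are
hypothetical stand-ins, not the actual client-cache interface, and the
sketch retries at most once and reports success as a bool rather than
a Status.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical check: does the exception text match a substring that
// is safe to retry while the request has not been fully sent?
bool IsSafeToRetrySend(const std::string& what) {
  return what.find("TTransportException: Transport not open") !=
         std::string::npos;
}

// 'rpc' performs send then recv and sets *send_done = true in between.
// 'reopen' re-establishes the connection before the retry.
// Returns true if the RPC eventually succeeded.
bool DoRpcWithRetry(const std::function<void(bool*)>& rpc,
                    const std::function<void()>& reopen) {
  bool send_done = false;
  try {
    rpc(&send_done);
    return true;
  } catch (const std::exception& e) {
    // Once the send has completed the server may already be acting on
    // the request, so resending the whole RPC is not safe here.
    if (send_done || !IsSafeToRetrySend(e.what())) return false;
  }
  // The send did not complete and the error is retryable: reopen the
  // connection and issue the entire RPC once more.
  reopen();
  send_done = false;
  try {
    rpc(&send_done);
    return true;
  } catch (const std::exception&) {
    return false;
  }
}
```

The flag is what separates "the request never left" from "the request
may have been delivered"; only the former is retried as a whole RPC in
this sketch.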

The fault injection utility has also been updated to distinguish between
timeouts and lost connections, so that the different error handling paths
in the send and recv paths can be exercised.
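
As a rough illustration of the idea (not the code in
fault-injection-util.h), an injector could map each fault kind to the
exception text that the retry logic matches against. The enum, the
function names and the use of std::runtime_error are all assumptions
made for this sketch.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical fault kinds; the real utility's names may differ.
enum class InjectedFault { kNone, kTimeout, kLostConnection };

// Returns the exception text to simulate for a fault, using the two
// strings quoted in this commit message. Empty when nothing is injected.
std::string InjectedFaultMessage(InjectedFault fault) {
  switch (fault) {
    case InjectedFault::kTimeout:
      return "SSL_read: Resource temporarily unavailable";
    case InjectedFault::kLostConnection:
      return "TTransportException: Transport not open";
    case InjectedFault::kNone:
      break;
  }
  return "";
}

// Throws a stand-in exception carrying the simulated error text, or
// returns normally when no fault is requested.
void MaybeInjectFault(InjectedFault fault) {
  if (fault == InjectedFault::kNone) return;
  throw std::runtime_error(InjectedFaultMessage(fault));
}
```

Calling such a hook just before the send or just before the recv lets a
test drive the closed-connection and timeout branches separately.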

Change-Id: I8243d4cac93c453e9396b0e24f41e147c8637b8c
---
A be/src/catalog/catalog-service-client-wrapper.h
M be/src/exec/catalog-op-executor.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-util.cc
M be/src/runtime/backend-client.h
M be/src/runtime/client-cache-types.h
M be/src/runtime/client-cache.h
M be/src/service/client-request-state.cc
A be/src/statestore/statestore-service-client-wrapper.h
A be/src/statestore/statestore-subscriber-client-wrapper.h
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/testutil/fault-injection-util.cc
M be/src/testutil/fault-injection-util.h
M tests/custom_cluster/test_rpc_exception.py
17 files changed, 375 insertions(+), 74 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/7229/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7229

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8243d4cac93c453e9396b0e24f41e147c8637b8c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sa...@cloudera.com>