You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Sahil Takiar (Code Review)" <ge...@cloudera.org> on 2019/09/17 17:24:37 UTC

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Sahil Takiar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14246


Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................

IMPALA-8634: Catalog client should retry RPCs

Add retries to catalogd RPCs. Previously, connection failures triggered
a retry, but failures on the actual RPC did not trigger a retry. This
change replaces all usages of ClientCache::DoRpc() in the
CatalogOpExecutor with ClientCache::DoRpcWithRetry(). This change moves
the connection retry loop to DoRpcWithRetry(), instead of relying on the
ClientCache to retry the connection.

This patch is based to IMPALA-8904, which adds similar functionality to
statestore RPCs.

Testing:
* Renamed test_statestore_rpc_errors.py to test_services_rpc_errors.py
and added new tests for catalogd RPC errors
* Added new tests to test_restart_services.py
* Ran core tests

Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
---
M be/src/exec/catalog-op-executor.cc
M be/src/runtime/exec-env.cc
M tests/custom_cluster/test_restart_services.py
A tests/custom_cluster/test_services_rpc_errors.py
D tests/custom_cluster/test_statestore_rpc_errors.py
5 files changed, 226 insertions(+), 84 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/14246/1
-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/exec/catalog-op-executor.cc
File be/src/exec/catalog-op-executor.cc:

http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/exec/catalog-op-executor.cc@62
PS1, Line 62: static Status CatalogRpcDebugFn(int& attempt) {
nit: pass as a pointer


http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc@142
PS1, Line 142: DEFINE_int32(catalog_client_connection_num_retries, 3, "The number of times connections "
I kinda wonder if we should tweak the defaults to make it poll more frequently and/or longer. Every 10 seconds seems kinda long if the outage might be intermittent flakiness.

30 seconds also might not be long enough for the catalogd to recover (although I can see we might just want to fail queries if they're delayed 30+ seconds).



-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Sep 2019 20:43:10 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14246/4/tests/custom_cluster/test_services_rpc_errors.py
File tests/custom_cluster/test_services_rpc_errors.py:

http://gerrit.cloudera.org:8080/#/c/14246/4/tests/custom_cluster/test_services_rpc_errors.py@80
PS4, Line 80: Validte
> typo
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Sep 2019 22:50:06 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/4576/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Sep 2019 18:04:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Hello Michael Ho, Tim Armstrong, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14246

to look at the new patch set (#3).

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................

IMPALA-8634: Catalog client should retry RPCs

Add retries to catalogd RPCs. Previously, connection failures triggered
a retry, but failures on the actual RPC did not trigger a retry. This
change replaces all usages of ClientCache::DoRpc() in the
CatalogOpExecutor with ClientCache::DoRpcWithRetry(). This change moves
the connection retry loop to DoRpcWithRetry(), instead of relying on the
ClientCache to retry the connection.

This patch is based to IMPALA-8904, which adds similar functionality to
statestore RPCs.

Testing:
* Renamed test_statestore_rpc_errors.py to test_services_rpc_errors.py
and added new tests for catalogd RPC errors
* Added new tests to test_restart_services.py
* Ran core tests

Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
---
M be/src/exec/catalog-op-executor.cc
M be/src/runtime/exec-env.cc
M tests/custom_cluster/test_restart_services.py
A tests/custom_cluster/test_services_rpc_errors.py
D tests/custom_cluster/test_statestore_rpc_errors.py
5 files changed, 227 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/14246/3
-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 5: Code-Review+2

Carrying +2


-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 5
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Sep 2019 22:50:35 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 6: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 6
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 19 Sep 2019 18:54:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 1: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 17 Sep 2019 20:47:54 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/exec/catalog-op-executor.cc
File be/src/exec/catalog-op-executor.cc:

http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/exec/catalog-op-executor.cc@62
PS1, Line 62: static Status CatalogRpcDebugFn(int* attempt) {
> nit: pass as a pointer
Done


http://gerrit.cloudera.org:8080/#/c/14246/2/be/src/exec/catalog-op-executor.cc
File be/src/exec/catalog-op-executor.cc:

http://gerrit.cloudera.org:8080/#/c/14246/2/be/src/exec/catalog-op-executor.cc@63
PS2, Line 63:   return (*attempt)++ == 0 ? DebugAction(FLAGS_debug_actions, "CATALOG_RPC_FIRST_ATTEMPT") :
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc@142
PS1, Line 142: DEFINE_int32(catalog_client_connection_num_retries, 3, "The number of times connections "
> I kinda wonder if we should tweak the defaults to make it poll more frequen
Yeah, 10 seconds does seem a little long. How about 10 retries, with an interval of 3 seconds? This way it still waits 30 seconds, but it just retries more often.



-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Sep 2019 01:41:13 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc@142
PS1, Line 142: DEFINE_int32(catalog_client_connection_num_retries, 3, "The number of times connections "
> Yeah, 10 seconds does seem a little long. How about 10 retries, with an int
Seems reasonable to me.


http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc@144
PS1, Line 144: 0
We may want to consider setting this timeout to non-zero at some point. When we convert the Catalog clients to use KRPC, we may be able to rely on using TCP_USER_TIMEOUT which seems more appropriate than socket timeout.


http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc@180
PS1, Line 180: 1, 0
May help to name these constants as variables.

IIUC, this change relies on the sleep and retry  in the retry loop of DoRpcWithRetry() so simply don't try recreating new connections on failure now.



-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Sep 2019 06:52:15 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/4594/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 5
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Sep 2019 23:32:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc@142
PS1, Line 142: DEFINE_int32(catalog_client_connection_num_retries, 3, "The number of times connections "
> Seems reasonable to me.
Done, interestingly 10 retries and a 3 second interval is what the statestore client is already using.


http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc@144
PS1, Line 144: 0
> We may want to consider setting this timeout to non-zero at some point. Whe
Sure, this patch or a separate one? Not sure what an appropriate default for this would be.


http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc@180
PS1, Line 180: 1, 0
> May help to name these constants as variables.
I added some code comments which I think makes it clearer than introducing variables like `catalog_client_connection_num_retries` and `catalog_client_rpc_retry_interval_ms`.

It still tries to recreate new connections, but the retry is driven by the DoRpcWithRetry() loop rather than the loop in ThriftClientImpl::OpenWithRetry.

The loop in DoRpcWithRetry() first tries to create a new ClientConnection, that will trigger a call to ClientCacheHelper::GetClient, which will either try to create a new connection or used a cached one. If it tries to get a new connection and the catalogd is down, the connection will fail, but the next iteration of the loop in DoRpcWithRetry() will retry to establish the connection. If GetClient finds a cached connection, the RPC will fail, and again the loop will trigger a retry.



-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Sep 2019 19:53:32 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Hello Michael Ho, Tim Armstrong, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14246

to look at the new patch set (#5).

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................

IMPALA-8634: Catalog client should retry RPCs

Add retries to catalogd RPCs. Previously, connection failures triggered
a retry, but failures on the actual RPC did not trigger a retry. This
change replaces all usages of ClientCache::DoRpc() in the
CatalogOpExecutor with ClientCache::DoRpcWithRetry(). This change moves
the connection retry loop to DoRpcWithRetry(), instead of relying on the
ClientCache to retry the connection.

This patch is based to IMPALA-8904, which adds similar functionality to
statestore RPCs.

Testing:
* Renamed test_statestore_rpc_errors.py to test_services_rpc_errors.py
and added new tests for catalogd RPC errors
* Added new tests to test_restart_services.py
* Ran core tests

Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
---
M be/src/exec/catalog-op-executor.cc
M be/src/runtime/exec-env.cc
M tests/custom_cluster/test_restart_services.py
A tests/custom_cluster/test_services_rpc_errors.py
D tests/custom_cluster/test_statestore_rpc_errors.py
5 files changed, 232 insertions(+), 85 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/14246/5
-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 5
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Hello Michael Ho, Tim Armstrong, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14246

to look at the new patch set (#4).

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................

IMPALA-8634: Catalog client should retry RPCs

Add retries to catalogd RPCs. Previously, connection failures triggered
a retry, but failures on the actual RPC did not trigger a retry. This
change replaces all usages of ClientCache::DoRpc() in the
CatalogOpExecutor with ClientCache::DoRpcWithRetry(). This change moves
the connection retry loop to DoRpcWithRetry(), instead of relying on the
ClientCache to retry the connection.

This patch is based to IMPALA-8904, which adds similar functionality to
statestore RPCs.

Testing:
* Renamed test_statestore_rpc_errors.py to test_services_rpc_errors.py
and added new tests for catalogd RPC errors
* Added new tests to test_restart_services.py
* Ran core tests

Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
---
M be/src/exec/catalog-op-executor.cc
M be/src/runtime/exec-env.cc
M tests/custom_cluster/test_restart_services.py
A tests/custom_cluster/test_services_rpc_errors.py
D tests/custom_cluster/test_statestore_rpc_errors.py
5 files changed, 232 insertions(+), 85 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/14246/4
-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/14246/1/be/src/runtime/exec-env.cc@144
PS1, Line 144: 0
> Sure, this patch or a separate one? Not sure what an appropriate default fo
A separate one makes sense. Please also see https://issues.apache.org/jira/browse/KUDU-2192



-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Sep 2019 22:33:59 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 6: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 6
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 19 Sep 2019 23:07:54 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/4588/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Sep 2019 20:32:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Sahil Takiar (Code Review)" <ge...@cloudera.org>.
Hello Michael Ho, Tim Armstrong, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14246

to look at the new patch set (#2).

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................

IMPALA-8634: Catalog client should retry RPCs

Add retries to catalogd RPCs. Previously, connection failures triggered
a retry, but failures on the actual RPC did not trigger a retry. This
change replaces all usages of ClientCache::DoRpc() in the
CatalogOpExecutor with ClientCache::DoRpcWithRetry(). This change moves
the connection retry loop to DoRpcWithRetry(), instead of relying on the
ClientCache to retry the connection.

This patch is based to IMPALA-8904, which adds similar functionality to
statestore RPCs.

Testing:
* Renamed test_statestore_rpc_errors.py to test_services_rpc_errors.py
and added new tests for catalogd RPC errors
* Added new tests to test_restart_services.py
* Ran core tests

Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
---
M be/src/exec/catalog-op-executor.cc
M be/src/runtime/exec-env.cc
M tests/custom_cluster/test_restart_services.py
A tests/custom_cluster/test_services_rpc_errors.py
D tests/custom_cluster/test_statestore_rpc_errors.py
5 files changed, 226 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/14246/2
-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4968/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 6
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Thu, 19 Sep 2019 18:54:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/4582/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Sep 2019 02:20:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................

IMPALA-8634: Catalog client should retry RPCs

Add retries to catalogd RPCs. Previously, connection failures triggered
a retry, but failures on the actual RPC did not trigger a retry. This
change replaces all usages of ClientCache::DoRpc() in the
CatalogOpExecutor with ClientCache::DoRpcWithRetry(). This change moves
the connection retry loop to DoRpcWithRetry(), instead of relying on the
ClientCache to retry the connection.

This patch is based to IMPALA-8904, which adds similar functionality to
statestore RPCs.

Testing:
* Renamed test_statestore_rpc_errors.py to test_services_rpc_errors.py
and added new tests for catalogd RPC errors
* Added new tests to test_restart_services.py
* Ran core tests

Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Reviewed-on: http://gerrit.cloudera.org:8080/14246
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exec/catalog-op-executor.cc
M be/src/runtime/exec-env.cc
M tests/custom_cluster/test_restart_services.py
A tests/custom_cluster/test_services_rpc_errors.py
D tests/custom_cluster/test_statestore_rpc_errors.py
5 files changed, 232 insertions(+), 85 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 7
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Michael Ho (Code Review)" <ge...@cloudera.org>.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14246/4/tests/custom_cluster/test_services_rpc_errors.py
File tests/custom_cluster/test_services_rpc_errors.py:

http://gerrit.cloudera.org:8080/#/c/14246/4/tests/custom_cluster/test_services_rpc_errors.py@80
PS4, Line 80: Validte
typo



-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Sep 2019 22:35:10 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/4581/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Sep 2019 02:20:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8634: Catalog client should retry RPCs

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14246 )

Change subject: IMPALA-8634: Catalog client should retry RPCs
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14246/2/be/src/exec/catalog-op-executor.cc
File be/src/exec/catalog-op-executor.cc:

http://gerrit.cloudera.org:8080/#/c/14246/2/be/src/exec/catalog-op-executor.cc@63
PS2, Line 63:   return (*attempt)++ == 0 ? DebugAction(FLAGS_debug_actions, "CATALOG_RPC_FIRST_ATTEMPT") :
line too long (92 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/14246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f33ad2b36d301fb64e70a939e71decab0ca993c
Gerrit-Change-Number: 14246
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Michael Ho <kw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Sep 2019 01:39:58 +0000
Gerrit-HasComments: Yes