You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2018/11/28 08:27:35 UTC

[kudu-CR] client: unify leader master retry logic

Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12002


Change subject: client: unify leader master retry logic
......................................................................

client: unify leader master retry logic

In preparation for implementing an asynchronous RPC that will need to
connect to the leader master, I've been looking through some of the
retry logic we have scattered around the C++ client. I noticed
near-identical code in a couple places (LookupRpc and
SyncLeaderMasterRpc), so I've ripped it out and put it in its own
reusable method.

The behavior is mostly the same as before with a few exceptions:
- RPC errors in that result in OK Statuses are evaluated up-front and a
  retry decision is based off of the error. It wasn't clear why this
  wasn't always the case before, since presumably we always want to
  handle RPC errors.
- GetTableLocations, on TimedOut error, will now retry for the
  single-master case.
- GetTableLocations, on ServiceUnavailable error, will not attempt to
  reconnect to the leader master, and instead, will simply retry.

Change-Id: I2450676da1c723a247c84deb1b895f116173670e
---
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/meta_cache.cc
4 files changed, 146 insertions(+), 118 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/12002/1
-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 8: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12002/8/src/kudu/client/client.cc
File src/kudu/client/client.cc:

http://gerrit.cloudera.org:8080/#/c/12002/8/src/kudu/client/client.cc@447
PS8, Line 447:   AsyncLeaderMasterRpc<ListTabletServersRequestPB, ListTabletServersResponsePB> rpc(
Surprised you need the template args; can't they be inferred from 'req' and 'resp'?


http://gerrit.cloudera.org:8080/#/c/12002/8/src/kudu/client/master_proxy_rpc.h
File src/kudu/client/master_proxy_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/8/src/kudu/client/master_proxy_rpc.h@24
PS8, Line 24: #include <boost/function.hpp> // IWYU pragma: keep
Could you get by with boost/function/function_fwd.hpp?



-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 21 Dec 2018 19:33:37 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 6:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/client-internal.cc@159
PS6, Line 159: Status KuduClient::Data::SyncLeaderMasterRpc(
How much value does this add now? Seems like a lot of work to marshal/unmarshal args; perhaps eliminating it and forcing callers to copy the 4 lines dealing with the synchronizer would be cleaner?


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h
File src/kudu/client/master_proxy_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h@30
PS6, Line 30: namespace boost {
            : template <typename Signature>
            : class function;
            : } // namespace boost
Interesting; I thought there'd be a boost forward-only header to include.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h@62
PS6, Line 62:   // 'user_cb' will be called on the final result of the RPC (either OK,
            :   // TimedOut, or some other non-retriable error).
This should probably say something about its effects on 'resp'.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h@85
PS6, Line 85:   // Resets the contents of the RPC before sending it out.
As written, this sounds like it'll send some sort of "empty" RPC on send.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h@113
PS6, Line 113:   virtual bool RetryOrReconnectIfNecessary(Status* status);
             : 
             :   // Attempts to reconnect with the masters and find the leader master, and
             :   // attempts to retry the RPC.
             :   virtual void ResetMasterLeaderAndRetry(rpc::CredentialsPolicy creds_policy);
             : 
             :   // With a new leader found, resends the RPC. 'creds_policy' is the policy
             :   // with which the reconnection was attempted.
             :   virtual void NewLeaderMasterDeterminedCb(rpc::CredentialsPolicy creds_policy,
             :                                            const Status& status);
Must all of these be virtual? Only ResetMasterLeaderAndRetry is overridden AFAICT.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h@125
PS6, Line 125:   const ReqClass& req_;
Style nit: we don't ever use refs as class members. Change this to a pointer.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.cc
File src/kudu/client/master_proxy_rpc.cc:

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.cc@125
PS6, Line 125: this->
Nit: can drop?

Below too.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.cc@196
PS6, Line 196:   RpcRetrier* mutable_retrier = this->mutable_retrier();
Nit: just 'retrier' is probably enough.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/meta_cache.cc
File src/kudu/client/meta_cache.cc:

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/meta_cache.cc@560
PS6, Line 560:   void SendRpc() override;
I found understanding this vis a vis AsyncLeaderMasterRpc.SendRpc to be extremely difficult. Likewise for SendRpcCb, though it's not nearly as bad because it doesn't chain to the superclass.

The only alternative that comes to mind is to further parameterize AsyncLeaderMasterRpc such that it calls a functor (or a PreSendRpc() member function) prior to actually sending the RPC in SendRpc. You'd still be bouncing up and down the class hierarchy in order to understand how it all works, but at least you wouldn't be dealing with overridden functions and superclass chaining.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/meta_cache.cc@1006
PS6, Line 1006:   // Make the retry for a LookupCall call LookupTabletByKey()
Not understanding this comment; where is LookupCall?



-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 18 Dec 2018 01:22:57 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12002

to look at the new patch set (#6).

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................

KUDU-683: unify C++ client-to-master retry logic

In preparation for implementing an asynchronous RPC that will need to
connect to the leader master, I've been looking through some of the
retry logic we have scattered around the C++ client. I noticed
near-identical code in a couple places (LookupRpc and
SyncLeaderMasterRpc), so I've ripped it out and put it in its own
reusable class.

A few things are bundled into this patch. The big pieces are:
- A new AsyncLeaderMasterRpc template class is introduced that sends
  RPCs to the leader master and handles reconnection and retries upon
  failure. LookupRpcs and SyncLeaderMasterRpc calls will now use this
  template to perform their RPCs.
- In implementing the template, I unified the retry logic to be used for
  the above existing RPCs; the bulk of it was overlapped, but some logic
  existed specifically for LookupRpc. See the comment above
  RetryOrReconnectIfNecessary in master_proxy_rpc.h and the impl for
  LookupRpc::SendRpcCb in meta_cache.cc for more details. For the most
  part, there aren't functional changes.
- SyncLeaderMasterRpcs used exponential backoff when retrying RPCs, but
  the existing RpcRetrier implemented its own backoff. To maintain
  this, I've piped down an enum from Rpc to RpcRetrier to determine
  which backoff strategy to use.
- The new template and most other existing Rpc implementations
  duplicated code from Rpc::HandleResponse() to handle certain remote
  errors. As such, I've removed HandleResponse entirely and added a TODO
  to perhaps merge existing logic even further (see retriable_rpc.h).

Change-Id: I2450676da1c723a247c84deb1b895f116173670e
---
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
A src/kudu/client/master_proxy_rpc.cc
A src/kudu/client/master_proxy_rpc.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/integration-tests/security-unknown-tsk-itest.cc
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.cc
M src/kudu/rpc/rpc.h
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_master.cc
M src/kudu/tools/tool_action_tablet.cc
M src/kudu/tools/tool_action_tserver.cc
22 files changed, 727 insertions(+), 442 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/12002/6
-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12002

to look at the new patch set (#11).

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................

KUDU-683: unify C++ client-to-master retry logic

In preparation for implementing an asynchronous RPC that will need to
connect to the leader master, I've been looking through some of the
retry logic we have scattered around the C++ client. I noticed
near-identical code in a couple places (LookupRpc and
SyncLeaderMasterRpc), so I've ripped it out and put it in its own
reusable class.

A few things are bundled into this patch. The big pieces are:
- A new AsyncLeaderMasterRpc template class is introduced that sends
  RPCs to the leader master and handles reconnection and retries upon
  failure. LookupRpcs will now use this template to perform their RPCs,
  and SyncLeaderMasterRpcs has been replaced entirely to use a
  synchronized AsyncLeaderMasterRpcs.
- In implementing the template, I unified the retry logic to be used for
  the above existing RPCs; the bulk of it was overlapped, but some logic
  existed specifically for LookupRpc. See the comment above
  RetryOrReconnectIfNecessary in master_proxy_rpc.h and the impl for
  LookupRpc::SendRpcCb in meta_cache.cc for more details. For the most
  part, there aren't functional changes.
- SyncLeaderMasterRpcs used exponential backoff when retrying RPCs, but
  the existing RpcRetrier implemented its own backoff. To maintain
  this, I've piped down an enum from Rpc to RpcRetrier to determine
  which backoff strategy to use.
- The new template and most other existing Rpc implementations
  duplicated code from Rpc::HandleResponse() to handle certain remote
  errors. As such, I've removed HandleResponse entirely and added a TODO
  to perhaps merge existing logic even further (see retriable_rpc.h).

Change-Id: I2450676da1c723a247c84deb1b895f116173670e
---
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
A src/kudu/client/master_proxy_rpc.cc
A src/kudu/client/master_proxy_rpc.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/integration-tests/security-unknown-tsk-itest.cc
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.cc
M src/kudu/rpc/rpc.h
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_master.cc
M src/kudu/tools/tool_action_tablet.cc
M src/kudu/tools/tool_action_tserver.cc
22 files changed, 735 insertions(+), 592 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/12002/11
-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 11
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 10:

(4 comments)

Looks good to me, a few nit.

http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/client/master_proxy_rpc.h
File src/kudu/client/master_proxy_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/client/master_proxy_rpc.h@83
PS10, Line 83: asynchronous proxy RPC
I'm not sure I understand what is 'asynchronous proxy RPC'.  Could you clarify on this a bit in the comment?


http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/client/master_proxy_rpc.h@104
PS10, Line 104: in which case the
              :   //   controller status will be a non-OK error
nit: errors is always non-OK, no?  Maybe, rephrase as 'in which case the controller status will be non-OK'


http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/client/master_proxy_rpc.cc
File src/kudu/client/master_proxy_rpc.cc:

http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/client/master_proxy_rpc.cc@55
PS10, Line 55: using master::AlterTableRequestPB;
             : using master::AlterTableResponsePB;
             : using master::CreateTableRequestPB;
             : using master::CreateTableResponsePB;
             : using master::DeleteTableRequestPB;
             : using master::DeleteTableResponsePB;
             : using master::IsAlterTableDoneRequestPB;
             : using master::IsAlterTableDoneResponsePB;
             : using master::IsCreateTableDoneRequestPB;
             : using master::IsCreateTableDoneResponsePB;
             : using master::GetTabletLocationsRequestPB;
             : using master::GetTabletLocationsResponsePB;
             : using master::GetTableLocationsRequestPB;
             : using master::GetTableLocationsResponsePB;
             : using master::GetTableSchemaRequestPB;
             : using master::GetTableSchemaResponsePB;
             : using master::ListMastersRequestPB;
             : using master::ListMastersResponsePB;
             : using master::ListTablesRequestPB;
             : using master::ListTablesResponsePB;
             : using master::ListTabletServersRequestPB;
             : using master::ListTabletServersResponsePB;
             : using master::MasterServiceProxy;
             : using master::MasterErrorPB;
             : using master::ReplaceTabletRequestPB;
             : using master::ReplaceTabletResponsePB;
             : using rpc::BackoffType;
             : using rpc::CredentialsPolicy;
             : using rpc::ErrorStatusPB;
             : using rpc::ResponseCallback;
             : using rpc::Rpc;
             : using rpc::RpcController;
             : using std::string;
             : using std::vector;
             : using strings::Substitute;
nit: could you move this out of the namespaces enclosure?


http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/rpc/rpc.h
File src/kudu/rpc/rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/rpc/rpc.h@187
PS10, Line 187: BackoffType backoff_;
nit: add const?



-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 10
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 02 Jan 2019 21:44:22 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................

KUDU-683: unify C++ client-to-master retry logic

In preparation for implementing an asynchronous RPC that will need to
connect to the leader master, I've been looking through some of the
retry logic we have scattered around the C++ client. I noticed
near-identical code in a couple places (LookupRpc and
SyncLeaderMasterRpc), so I've ripped it out and put it in its own
reusable class.

A few things are bundled into this patch. The big pieces are:
- A new AsyncLeaderMasterRpc template class is introduced that sends
  RPCs to the leader master and handles reconnection and retries upon
  failure. LookupRpcs will now use this template to perform their RPCs,
  and SyncLeaderMasterRpcs has been replaced entirely to use a
  synchronized AsyncLeaderMasterRpcs.
- In implementing the template, I unified the retry logic to be used for
  the above existing RPCs; the bulk of it was overlapped, but some logic
  existed specifically for LookupRpc. See the comment above
  RetryOrReconnectIfNecessary in master_proxy_rpc.h and the impl for
  LookupRpc::SendRpcCb in meta_cache.cc for more details. For the most
  part, there aren't functional changes.
- SyncLeaderMasterRpcs used exponential backoff when retrying RPCs, but
  the existing RpcRetrier implemented its own backoff. To maintain
  this, I've piped down an enum from Rpc to RpcRetrier to determine
  which backoff strategy to use.
- The new template and most other existing Rpc implementations
  duplicated code from Rpc::HandleResponse() to handle certain remote
  errors. As such, I've removed HandleResponse entirely and added a TODO
  to perhaps merge existing logic even further (see retriable_rpc.h).

Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Reviewed-on: http://gerrit.cloudera.org:8080/12002
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <as...@cloudera.com>
---
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
A src/kudu/client/master_proxy_rpc.cc
A src/kudu/client/master_proxy_rpc.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/integration-tests/security-unknown-tsk-itest.cc
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.cc
M src/kudu/rpc/rpc.h
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_master.cc
M src/kudu/tools/tool_action_tablet.cc
M src/kudu/tools/tool_action_tserver.cc
22 files changed, 735 insertions(+), 592 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified
  Alexey Serbin: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 14
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12002

to look at the new patch set (#4).

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................

KUDU-683: unify C++ client-to-master retry logic

In preparation for implementing an asynchronous RPC that will need to
connect to the leader master, I've been looking through some of the
retry logic we have scattered around the C++ client. I noticed
near-identical code in a couple places (LookupRpc and
SyncLeaderMasterRpc), so I've ripped it out and put it in its own
reusable class.

A few things are bundled into this patch. The big pieces are:
- A new AsyncLeaderMasterRpc template class is introduced sends RPCs to
  the master and handles reconnection and retries upon failure.
  LookupRpcs and SyncLeaderMasterRpc calls will now use this class to
  perform their RPCs.
- In implementing the template, I unified the retry logic to be used for
  the above existing RPCs; the bulk of it was overlapped, but some logic
  existed specifically for LookupRpc. See the comment above
  RetryOrReconnectIfNecessary in master_proxy_rpc.h and the impl for
  LookupRpc::SendRpcCb in meta_cache.cc for more details. For the most
  part, there aren't functional changes.
- SyncLeaderMasterRpcs use exponential backoff when retrying RPCs, but
  the existing RpcRetrier implemented its own backoff. To work around
  this, I've extended RpcRetrier with an ExponentialRpcRetrier class
  that computes an exponential backoff. The RpcRetrier type is a
  template parameter to the AsyncLeaderMasterRpc.

Change-Id: I2450676da1c723a247c84deb1b895f116173670e
---
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
A src/kudu/client/master_proxy_rpc.cc
A src/kudu/client/master_proxy_rpc.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/master_rpc.h
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.cc
M src/kudu/rpc/rpc.h
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_master.cc
M src/kudu/tools/tool_action_tablet.cc
M src/kudu/tools/tool_action_tserver.cc
22 files changed, 701 insertions(+), 390 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/12002/4
-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h
File src/kudu/client/master_proxy_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h@62
PS6, Line 62:   //
            :   // 'rpc_name' is a descriptor for the RPC used t
> Done, assuming you're referring to the class itself, rather than 'user_cb' 
Yeah, I meant something along the lines of "When user_cb is called, resp will have been populated with ..." or some such.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/meta_cache.cc
File src/kudu/client/meta_cache.cc:

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/meta_cache.cc@560
PS6, Line 560: 
> While I agree the following the callbacks in general is hard, I'd like to p
Right, I guess what I'm advocating for is that SendRpcSlowPath() is renamed to SendRpc(), and the check of whether or not we should actually send the RPC is defined in something like a PreSendRpc(). Think of it as a "hook" that subclasses can customize to have additional effects on the RPC's flow, which should mitigate some of the mangling done to core methods like SendRpc().

You could imagine there being additional hooks in other interesting places. For example, maybe there's a hook that affects if/how an RPC is retried, either after SendRpcCb, or after the delayed task fires. The point is to get away from superclass chaining and method overriding and towards a flow that's just a "union" of methods, some of which are no-op for certain subclasses. I think that's a simpler mental model to follow: you can get a good sense for how it all works by just reading through the superclass and taking note of the possible scope of each hook along the way.

BTW, it's also not fundamentally different than what you have here; it's just taking certain subclass overrides and defining them as hooks rather than overrides. That way when you're reading through the code, you take an additive approach when thinking about the hooks rather than a "replacative" one (as you would with overrides, though chaining messes that up of course).

Anyway if you think that's too challenging to define and want to punt, that's fine. But I do want you to understand what it is I'm suggesting, so you can consider it or push back if you think I'm wrong. Happy to continue the discussion on Slack if it'd help.



-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 21 Dec 2018 17:35:28 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/meta_cache.cc
File src/kudu/client/meta_cache.cc:

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/meta_cache.cc@560
PS6, Line 560:   string ToString() const override;
> Right, I guess what I'm advocating for is that SendRpcSlowPath() is renamed
To close the loop on this, we chatted about this in #kudu-general. We batted around this suggestion wrt what it would look like and whether it'd be worth it to refactor AsyncLeaderMasterRpc to make LookupRpc feel more natural, and the conclusion settled on was that the answer is no, given that other/future RPCs likely don't look much like LookupRpc in terms of pre-checks and whatnot. Notably, there's a meta-cache-check, as well as a permit acquisition bundled into LookupRpc, and it's not easy to model this with the addition of a single PreSendRpc() hook, given the current usages.



-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 10
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 22 Dec 2018 02:57:30 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 6:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/12002/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12002/5//COMMIT_MSG@27
PS5, Line 27: - SyncLeaderMasterRpcs used exponential backoff when retrying RPCs, but
            :   the existing RpcRetrier implemented its own backoff. To maintain
            :   this, I've piped down an enum from Rpc to RpcRetrier to determine
            :   which backoff strategy to use.
            : - The new template and most other existing Rpc im
> I'm not convinced that both kinds of backoff are warranted, but let's prete
It's tricky doing it with a template parameter without inheritance. In some form, I think the templates would need to refer to some single Rpc or RpcRetrier base class, since the original classes point to each other.

I've gone about piping down an enum to the constructors.


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/client-internal.cc@162
PS5, Line 162:     const ReqClass& req,
> nit: extra space.
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.h
File src/kudu/client/master_proxy_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.h@51
PS5, Line 51: // reconnection to the master(s).
> Doc the class too.
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.h@53
PS5, Line 53: class AsyncLeaderMasterRpc : public rpc::Rpc {
> Lots of args, please doc.
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.h@55
PS5, Line 55:   // The input 'client' will be used to call the asynchonous master proxy
            :   // function 'func' on the currently-k
> Can the AsyncLeaderMasterRpc declare its own req/resp members and provide a
It's important so this can be used to service proxy requests. I suppose you could `*resp = rpc.resp()` if anything, but this seems more natural, and more flexible (e.g. LookupRpc points the AsyncLeaderMasterRpc at internal members, and SyncLeaderMasterRpc can utilize the proxy input parameters).


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.h@95
PS5, Line 95: RPC controller
> nit: doc what the 'status' out param means?
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.h@122
PS5, Line 122:              const Stat
> nit: doc what this is for?
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.cc
File src/kudu/client/master_proxy_rpc.cc:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.cc@92
PS5, Line 92: 
> nit: extra space.
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.cc@196
PS5, Line 196: table_retrier = this->
> readability nit: here and below, maybe store the pointer to the retrier int
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.cc@217
PS5, Line 217: n't yet been updated wit
> readability nit: since the multi-master property is a not dynamic one, mayb
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.cc@281
PS5, Line 281: 
> For consistency, can you format all of these the same way?
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/meta_cache.cc
File src/kudu/client/meta_cache.cc:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/meta_cache.cc@20
PS5, Line 20: cstdint>
> nit: <cstdint> ?
Ok, but I'm pretty sure this is what IWYU wanted.


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/meta_cache.cc@82
PS5, Line 82: using tserver::TabletServerServiceProxy;
> warning: using decl 'Messenger' is unused [misc-unused-using-decls]
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/meta_cache.cc@563
PS5, Line 563: void Sen
> nit here and below for virtual methods marked with 'override': drop 'virtua
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/meta_cache.cc@731
PS5, Line 731:       // One or more of the tablets is not running. Retry after some time.
> Can you explain a bit more here why RetryOrReconnectIfNecessary() cannot ha
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/rpc/retriable_rpc.h
File src/kudu/rpc/retriable_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/rpc/retriable_rpc.h@24
PS5, Line 24: #include "kudu/rpc/messenger.h"
> warning: #includes are not sorted properly [llvm-include-order]
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/rpc/retriable_rpc.h@51
PS5, Line 51: // TODO(awong): there might be room to merge part of this with retry logic in
> Hard to reason about RetriableRpc vis a vis RetryingRpc.
Done


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/rpc/rpc.h
File src/kudu/rpc/rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/rpc/rpc.h@115
PS5, Line 115: 
> Could we merge the retrier into the Rpc class? I think the decomposition ha
While I agree that the RpcRetrier abstraction made parts of this change not ideal, I'm going to punt on this, given how deeply-plumbed it is already. I'd like to keep this change targeted to what it's already got.

Though good point about HandleResponse(); I've taken it out it since it was redundant with logic in AsyncLeaderMasterRpc (and elsewhere) and was only used in one other place, AFAICT. It was kind of ruining the "new" way of life anyway, where handling is done in the Rpc implementations, rather than the base Rpc class.


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/rpc/rpc.h@229
PS5, Line 229:   DISALLOW_COPY_AND_ASSIGN(Rpc);
> It's tough to see the distinction between this class and Rpc. Both expose a
Yeah, I'd stubbornly wanted to templatize everything, and couldn't get around it without this abstract class (unless I went about tearing up further Rpc and RpcRetrier). I've restored the original class hierarchy and piped down an enum to dictate the backoff strategy.



-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 17 Dec 2018 18:10:47 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12002

to look at the new patch set (#9).

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................

KUDU-683: unify C++ client-to-master retry logic

In preparation for implementing an asynchronous RPC that will need to
connect to the leader master, I've been looking through some of the
retry logic we have scattered around the C++ client. I noticed
near-identical code in a couple places (LookupRpc and
SyncLeaderMasterRpc), so I've ripped it out and put it in its own
reusable class.

A few things are bundled into this patch. The big pieces are:
- A new AsyncLeaderMasterRpc template class is introduced that sends
  RPCs to the leader master and handles reconnection and retries upon
  failure. LookupRpcs will now use this template to perform their RPCs,
  and SyncLeaderMasterRpcs has been replaced entirely to use a
  synchronized AsyncLeaderMasterRpcs.
- In implementing the template, I unified the retry logic to be used for
  the above existing RPCs; the bulk of it was overlapped, but some logic
  existed specifically for LookupRpc. See the comment above
  RetryOrReconnectIfNecessary in master_proxy_rpc.h and the impl for
  LookupRpc::SendRpcCb in meta_cache.cc for more details. For the most
  part, there aren't functional changes.
- SyncLeaderMasterRpcs used exponential backoff when retrying RPCs, but
  the existing RpcRetrier implemented its own backoff. To maintain
  this, I've piped down an enum from Rpc to RpcRetrier to determine
  which backoff strategy to use.
- The new template and most other existing Rpc implementations
  duplicated code from Rpc::HandleResponse() to handle certain remote
  errors. As such, I've removed HandleResponse entirely and added a TODO
  to perhaps merge existing logic even further (see retriable_rpc.h).

Change-Id: I2450676da1c723a247c84deb1b895f116173670e
---
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
A src/kudu/client/master_proxy_rpc.cc
A src/kudu/client/master_proxy_rpc.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/integration-tests/security-unknown-tsk-itest.cc
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.cc
M src/kudu/rpc/rpc.h
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_master.cc
M src/kudu/tools/tool_action_tablet.cc
M src/kudu/tools/tool_action_tserver.cc
22 files changed, 743 insertions(+), 583 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/12002/9
-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12002

to look at the new patch set (#8).

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................

KUDU-683: unify C++ client-to-master retry logic

In preparation for implementing an asynchronous RPC that will need to
connect to the leader master, I've been looking through some of the
retry logic we have scattered around the C++ client. I noticed
near-identical code in a couple places (LookupRpc and
SyncLeaderMasterRpc), so I've ripped it out and put it in its own
reusable class.

A few things are bundled into this patch. The big pieces are:
- A new AsyncLeaderMasterRpc template class is introduced that sends
  RPCs to the leader master and handles reconnection and retries upon
  failure. LookupRpcs will now use this template to perform their RPCs,
  and SyncLeaderMasterRpcs has been replaced entirely to use a
  synchronized AsyncLeaderMasterRpcs.
- In implementing the template, I unified the retry logic to be used for
  the above existing RPCs; the bulk of it was overlapped, but some logic
  existed specifically for LookupRpc. See the comment above
  RetryOrReconnectIfNecessary in master_proxy_rpc.h and the impl for
  LookupRpc::SendRpcCb in meta_cache.cc for more details. For the most
  part, there aren't functional changes.
- SyncLeaderMasterRpcs used exponential backoff when retrying RPCs, but
  the existing RpcRetrier implemented its own backoff. To maintain
  this, I've piped down an enum from Rpc to RpcRetrier to determine
  which backoff strategy to use.
- The new template and most other existing Rpc implementations
  duplicated code from Rpc::HandleResponse() to handle certain remote
  errors. As such, I've removed HandleResponse entirely and added a TODO
  to perhaps merge existing logic even further (see retriable_rpc.h).

Change-Id: I2450676da1c723a247c84deb1b895f116173670e
---
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
A src/kudu/client/master_proxy_rpc.cc
A src/kudu/client/master_proxy_rpc.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/integration-tests/security-unknown-tsk-itest.cc
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.cc
M src/kudu/rpc/rpc.h
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_master.cc
M src/kudu/tools/tool_action_tablet.cc
M src/kudu/tools/tool_action_tserver.cc
22 files changed, 740 insertions(+), 581 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/12002/8
-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12002

to look at the new patch set (#13).

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................

KUDU-683: unify C++ client-to-master retry logic

In preparation for implementing an asynchronous RPC that will need to
connect to the leader master, I've been looking through some of the
retry logic we have scattered around the C++ client. I noticed
near-identical code in a couple places (LookupRpc and
SyncLeaderMasterRpc), so I've ripped it out and put it in its own
reusable class.

A few things are bundled into this patch. The big pieces are:
- A new AsyncLeaderMasterRpc template class is introduced that sends
  RPCs to the leader master and handles reconnection and retries upon
  failure. LookupRpcs will now use this template to perform their RPCs,
  and SyncLeaderMasterRpcs has been replaced entirely to use a
  synchronized AsyncLeaderMasterRpcs.
- In implementing the template, I unified the retry logic to be used for
  the above existing RPCs; the bulk of it was overlapped, but some logic
  existed specifically for LookupRpc. See the comment above
  RetryOrReconnectIfNecessary in master_proxy_rpc.h and the impl for
  LookupRpc::SendRpcCb in meta_cache.cc for more details. For the most
  part, there aren't functional changes.
- SyncLeaderMasterRpcs used exponential backoff when retrying RPCs, but
  the existing RpcRetrier implemented its own backoff. To maintain
  this, I've piped down an enum from Rpc to RpcRetrier to determine
  which backoff strategy to use.
- The new template and most other existing Rpc implementations
  duplicated code from Rpc::HandleResponse() to handle certain remote
  errors. As such, I've removed HandleResponse entirely and added a TODO
  to perhaps merge existing logic even further (see retriable_rpc.h).

Change-Id: I2450676da1c723a247c84deb1b895f116173670e
---
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
A src/kudu/client/master_proxy_rpc.cc
A src/kudu/client/master_proxy_rpc.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/integration-tests/security-unknown-tsk-itest.cc
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.cc
M src/kudu/rpc/rpc.h
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_master.cc
M src/kudu/tools/tool_action_tablet.cc
M src/kudu/tools/tool_action_tserver.cc
22 files changed, 735 insertions(+), 592 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/12002/13
-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 13
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Hao Hao (Code Review)" <ge...@cloudera.org>.
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/client-internal.cc@162
PS5, Line 162: 
nit: extra space.


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.h
File src/kudu/client/master_proxy_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.h@95
PS5, Line 95: Status* status
nit: doc what the 'status' out param means?


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.h@122
PS5, Line 122: required_feature_flags_
nit: doc what this is for?


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.cc
File src/kudu/client/master_proxy_rpc.cc:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.cc@92
PS5, Line 92: 
nit: extra space.


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/meta_cache.cc
File src/kudu/client/meta_cache.cc:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/meta_cache.cc@731
PS5, Line 731:     if (new_status.IsServiceUnavailable()) {
Can you explain a bit more here why RetryOrReconnectIfNecessary() cannot handle this case?



-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 14 Dec 2018 21:23:42 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12002

to look at the new patch set (#10).

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................

KUDU-683: unify C++ client-to-master retry logic

In preparation for implementing an asynchronous RPC that will need to
connect to the leader master, I've been looking through some of the
retry logic we have scattered around the C++ client. I noticed
near-identical code in a couple places (LookupRpc and
SyncLeaderMasterRpc), so I've ripped it out and put it in its own
reusable class.

A few things are bundled into this patch. The big pieces are:
- A new AsyncLeaderMasterRpc template class is introduced that sends
  RPCs to the leader master and handles reconnection and retries upon
  failure. LookupRpcs will now use this template to perform their RPCs,
  and SyncLeaderMasterRpcs has been replaced entirely to use a
  synchronized AsyncLeaderMasterRpcs.
- In implementing the template, I unified the retry logic to be used for
  the above existing RPCs; the bulk of it was overlapped, but some logic
  existed specifically for LookupRpc. See the comment above
  RetryOrReconnectIfNecessary in master_proxy_rpc.h and the impl for
  LookupRpc::SendRpcCb in meta_cache.cc for more details. For the most
  part, there aren't functional changes.
- SyncLeaderMasterRpcs used exponential backoff when retrying RPCs, but
  the existing RpcRetrier implemented its own backoff. To maintain
  this, I've piped down an enum from Rpc to RpcRetrier to determine
  which backoff strategy to use.
- The new template and most other existing Rpc implementations
  duplicated code from Rpc::HandleResponse() to handle certain remote
  errors. As such, I've removed HandleResponse entirely and added a TODO
  to perhaps merge existing logic even further (see retriable_rpc.h).

Change-Id: I2450676da1c723a247c84deb1b895f116173670e
---
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
A src/kudu/client/master_proxy_rpc.cc
A src/kudu/client/master_proxy_rpc.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/integration-tests/security-unknown-tsk-itest.cc
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.cc
M src/kudu/rpc/rpc.h
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_master.cc
M src/kudu/tools/tool_action_tablet.cc
M src/kudu/tools/tool_action_tserver.cc
22 files changed, 741 insertions(+), 591 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/12002/10
-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 10
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12002

to look at the new patch set (#12).

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................

KUDU-683: unify C++ client-to-master retry logic

In preparation for implementing an asynchronous RPC that will need to
connect to the leader master, I've been looking through some of the
retry logic we have scattered around the C++ client. I noticed
near-identical code in a couple places (LookupRpc and
SyncLeaderMasterRpc), so I've ripped it out and put it in its own
reusable class.

A few things are bundled into this patch. The big pieces are:
- A new AsyncLeaderMasterRpc template class is introduced that sends
  RPCs to the leader master and handles reconnection and retries upon
  failure. LookupRpcs will now use this template to perform their RPCs,
  and SyncLeaderMasterRpcs has been replaced entirely to use a
  synchronized AsyncLeaderMasterRpcs.
- In implementing the template, I unified the retry logic to be used for
  the above existing RPCs; the bulk of it was overlapped, but some logic
  existed specifically for LookupRpc. See the comment above
  RetryOrReconnectIfNecessary in master_proxy_rpc.h and the impl for
  LookupRpc::SendRpcCb in meta_cache.cc for more details. For the most
  part, there aren't functional changes.
- SyncLeaderMasterRpcs used exponential backoff when retrying RPCs, but
  the existing RpcRetrier implemented its own backoff. To maintain
  this, I've piped down an enum from Rpc to RpcRetrier to determine
  which backoff strategy to use.
- The new template and most other existing Rpc implementations
  duplicated code from Rpc::HandleResponse() to handle certain remote
  errors. As such, I've removed HandleResponse entirely and added a TODO
  to perhaps merge existing logic even further (see retriable_rpc.h).

Change-Id: I2450676da1c723a247c84deb1b895f116173670e
---
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
A src/kudu/client/master_proxy_rpc.cc
A src/kudu/client/master_proxy_rpc.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/integration-tests/security-unknown-tsk-itest.cc
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.cc
M src/kudu/rpc/rpc.h
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_master.cc
M src/kudu/tools/tool_action_tablet.cc
M src/kudu/tools/tool_action_tserver.cc
22 files changed, 735 insertions(+), 592 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/12002/12
-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 12
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 13:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/client/master_proxy_rpc.h
File src/kudu/client/master_proxy_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/client/master_proxy_rpc.h@83
PS10, Line 83: 
> I'm not sure I understand what is 'asynchronous proxy RPC'.  Could you clar
Reworded for clarity, it's just using the master proxy's asynchronous API.


http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/client/master_proxy_rpc.h@104
PS10, Line 104: StatusPB
              :   //
> nit: errors is always non-OK, no?  Maybe, rephrase as 'in which case the co
Done


http://gerrit.cloudera.org:8080/#/c/12002/2/src/kudu/client/master_proxy_rpc.h
File src/kudu/client/master_proxy_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/2/src/kudu/client/master_proxy_rpc.h@58
PS2, Line 58: // 'use
> nit: here and below for the 'virtual' and 'override': leave just 'override'
Done


http://gerrit.cloudera.org:8080/#/c/12002/2/src/kudu/client/master_proxy_rpc.h@100
PS2, Line 100:   //   controller status will be non-OK
             :   // - generic RPC errors, in which case the controller status will be a
             :   //   RemoteError and the controller will have an error response
             :   // - generic Master application errors, in which case the contro
> nit: is it possible to use std::function here instead?  Not that it's cruci
Done


http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/client/master_proxy_rpc.cc
File src/kudu/client/master_proxy_rpc.cc:

http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/client/master_proxy_rpc.cc@55
PS10, Line 55: using master::DeleteTableRequestPB;
             : using master::DeleteTableResponsePB;
             : using master::IsAlterTableDoneRequestPB;
             : using master::IsAlterTableDoneResponsePB;
             : using master::IsCreateTableDoneRequestPB;
             : using master::IsCreateTableDoneResponsePB;
             : using master::GetTabletLocationsRequestPB;
             : using master::GetTabletLocationsResponsePB;
             : using master::GetTableLocationsRequestPB;
             : using master::GetTableLocationsResponsePB;
             : using master::GetTableSchemaRequestPB;
             : using master::GetTableSchemaResponsePB;
             : using master::ListMastersRequestPB;
             : using master::ListMastersResponsePB;
             : using master::ListTablesRequestPB;
             : using master::ListTablesResponsePB;
             : using master::ListTabletServersRequestPB;
             : using master::ListTabletServersResponsePB;
             : using master::MasterServiceProxy;
             : using master::MasterErrorPB;
             : using master::ReplaceTabletRequestPB;
             : using master::ReplaceTabletResponsePB;
             : using rpc::BackoffType;
             : using rpc::CredentialsPolicy;
             : using rpc::ErrorStatusPB;
             : using rpc::ResponseCallback;
             : using rpc::Rpc;
             : using rpc::RpcController;
             : 
             : namespace client {
             : namespace internal {
             : 
             : template <class ReqClass, class RespClass>
             : AsyncLeaderMasterRpc<ReqClass, RespClass>::AsyncLeaderMasterRpc(
             :     const MonoTime& deadli
> nit: could you move this out of the namespaces enclosure?
Done


http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/rpc/rpc.h
File src/kudu/rpc/rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/rpc/rpc.h@187
PS10, Line 187: const BackoffType bac
> nit: add const?
Done


http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/rpc/rpc.cc
File src/kudu/rpc/rpc.cc:

http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/rpc/rpc.cc@32
PS10, Line 32: using std::string;
> warning: using decl 'shared_ptr' is unused [misc-unused-using-decls]
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 13
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 02 Jan 2019 22:05:24 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12002/8/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

http://gerrit.cloudera.org:8080/#/c/12002/8/src/kudu/client/client-internal.cc@97
PS8, Line 97: using master::MasterFeatures;
> warning: using decl 'MasterErrorPB' is unused [misc-unused-using-decls]
Done


http://gerrit.cloudera.org:8080/#/c/12002/8/src/kudu/client/client.cc
File src/kudu/client/client.cc:

http://gerrit.cloudera.org:8080/#/c/12002/8/src/kudu/client/client.cc@447
PS8, Line 447:   AsyncLeaderMasterRpc<ListTabletServersRequestPB, ListTabletServersResponsePB> rpc(
> Surprised you need the template args; can't they be inferred from 'req' and
I was surprised too, but apparently this doesn't work for class templates until C++17:
https://en.cppreference.com/w/cpp/language/class_template_argument_deduction


http://gerrit.cloudera.org:8080/#/c/12002/8/src/kudu/client/master_proxy_rpc.h
File src/kudu/client/master_proxy_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/8/src/kudu/client/master_proxy_rpc.h@24
PS8, Line 24: #include "kudu/rpc/response_callback.h"
> Could you get by with boost/function/function_fwd.hpp?
Seems like no, IWYU keeps complaining, even with this. Reverting to just use the forward decls, which we do have sprinkled around.



-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 21 Dec 2018 22:59:37 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 7:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/client-internal.cc
File src/kudu/client/client-internal.cc:

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/client-internal.cc@98
PS6, Line 98: using master::ReplaceTabletRequestPB;
> warning: using decl 'ReplaceTabletRequestPB' is unused [misc-unused-using-d
Not sure why this is complaining. This is used.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/client-internal.cc@99
PS6, Line 99: using master::ReplaceTabletResponsePB;
> warning: using decl 'ReplaceTabletResponsePB' is unused [misc-unused-using-
Ditto.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/client-internal.cc@159
PS6, Line 159:     : hive_metastore_sasl_enabled_(false),
> How much value does this add now? Seems like a lot of work to marshal/unmar
Done


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h
File src/kudu/client/master_proxy_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h@30
PS6, Line 30: #include "kudu/util/status_callback.h"
            : 
            : namespace kudu {
            : 
> Interesting; I thought there'd be a boost forward-only header to include.
Indeed, IWYU flagged this. I'll revert and see what happens.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h@62
PS6, Line 62:   //
            :   // 'rpc_name' is a descriptor for the RPC used t
> This should probably say something about its effects on 'resp'.
Done, assuming you're referring to the class itself, rather than 'user_cb' specifically, which doesn't have any effect on 'resp'.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h@85
PS6, Line 85: 
> As written, this sounds like it'll send some sort of "empty" RPC on send.
Done


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h@113
PS6, Line 113:   // Attempts to reconnect with the masters and find the leader master, and
             :   // attempts to retry the RPC.
             :   virtual void ResetMasterLeaderAndRetry(rpc::CredentialsPolicy creds_policy);
             : 
             :   // With a new leader found, resends the RPC. 'creds_policy' is the policy
             :   // with which the reconnection was attempted.
             :   void NewLeaderMasterDeterminedCb(rpc::CredentialsPolicy creds_policy,
             :                                    const Status& status);
             : 
             :   KuduClient* client_;
> Must all of these be virtual? Only ResetMasterLeaderAndRetry is overridden 
Done


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.h@125
PS6, Line 125: 
> Style nit: we don't ever use refs as class members. Change this to a pointe
Done


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.cc
File src/kudu/client/master_proxy_rpc.cc:

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.cc@81
PS6, Line 81: using rpc::BackoffType;
> warning: using decl 'SecureDebugString' is unused [misc-unused-using-decls]
Done


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.cc@125
PS6, Line 125: 
> Nit: can drop?
Done


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_proxy_rpc.cc@196
PS6, Line 196:   const bool is_multi_master = client_->IsMultiMaster();
> Nit: just 'retrier' is probably enough.
In some places; I was using this for DelayedRetry() too, but it's simple enough that I'll just pull it out.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_rpc.cc
File src/kudu/client/master_rpc.cc:

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/master_rpc.cc@65
PS6, Line 65: using strings::Substitute;
> warning: using decl 'RpcRetrier' is unused [misc-unused-using-decls]
Done


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/meta_cache.cc
File src/kudu/client/meta_cache.cc:

http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/meta_cache.cc@81
PS6, Line 81: using tserver::TabletServerServiceProxy;
> warning: using decl 'RpcRetrier' is unused [misc-unused-using-decls]
Done


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/meta_cache.cc@560
PS6, Line 560: 
> I found understanding this vis a vis AsyncLeaderMasterRpc.SendRpc to be ext
While I agree the following the callbacks in general is hard, I'd like to punt on this. I don't want to crowd AsyncLeaderMasterRpc, particularly for the sake of this subclass.

I'm also not convinced that a pre-check would simplify things a whole lot either, though maybe I misunderstood. Do you mean putting the cache check into a PreSendRpc()? If so, I don't see how it would help because IIUC, the difficulty stems from the fact that the division of work is set by the public API, which was implemented to avoid Rpc allocation; i.e. we have SendRpc() and SendRpcSlowPath(), that differ only by the cache check, which in the first "call" will be avoided without allocating the Rpc. It seems like an explicit PreSendRpc() wouldn't change that.


http://gerrit.cloudera.org:8080/#/c/12002/6/src/kudu/client/meta_cache.cc@1006
PS6, Line 1006:   rpc->SendRpcSlowPath();
> Not understanding this comment; where is LookupCall?
Oops, left in from an old, unpublished revision.



-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 21 Dec 2018 08:56:40 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12002

to look at the new patch set (#5).

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................

KUDU-683: unify C++ client-to-master retry logic

In preparation for implementing an asynchronous RPC that will need to
connect to the leader master, I've been looking through some of the
retry logic we have scattered around the C++ client. I noticed
near-identical code in a couple places (LookupRpc and
SyncLeaderMasterRpc), so I've ripped it out and put it in its own
reusable class.

A few things are bundled into this patch. The big pieces are:
- A new AsyncLeaderMasterRpc template class is introduced that sends
  RPCs to the leader master and handles reconnection and retries upon
  failure. LookupRpcs and SyncLeaderMasterRpc calls will now use this
  template to perform their RPCs.
- In implementing the template, I unified the retry logic to be used for
  the above existing RPCs; the bulk of it was overlapped, but some logic
  existed specifically for LookupRpc. See the comment above
  RetryOrReconnectIfNecessary in master_proxy_rpc.h and the impl for
  LookupRpc::SendRpcCb in meta_cache.cc for more details. For the most
  part, there aren't functional changes.
- SyncLeaderMasterRpcs used exponential backoff when retrying RPCs, but
  the existing RpcRetrier implemented its own backoff. To maintain
  this, I've extended RpcRetrier with an ExponentialRpcRetrier class
  that computes an exponential backoff. The RpcRetrier type is a
  template parameter to the AsyncLeaderMasterRpc.

Change-Id: I2450676da1c723a247c84deb1b895f116173670e
---
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
A src/kudu/client/master_proxy_rpc.cc
A src/kudu/client/master_proxy_rpc.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/master_rpc.h
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.cc
M src/kudu/rpc/rpc.h
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_master.cc
M src/kudu/tools/tool_action_tablet.cc
M src/kudu/tools/tool_action_tserver.cc
22 files changed, 703 insertions(+), 401 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/12002/5
-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 5:

(6 comments)

just nits at this point from me

http://gerrit.cloudera.org:8080/#/c/12002/2/src/kudu/client/master_proxy_rpc.h
File src/kudu/client/master_proxy_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/2/src/kudu/client/master_proxy_rpc.h@58
PS2, Line 58:        
nit: here and below for the 'virtual' and 'override': leave just 'override' since those are overrides and that implies 'virtual'.


http://gerrit.cloudera.org:8080/#/c/12002/2/src/kudu/client/master_proxy_rpc.h@100
PS2, Line 100: 
             :   // With a new leader found, resends the RPC. 'creds_policy' is the policy
             :   // with which the reconnection was attempted.
             :   virtual void NewLeaderMasterDeterminedCb(rpc::CredentialsPolicy 
nit: is it possible to use std::function here instead?  Not that it's crucial, but since you are modifying this code in some non-trivial manner anyways, it might make sense to convert boost::bind() and boost::function to their STL counterparts.


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.cc
File src/kudu/client/master_proxy_rpc.cc:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.cc@196
PS5, Line 196: this->mutable_retrier(
readability nit: here and below, maybe store the pointer to the retrier into a local variable and reuse it in the code of this method?


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.cc@217
PS5, Line 217: client_->IsMultiMaster()
readability nit: since the multi-master property is a not dynamic one, maybe store it into a boolean variable and use it here and below


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/meta_cache.cc
File src/kudu/client/meta_cache.cc:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/meta_cache.cc@20
PS5, Line 20: stdint.h
nit: <cstdint> ?


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/meta_cache.cc@563
PS5, Line 563: virtual 
nit here and below for virtual methods marked with 'override': drop 'virtual'



-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 13 Dec 2018 03:38:50 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 13: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 13
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 02 Jan 2019 22:05:47 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 5:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/12002/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12002/5//COMMIT_MSG@27
PS5, Line 27: - SyncLeaderMasterRpcs used exponential backoff when retrying RPCs, but
            :   the existing RpcRetrier implemented its own backoff. To maintain
            :   this, I've extended RpcRetrier with an ExponentialRpcRetrier class
            :   that computes an exponential backoff. The RpcRetrier type is a
            :   template parameter to the AsyncLeaderMasterRpc.
I'm not convinced that both kinds of backoff are warranted, but let's pretend that they are: could you implement this without inheritance? Perhaps as an enum passed as to the RpcRetrier constructor? Or an enum used as a template parameter?


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.h
File src/kudu/client/master_proxy_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.h@51
PS5, Line 51: class AsyncLeaderMasterRpc : public rpc::RetryingRpc<Retrier> {
Doc the class too.


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.h@53
PS5, Line 53:   AsyncLeaderMasterRpc(const MonoTime& deadline,
Lots of args, please doc.


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.h@55
PS5, Line 55:                        const ReqClass& req,
            :                        RespClass* resp,
Can the AsyncLeaderMasterRpc declare its own req/resp members and provide accessors to them? Or is it important to be able to provide them as args here?


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.cc
File src/kudu/client/master_proxy_rpc.cc:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/client/master_proxy_rpc.cc@281
PS5, Line 281: template
For consistency, can you format all of these the same way?


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/rpc/retriable_rpc.h
File src/kudu/rpc/retriable_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/rpc/retriable_rpc.h@51
PS5, Line 51: class RetriableRpc : public RetryingRpc<RpcRetrier> {
Hard to reason about RetriableRpc vis a vis RetryingRpc.


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/rpc/rpc.h
File src/kudu/rpc/rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/rpc/rpc.h@115
PS5, Line 115: class RpcRetrier {
Could we merge the retrier into the Rpc class? I think the decomposition has been a net loss: the retrier isn't particular complex, but there's a great deal of plumbing in it (and in its consumers) to facilitate the exchange across class boundaries.

In particular I'm not so sure HandleResponse() has worked out; the depth of these class hierarchies leads to RPC "handling" at multiple levels that's hard to follow.


http://gerrit.cloudera.org:8080/#/c/12002/5/src/kudu/rpc/rpc.h@229
PS5, Line 229: // An RPC that gets retried based on some retrying logic.
It's tough to see the distinction between this class and Rpc. Both expose a retrier/mutable_retrier, and although Rpc doesn't mandate where it comes from, where else would it come from but a class member? Likewise, the distinction between the base class' stipulation about num_attempts and the override seem almost immaterial.

If the goal here is to provide a pure interface and a partial implementation, I don't think that's warranted in C++ where partial interfaces are a thing (by contrast, Java requires interfaces to be pure). I think it'd be OK for Rpc to provide some base implementations to avoid the extra class in the hierarchy.



-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 13 Dec 2018 01:00:59 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 13: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/client/master_proxy_rpc.h
File src/kudu/client/master_proxy_rpc.h:

http://gerrit.cloudera.org:8080/#/c/12002/10/src/kudu/client/master_proxy_rpc.h@83
PS10, Line 83: 
> Reworded for clarity, it's just using the master proxy's asynchronous API.
Thank you!



-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 13
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 03 Jan 2019 00:36:59 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12002 )

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................


Patch Set 10: Code-Review+2

Check in with Alexey to see if he wants to rereview?


-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 10
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 02 Jan 2019 17:29:23 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-683: unify C++ client-to-master retry logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12002

to look at the new patch set (#7).

Change subject: KUDU-683: unify C++ client-to-master retry logic
......................................................................

KUDU-683: unify C++ client-to-master retry logic

In preparation for implementing an asynchronous RPC that will need to
connect to the leader master, I've been looking through some of the
retry logic we have scattered around the C++ client. I noticed
near-identical code in a couple places (LookupRpc and
SyncLeaderMasterRpc), so I've ripped it out and put it in its own
reusable class.

A few things are bundled into this patch. The big pieces are:
- A new AsyncLeaderMasterRpc template class is introduced that sends
  RPCs to the leader master and handles reconnection and retries upon
  failure. LookupRpcs will now use this template to perform their RPCs,
  and SyncLeaderMasterRpcs has been replaced entirely to use a
  synchronized AsyncLeaderMasterRpcs.
- In implementing the template, I unified the retry logic to be used for
  the above existing RPCs; the bulk of it was overlapped, but some logic
  existed specifically for LookupRpc. See the comment above
  RetryOrReconnectIfNecessary in master_proxy_rpc.h and the impl for
  LookupRpc::SendRpcCb in meta_cache.cc for more details. For the most
  part, there aren't functional changes.
- SyncLeaderMasterRpcs used exponential backoff when retrying RPCs, but
  the existing RpcRetrier implemented its own backoff. To maintain
  this, I've piped down an enum from Rpc to RpcRetrier to determine
  which backoff strategy to use.
- The new template and most other existing Rpc implementations
  duplicated code from Rpc::HandleResponse() to handle certain remote
  errors. As such, I've removed HandleResponse entirely and added a TODO
  to perhaps merge existing logic even further (see retriable_rpc.h).

Change-Id: I2450676da1c723a247c84deb1b895f116173670e
---
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
A src/kudu/client/master_proxy_rpc.cc
A src/kudu/client/master_proxy_rpc.h
M src/kudu/client/master_rpc.cc
M src/kudu/client/meta_cache.cc
M src/kudu/client/scanner-internal.cc
M src/kudu/integration-tests/master-stress-test.cc
M src/kudu/integration-tests/replace_tablet-itest.cc
M src/kudu/integration-tests/security-unknown-tsk-itest.cc
M src/kudu/rpc/retriable_rpc.h
M src/kudu/rpc/rpc.cc
M src/kudu/rpc/rpc.h
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_master.cc
M src/kudu/tools/tool_action_tablet.cc
M src/kudu/tools/tool_action_tserver.cc
22 files changed, 740 insertions(+), 574 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/12002/7
-- 
To view, visit http://gerrit.cloudera.org:8080/12002
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2450676da1c723a247c84deb1b895f116173670e
Gerrit-Change-Number: 12002
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)