Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2018/03/12 17:47:55 UTC

[kudu-CR] KUDU-2320 apply exponential back-off while deleting replica

Hello Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9561

to look at the new patch set (#2).

Change subject: KUDU-2320 apply exponential back-off while deleting replica
......................................................................

KUDU-2320 apply exponential back-off while deleting replica

In some scenarios, the replica to remove might be on a tablet
server which hasn't yet registered with the master.  For example,
that happens when the tablet server hosting the replica went down
and stays down when the master is restarted.  Such a scenario
is exercised by RaftConsensusNonVoterITest::RestartClusterWithNonVoter.

I ran the RaftConsensusNonVoterITest::RestartClusterWithNonVoter
scenario before and after the fix.  Before the fix there was a steady
high rate of messages: the logged attempt number stayed at 0 and the
retry delay stayed within tens of milliseconds.  After the fix the
attempt number increases and the rate of messages follows the
exponential back-off pattern.

An example of the output before the fix:
  I0309 00:07:34.972404  2029 catalog_manager.cc:2697] Scheduling retry of 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a with a delay of 13 ms (attempt = 0)
  W0309 00:07:34.972436  2029 catalog_manager.cc:2716] Async tablet task 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a failed: Not found: failed to reset TS proxy: Could not find TS for UUID 76ea4539475745e8983bab0e501d803a
  I0309 00:07:34.985633  2029 catalog_manager.cc:2697] Scheduling retry of 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a with a delay of 28 ms (attempt = 0)
  W0309 00:07:34.985673  2029 catalog_manager.cc:2716] Async tablet task 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a failed: Not found: failed to reset TS proxy: Could not find TS for UUID 76ea4539475745e8983bab0e501d803a
  I0309 00:07:35.014024  2029 catalog_manager.cc:2697] Scheduling retry of 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a with a delay of 26 ms (attempt = 0)
  W0309 00:07:35.014062  2029 catalog_manager.cc:2716] Async tablet task 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a failed: Not found: failed to reset TS proxy: Could not find TS for UUID 76ea4539475745e8983bab0e501d803a
  I0309 00:07:35.040323  2029 catalog_manager.cc:2697] Scheduling retry of 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a with a delay of 19 ms (attempt = 0)
  W0309 00:07:35.040377  2029 catalog_manager.cc:2716] Async tablet task 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a failed: Not found: failed to reset TS proxy: Could not find TS for UUID 76ea4539475745e8983bab0e501d803a
  I0309 00:07:35.059588  2029 catalog_manager.cc:2697] Scheduling retry of 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a with a delay of 50 ms (attempt = 0)
  W0309 00:07:35.059628  2029 catalog_manager.cc:2716] Async tablet task 832f394938da40ca954da7a842e2279b Delete Tablet RPC for TS=76ea4539475745e8983bab0e501d803a failed: Not found: failed to reset TS proxy: Could not find TS for UUID 76ea4539475745e8983bab0e501d803a

An example of the output after the fix:
  I0308 22:36:59.251387  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 37 ms (attempt = 2)
  W0308 22:36:59.251437  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:36:59.288799  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 84 ms (attempt = 3)
  W0308 22:36:59.288851  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:36:59.373152  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 146 ms (attempt = 4)
  W0308 22:36:59.373209  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:36:59.519738  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 267 ms (attempt = 5)
  W0308 22:36:59.519806  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:36:59.787600  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 554 ms (attempt = 6)
  W0308 22:36:59.787657  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:37:00.342607  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 1056 ms (attempt = 7)
  W0308 22:37:00.342682  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:37:01.399219  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 2094 ms (attempt = 8)
  W0308 22:37:01.399274  5428 catalog_manager.cc:2719] Async tablet task f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb failed: Not found: failed to reset TS proxy: Could not find TS for UUID 46d5f24c1096492d83b909cd0116edbb
  I0308 22:37:03.494213  5428 catalog_manager.cc:2700] Scheduling retry of f259598750084d1db309c1659ee818f9 Delete Tablet RPC for TS=46d5f24c1096492d83b909cd0116edbb with a delay of 4125 ms (attempt = 9)
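
The delays in the output after the fix are consistent with a base delay
of 2^(attempt + 3) ms plus a random jitter below 50 ms (for example,
attempt 2 gives 32 ms + 5 ms = 37 ms, and attempt 9 gives 4096 ms + 29 ms
= 4125 ms).  A minimal sketch of such a schedule is below.  The function
names, the 60-second cap, and the jitter range are assumptions made for
illustration; they are not taken from catalog_manager.cc.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>

// Hypothetical helper (not the code from catalog_manager.cc).
// Returns the base retry delay in milliseconds for a zero-based attempt
// number: 2^(attempt + 3), capped at max_delay_ms (assumed cap).
int64_t BaseDelayMs(int attempt, int64_t max_delay_ms = 60 * 1000) {
  // Clamp the shift so the left shift below can never overflow.
  const int shift = std::min(std::max(attempt, 0) + 3, 30);
  return std::min<int64_t>(int64_t{1} << shift, max_delay_ms);
}

// Hypothetical helper: adds a uniform jitter in [0, 50) ms to the base
// delay so that retries from many tasks do not fire in lockstep.
int64_t RetryDelayMs(int attempt, std::mt19937& rng) {
  std::uniform_int_distribution<int64_t> jitter(0, 49);
  return BaseDelayMs(attempt) + jitter(rng);
}
```

With a schedule like this, the back-off only takes effect if the attempt
number is incremented on every retry.  With the attempt number pinned at 0,
as in the output before the fix, the delay stays at 8 ms plus jitter.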

Change-Id: Ia12d261d7270aae7fafe877780b547d262aef16d
---
M src/kudu/master/catalog_manager.cc
1 file changed, 4 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/61/9561/2
-- 
To view, visit http://gerrit.cloudera.org:8080/9561

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia12d261d7270aae7fafe877780b547d262aef16d
Gerrit-Change-Number: 9561
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>