Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2020/10/14 17:31:56 UTC

[kudu-CR] KUDU-3149: don't block op registration on MM mutex

Hello Alexey Serbin, Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16580

to look at the new patch set (#5).

Change subject: KUDU-3149: don't block op registration on MM mutex
......................................................................

KUDU-3149: don't block op registration on MM mutex

The maintenance manager's 'lock_' is a mutex that is taken upon access
to 'ops_', notably when iterating through 'ops_' to find the best op to
run. This particular critical section can last quite a while: finding
the best op entails computing stats for each op, which is expensive for
compactions. While the lock is held, op registration is blocked, and
with it tablet bootstrapping.

This patch addresses the issue by buffering calls to RegisterOp() into a
separate op map protected by a separate spinlock, and periodically
merging the separate map into 'ops_'.
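
To illustrate the buffering idea described above, here is a minimal
sketch, not the code in this patch. The names Op, OpStats,
BufferedOpRegistry, pending_ops_, pending_lock_ and MergePendingOps()
are hypothetical, and std::mutex stands in for the spinlock to keep the
example dependent on the standard library only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-ins for a maintenance op and its cached stats.
struct Op {
  std::string name;
};
struct OpStats {};

// Sketch: registrations go into a small buffer guarded by its own lock,
// so they never wait behind a long scan of 'ops_' under 'lock_'.
class BufferedOpRegistry {
 public:
  // Only takes 'pending_lock_', which is held for a single map insert.
  void RegisterOp(const Op* op) {
    std::lock_guard<std::mutex> l(pending_lock_);
    pending_ops_.emplace(op, OpStats());
  }

  // Moves buffered ops into 'ops_'. Meant to be called periodically,
  // e.g. by the scheduler before it scans 'ops_' for the best op.
  // Returns the number of ops newly added to 'ops_'.
  std::size_t MergePendingOps() {
    std::map<const Op*, OpStats> to_merge;
    {
      // Swap the buffer out so 'pending_lock_' is released before
      // 'lock_' is taken; the two locks are never held together.
      std::lock_guard<std::mutex> l(pending_lock_);
      to_merge.swap(pending_ops_);
    }
    std::lock_guard<std::mutex> l(lock_);
    std::size_t merged = 0;
    for (const auto& entry : to_merge) {
      if (ops_.insert(entry).second) ++merged;
    }
    return merged;
  }

  std::size_t NumOps() {
    std::lock_guard<std::mutex> l(lock_);
    return ops_.size();
  }

  std::size_t NumPendingOps() {
    std::lock_guard<std::mutex> l(pending_lock_);
    return pending_ops_.size();
  }

 private:
  std::mutex lock_;  // Guards 'ops_'; may be held for a long time.
  std::map<const Op*, OpStats> ops_;

  std::mutex pending_lock_;  // Guards 'pending_ops_'; held briefly.
  std::map<const Op*, OpStats> pending_ops_;
};
```

In this sketch a newly registered op is not visible in 'ops_' until the
next merge, which is the trade-off for keeping registration off the
long-held lock.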

I added a unit test for the maintenance manager change, and a
single-node test in which several tablets bootstrap concurrently with
compactions. I ran the latter with and without op buffering (via a flag
that I have removed from this patch) with the following results:

[awong@va1022 release]$ for i in {1..5}; do ./bin/tablet_server-test --gtest_filter=*WithConcurrent* --num_tablets=300 --num_rowsets_per_tablet=30 --buffer_op_registration=true |& grep "waiting for all bootstraps to finish"; done
I1014 02:19:25.452993 80415 tablet_server-test.cc:4405] Time spent waiting for all bootstraps to finish: real 0.787s    user 0.000s     sys 0.002s
I1014 02:19:43.039741 82020 tablet_server-test.cc:4405] Time spent waiting for all bootstraps to finish: real 0.641s    user 0.000s     sys 0.004s
I1014 02:20:02.907203 83608 tablet_server-test.cc:4405] Time spent waiting for all bootstraps to finish: real 0.769s    user 0.001s     sys 0.002s
I1014 02:20:21.779570 85260 tablet_server-test.cc:4405] Time spent waiting for all bootstraps to finish: real 0.758s    user 0.002s     sys 0.001s
I1014 02:20:40.687155 86874 tablet_server-test.cc:4405] Time spent waiting for all bootstraps to finish: real 0.682s    user 0.001s     sys 0.002s

Average real time with op buffering: 0.727s

[awong@va1022 release]$ for i in {1..5}; do ./bin/tablet_server-test --gtest_filter=*WithConcurrent* --num_tablets=300 --num_rowsets_per_tablet=30 --buffer_op_registration=false |& grep "waiting for all bootstraps to finish"; done
I1014 02:21:13.666689 88559 tablet_server-test.cc:4405] Time spent waiting for all bootstraps to finish: real 1.538s    user 0.002s     sys 0.001s
I1014 02:21:33.119479 90172 tablet_server-test.cc:4405] Time spent waiting for all bootstraps to finish: real 1.316s    user 0.001s     sys 0.002s
I1014 02:21:53.929062 91816 tablet_server-test.cc:4405] Time spent waiting for all bootstraps to finish: real 1.393s    user 0.003s     sys 0.001s
I1014 02:22:12.764689 93439 tablet_server-test.cc:4405] Time spent waiting for all bootstraps to finish: real 1.356s    user 0.003s     sys 0.000s
I1014 02:22:31.138669 95042 tablet_server-test.cc:4405] Time spent waiting for all bootstraps to finish: real 1.516s    user 0.000s     sys 0.003s

Average real time without op buffering: 1.424s
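
As a cross-check of the averages above, the following sketch recomputes
them from the ten "real" timings in the logs and derives the ratio
between the two configurations. The helper names Mean() and Speedup()
are illustrative only.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Arithmetic mean; the caller must pass a non-empty vector.
double Mean(const std::vector<double>& xs) {
  return std::accumulate(xs.begin(), xs.end(), 0.0) / xs.size();
}

// "real" seconds copied from the log lines above.
const std::vector<double> kWithBuffering = {
    0.787, 0.641, 0.769, 0.758, 0.682};
const std::vector<double> kWithoutBuffering = {
    1.538, 1.316, 1.393, 1.356, 1.516};

// Ratio of mean bootstrap wait without buffering to with buffering.
double Speedup() {
  return Mean(kWithoutBuffering) / Mean(kWithBuffering);
}
```

With these inputs the ratio comes out a little under 2, i.e. the wait
for bootstraps is roughly halved with op buffering in this test setup.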

Change-Id: I4a1b810f5b7ff6a22acc9b10b79ddffa8085c990
---
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/maintenance_manager-test.cc
M src/kudu/util/maintenance_manager.cc
M src/kudu/util/maintenance_manager.h
4 files changed, 163 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/16580/5
-- 
To view, visit http://gerrit.cloudera.org:8080/16580
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4a1b810f5b7ff6a22acc9b10b79ddffa8085c990
Gerrit-Change-Number: 16580
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)