You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@kudu.apache.org by al...@apache.org on 2020/04/14 05:47:58 UTC

[kudu] 03/03: [catalog_manager] reduce contention in ScopedLeaderSharedLock

This is an automated email from the ASF dual-hosted git repository.

alexey pushed a commit to branch branch-1.12.x
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit 1b5c143b5323e82e33e79098269b46544cf4be09
Author: Alexey Serbin <al...@apache.org>
AuthorDate: Thu Apr 9 15:25:08 2020 -0700

    [catalog_manager] reduce contention in ScopedLeaderSharedLock
    
    While troubleshooting one performance issue if running a big cluster
    with large number of tables and high rate of ConnectToMaster requests,
    in the logs I noticed many reports like the following:
    
      0323 03:59:31.091198 (+607857us) spinlock_profiling.cc:243]
      Waited 492 ms on lock 0x4cb0960. stack:
        0000000002398852
        0000000000ad8c69
        0000000000aa62ba
        000000000221aaa8
        ...
    
    which translates into
        (anonymous namespace)::SubmitSpinLockProfileData()
        master::CatalogManager::ScopedLeaderSharedLock::ScopedLeaderSharedLock()
        master::MasterServiceImpl::ConnectToMaster()
        rpc::GeneratedServiceIf::Handle()
        ...
    
    From the code it became apparent that the lock in question was
      std::lock_guard<simple_spinlock> l(catalog_->state_lock_);
    in ScopedLeaderSharedLock() constructor.
    
    As far as I can see, there is no need to access master's Raft consensus
    information (which itself might wait on its internal locks if there is
    corresponding Raft-consensus activity) under the catalog's state lock.
    
    This patch shortens the critical section with catalog's state lock held
    when constructing CatalogManager::ScopedLeaderSharedLock instance.
    
    Change-Id: I3b2e6866a8a0d5bda9e2b1f01e0668427de60868
    Reviewed-on: http://gerrit.cloudera.org:8080/15698
    Reviewed-by: Andrew Wong <aw...@cloudera.com>
    Tested-by: Kudu Jenkins
    (cherry picked from commit 14912a1fd78ba7cf4d62bf934ae64d6f6f229ee6)
    Reviewed-on: http://gerrit.cloudera.org:8080/15723
    Reviewed-by: Hao Hao <ha...@cloudera.com>
---
 src/kudu/master/catalog_manager.cc | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/src/kudu/master/catalog_manager.cc b/src/kudu/master/catalog_manager.cc
index 7d79906..0f98119 100644
--- a/src/kudu/master/catalog_manager.cc
+++ b/src/kudu/master/catalog_manager.cc
@@ -5314,12 +5314,16 @@ CatalogManager::ScopedLeaderSharedLock::ScopedLeaderSharedLock(
       initial_term_(-1) {
 
   // Check if the catalog manager is running.
-  std::lock_guard<simple_spinlock> l(catalog_->state_lock_);
-  if (PREDICT_FALSE(catalog_->state_ != kRunning)) {
-    catalog_status_ = Status::ServiceUnavailable(
-        Substitute("Catalog manager is not initialized. State: $0",
-                   StateToString(catalog_->state_)));
-    return;
+  int64_t leader_ready_term;
+  {
+    std::lock_guard<simple_spinlock> l(catalog_->state_lock_);
+    if (PREDICT_FALSE(catalog_->state_ != kRunning)) {
+      catalog_status_ = Status::ServiceUnavailable(
+          Substitute("Catalog manager is not initialized. State: $0",
+                     StateToString(catalog_->state_)));
+      return;
+    }
+    leader_ready_term = catalog_->leader_ready_term_;
   }
 
   ConsensusStatePB cstate;
@@ -5341,7 +5345,7 @@ CatalogManager::ScopedLeaderSharedLock::ScopedLeaderSharedLock(
                    uuid, SecureShortDebugString(cstate)));
     return;
   }
-  if (PREDICT_FALSE(catalog_->leader_ready_term_ != cstate.current_term() ||
+  if (PREDICT_FALSE(leader_ready_term != initial_term_ ||
                     !leader_shared_lock_.owns_lock())) {
     leader_status_ = Status::ServiceUnavailable(
         "Leader not yet ready to serve requests");