Posted to commits@kudu.apache.org by al...@apache.org on 2021/02/12 15:33:09 UTC

[kudu] branch master updated: KUDU-2612: non-zero queue size for txn-commit pool

This is an automated email from the ASF dual-hosted git repository.

alexey pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git


The following commit(s) were added to refs/heads/master by this push:
     new fa5fdd8  KUDU-2612: non-zero queue size for txn-commit pool
fa5fdd8 is described below

commit fa5fdd8532efa3a634eaba1fea4552f66d881623
Author: Alexey Serbin <al...@apache.org>
AuthorDate: Thu Feb 11 23:03:20 2021 -0800

    KUDU-2612: non-zero queue size for txn-commit pool
    
    I found my new tests for the automatic txn participant registration
    often hit the following CHECK():
    
      F0212 06:49:33.448449   611 txn_status_manager.cc:371] Check failed: _s.ok() Bad status: Service unavailable: Thread pool is at capacity (10/10 tasks running, 2/0 tasks queued)
    
    As far as I can see, the code in
    CommitTasks::Schedule{AbortTxn,FinalizeCommit}Write() doesn't handle
    the case where the "txn-commit" pool's queue is full.  That means the
    pool is supposed to have an unlimited queue size, like other pools
    which exhibit similar behaviour: e.g., the pool for removing
    tablets, the pool for opening tablets, etc.  Indeed, since the number of
    concurrently open multi-row transactions isn't limited,
    TxnStatusManager should be able to handle the case when many
    transactions are being aborted and committed simultaneously.
    
    This patch removes the queue size limit for the "txn-commit" pool
    (effectively setting it to INT_MAX), mirroring the behavior of the
    "tablet-open" and "tablet-delete" pools.
    
    Change-Id: Idb3de2fd41936862eec8f2616096db16ff86c070
    Reviewed-on: http://gerrit.cloudera.org:8080/17059
    Reviewed-by: Andrew Wong <aw...@cloudera.com>
    Tested-by: Alexey Serbin <as...@cloudera.com>
---
 src/kudu/transactions/txn_status_manager.cc | 4 ++--
 src/kudu/tserver/ts_tablet_manager.cc       | 1 -
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/kudu/transactions/txn_status_manager.cc b/src/kudu/transactions/txn_status_manager.cc
index 21807ce..33f2df8 100644
--- a/src/kudu/transactions/txn_status_manager.cc
+++ b/src/kudu/transactions/txn_status_manager.cc
@@ -365,7 +365,7 @@ void CommitTasks::AbortTxnAsync() {
 void CommitTasks::ScheduleAbortTxnWrite() {
   // Submit the task to a threadpool.
   // NOTE: This is called by the reactor thread that catches the BeginCommit
-  // reseponse, so we can't do IO in this thread.
+  // response, so we can't do IO in this thread.
   DCHECK_EQ(0, ops_in_flight_);
   scoped_refptr<CommitTasks> scoped_this(this);
   CHECK_OK(commit_pool_->Submit([this, scoped_this = std::move(scoped_this),
@@ -402,7 +402,7 @@ void CommitTasks::FinalizeCommitAsync(Timestamp commit_timestamp) {
 void CommitTasks::ScheduleFinalizeCommitWrite(Timestamp commit_timestamp) {
   // Submit the task to a threadpool.
   // NOTE: This is called by the reactor thread that catches the BeginCommit
-  // reseponse, so we can't do IO in this thread.
+  // response, so we can't do IO in this thread.
   DCHECK_EQ(0, ops_in_flight_);
   scoped_refptr<CommitTasks> scoped_this(this);
   CHECK_OK(commit_pool_->Submit([this, scoped_this = std::move(scoped_this),
diff --git a/src/kudu/tserver/ts_tablet_manager.cc b/src/kudu/tserver/ts_tablet_manager.cc
index e37df45..9e25e7c 100644
--- a/src/kudu/tserver/ts_tablet_manager.cc
+++ b/src/kudu/tserver/ts_tablet_manager.cc
@@ -376,7 +376,6 @@ Status TSTabletManager::Init() {
                 .Build(&tablet_copy_pool_));
 
   RETURN_NOT_OK(ThreadPoolBuilder("txn-commit")
-                .set_max_queue_size(0)
                 .set_max_threads(FLAGS_txn_commit_pool_num_threads)
                 .Build(&txn_commit_pool_));
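
To illustrate why dropping `.set_max_queue_size(0)` avoids the failed CHECK quoted in the commit message, here is a minimal sketch of a pool's admission accounting. It is not Kudu's ThreadPool: `PoolModel` and `CountRejected` are hypothetical names introduced for this example, tasks are assumed never to finish during the burst, and the 10-thread limit is taken from the log line while the burst of 12 submissions is an assumption.

```cpp
#include <cassert>
#include <climits>

// Hypothetical, simplified model of a thread pool's admission check.
// It only tracks counters; no threads are started and no task completes.
struct PoolModel {
  int max_threads;
  int max_queue_size;  // 0 means a task may never wait in the queue
  int running;
  int queued;

  PoolModel(int threads, int queue_size)
      : max_threads(threads), max_queue_size(queue_size),
        running(0), queued(0) {
    assert(max_threads > 0);
    assert(max_queue_size >= 0);
  }

  // Returns true if the task is accepted, false if the pool is at capacity.
  bool Submit() {
    if (running < max_threads) {
      ++running;  // an idle worker picks the task up immediately
      return true;
    }
    if (queued < max_queue_size) {
      ++queued;   // all workers are busy, so the task waits in the queue
      return true;
    }
    return false; // the "Service unavailable" case from the log line
  }
};

// Submits `num_tasks` tasks back to back and counts how many are rejected.
int CountRejected(int max_threads, int max_queue_size, int num_tasks) {
  PoolModel pool(max_threads, max_queue_size);
  int rejected = 0;
  for (int i = 0; i < num_tasks; ++i) {
    if (!pool.Submit()) {
      ++rejected;
    }
  }
  return rejected;
}
```

Under these assumptions, `CountRejected(10, 0, 12)` reports two rejected submissions, each of which would trip a `CHECK_OK()` around `Submit()`, whereas `CountRejected(10, INT_MAX, 12)` reports none because the extra tasks simply wait in the queue.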