You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@kudu.apache.org by to...@apache.org on 2016/02/19 05:48:52 UTC
incubator-kudu git commit: KUDU-1340. Fix a lock order inversion
handling timed out RPCs
Repository: incubator-kudu
Updated Branches:
refs/heads/master 6991cd432 -> d43ec1e9d
KUDU-1340. Fix a lock order inversion handling timed out RPCs
This fixes a lock order inversion reported by TSAN with detect_deadlocks=1:
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=4232)
Cycle in lock order graph: M11532 (0x7d440006dd88) => M10486 (0x7d4c00053fb0) => M11532
Mutex M10486 acquired here while holding mutex M11532 in thread T6:
#3 kudu::lock_guard<kudu::simple_spinlock>::lock_guard(kudu::simple_spinlock*) /home/todd/git/kudu/build/tsan/../../src/kudu/util/locks.h:241:5 (libtserver.so+0x000000130339)
#4 kudu::rpc::RpcController::timeout() const /home/todd/git/kudu/build/tsan/../../src/kudu/rpc/rpc_controller.cc:91:31 (libkrpc.so+0x0000000d4506)
#5 kudu::rpc::OutboundCall::SetTimedOut() /home/todd/git/kudu/build/tsan/../../src/kudu/rpc/outbound_call.cc:262:9 (libkrpc.so+0x00000009d825)
Mutex M11532 acquired here while holding mutex M10486 in thread T132:
#3 kudu::lock_guard<kudu::simple_spinlock>::lock_guard(kudu::simple_spinlock*) /home/todd/git/kudu/build/tsan/../../src/kudu/util/locks.h:241:5 (libtserver.so+0x000000130339)
#4 kudu::rpc::OutboundCall::IsFinished() const /home/todd/git/kudu/build/tsan/../../src/kudu/rpc/outbound_call.cc:274:31 (libkrpc.so+0x00000009c785)
#5 kudu::rpc::RpcController::finished() const /home/todd/git/kudu/build/tsan/../../src/kudu/rpc/rpc_controller.cc:57:12 (libkrpc.so+0x0000000d41d1)
This would be impossible to trigger in real life, since the only context
in which we call RpcController::finished() is when CHECK()ing that
it is the case. Thus, it wouldn't ever race with a SetTimedOut() call.
However, fixing this is good hygiene, because it seems to be the only lock
order inversion reported by TSAN. With it fixed, I was able to enable
'detect_deadlocks'. As a quick test of whether the deadlock detector is
stable, I submitted 200 copies of linked_list-test:
http://dist-test.cloudera.org//job?job_id=todd.1455851252.15158
Change-Id: I882016e42462da07013a80ba87716773046cd25a
Reviewed-on: http://gerrit.cloudera.org:8080/2240
Reviewed-by: Jean-Daniel Cryans
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Todd Lipcon <to...@apache.org>
Project: http://git-wip-us.apache.org/repos/asf/incubator-kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-kudu/commit/d43ec1e9
Tree: http://git-wip-us.apache.org/repos/asf/incubator-kudu/tree/d43ec1e9
Diff: http://git-wip-us.apache.org/repos/asf/incubator-kudu/diff/d43ec1e9
Branch: refs/heads/master
Commit: d43ec1e9d5408c146c48bd4f74c1e6ccedb9611b
Parents: 6991cd4
Author: Todd Lipcon <to...@apache.org>
Authored: Thu Feb 18 18:59:26 2016 -0800
Committer: Todd Lipcon <to...@apache.org>
Committed: Fri Feb 19 04:46:08 2016 +0000
----------------------------------------------------------------------
build-support/run-test.sh | 7 -------
src/kudu/rpc/outbound_call.cc | 5 ++++-
2 files changed, 4 insertions(+), 8 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/d43ec1e9/build-support/run-test.sh
----------------------------------------------------------------------
diff --git a/build-support/run-test.sh b/build-support/run-test.sh
index 486a4a3..946af40 100755
--- a/build-support/run-test.sh
+++ b/build-support/run-test.sh
@@ -102,13 +102,6 @@ if [ -z "$ASAN_SYMBOLIZER_PATH" ]; then
fi
# Configure TSAN (ignored if this isn't a TSAN build).
-#
-# Deadlock detection (new in clang 3.5) is disabled because:
-# 1. The clang 3.5 deadlock detector crashes in some Kudu unit tests. It
-# needs compiler-rt commits c4c3dfd, 9a8efe3, and possibly others.
-# 2. Many unit tests report lock-order-inversion warnings; they should be
-# fixed before reenabling the detector.
-TSAN_OPTIONS="$TSAN_OPTIONS detect_deadlocks=0"
TSAN_OPTIONS="$TSAN_OPTIONS suppressions=$SOURCE_ROOT/build-support/tsan-suppressions.txt"
TSAN_OPTIONS="$TSAN_OPTIONS history_size=7"
TSAN_OPTIONS="$TSAN_OPTIONS external_symbolizer_path=$ASAN_SYMBOLIZER_PATH"
http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/d43ec1e9/src/kudu/rpc/outbound_call.cc
----------------------------------------------------------------------
diff --git a/src/kudu/rpc/outbound_call.cc b/src/kudu/rpc/outbound_call.cc
index e608dad..8c2ef57 100644
--- a/src/kudu/rpc/outbound_call.cc
+++ b/src/kudu/rpc/outbound_call.cc
@@ -253,13 +253,16 @@ void OutboundCall::SetFailed(const Status &status,
}
void OutboundCall::SetTimedOut() {
+ // We have to fetch timeout outside the lock to avoid a lock
+ // order inversion between this class and RpcController.
+ MonoDelta timeout = controller_->timeout();
{
lock_guard<simple_spinlock> l(&lock_);
status_ = Status::TimedOut(Substitute(
"$0 RPC to $1 timed out after $2",
remote_method_.method_name(),
conn_id_.remote().ToString(),
- controller_->timeout().ToString()));
+ timeout.ToString()));
set_state_unlocked(TIMED_OUT);
}
CallCallback();