You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2016/02/19 04:07:53 UTC
[kudu-CR] KUDU-1340. Fix a lock order inversion handling timed out RPCs
Hello Jean-Daniel Cryans, Adar Dembo,
I'd like you to do a code review. Please visit
http://gerrit.cloudera.org:8080/2240
to review the following change.
Change subject: KUDU-1340. Fix a lock order inversion handling timed out RPCs
......................................................................
KUDU-1340. Fix a lock order inversion handling timed out RPCs
This fixes a lock order inversion reported by TSAN with detect_deadlocks=1:
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=4232)
Cycle in lock order graph: M11532 (0x7d440006dd88) => M10486 (0x7d4c00053fb0) => M11532
Mutex M10486 acquired here while holding mutex M11532 in thread T6:
#3 kudu::lock_guard<kudu::simple_spinlock>::lock_guard(kudu::simple_spinlock*) /home/todd/git/kudu/build/tsan/../../src/kudu/util/locks.h:241:5 (libtserver.so+0x000000130339)
#4 kudu::rpc::RpcController::timeout() const /home/todd/git/kudu/build/tsan/../../src/kudu/rpc/rpc_controller.cc:91:31 (libkrpc.so+0x0000000d4506)
#5 kudu::rpc::OutboundCall::SetTimedOut() /home/todd/git/kudu/build/tsan/../../src/kudu/rpc/outbound_call.cc:262:9 (libkrpc.so+0x00000009d825)
Mutex M11532 acquired here while holding mutex M10486 in thread T132:
#3 kudu::lock_guard<kudu::simple_spinlock>::lock_guard(kudu::simple_spinlock*) /home/todd/git/kudu/build/tsan/../../src/kudu/util/locks.h:241:5 (libtserver.so+0x000000130339)
#4 kudu::rpc::OutboundCall::IsFinished() const /home/todd/git/kudu/build/tsan/../../src/kudu/rpc/outbound_call.cc:274:31 (libkrpc.so+0x00000009c785)
#5 kudu::rpc::RpcController::finished() const /home/todd/git/kudu/build/tsan/../../src/kudu/rpc/rpc_controller.cc:57:12 (libkrpc.so+0x0000000d41d1)
This would be impossible to trigger in real life, since the only context
in which we call RpcController::finished() is when CHECK()ing that
it is the case. Thus, it wouldn't ever race with a SetTimedOut() call.
However, fixing this is good hygiene, because it seems to be the only lock
order inversion reported by TSAN. With it fixed, I was able to enable
'detect_deadlocks'. As a quick test of whether the deadlock detector is
stable, I submitted 200 copies of linked_list-test:
http://dist-test.cloudera.org//job?job_id=todd.1455851252.15158
Change-Id: I882016e42462da07013a80ba87716773046cd25a
---
M build-support/run-test.sh
M src/kudu/rpc/outbound_call.cc
2 files changed, 4 insertions(+), 8 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/2240/1
--
To view, visit http://gerrit.cloudera.org:8080/2240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I882016e42462da07013a80ba87716773046cd25a
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans