You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2016/02/19 04:07:53 UTC

[kudu-CR] KUDU-1340. Fix a lock order inversion handling timed out RPCs

Hello Jean-Daniel Cryans, Adar Dembo,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/2240

to review the following change.

Change subject: KUDU-1340. Fix a lock order inversion handling timed out RPCs
......................................................................

KUDU-1340. Fix a lock order inversion handling timed out RPCs

This fixes a lock order inversion reported by TSAN with detect_deadlocks=1:

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=4232)
  Cycle in lock order graph: M11532 (0x7d440006dd88) => M10486 (0x7d4c00053fb0) => M11532

  Mutex M10486 acquired here while holding mutex M11532 in thread T6:
    #3 kudu::lock_guard<kudu::simple_spinlock>::lock_guard(kudu::simple_spinlock*) /home/todd/git/kudu/build/tsan/../../src/kudu/util/locks.h:241:5 (libtserver.so+0x000000130339)
    #4 kudu::rpc::RpcController::timeout() const /home/todd/git/kudu/build/tsan/../../src/kudu/rpc/rpc_controller.cc:91:31 (libkrpc.so+0x0000000d4506)
    #5 kudu::rpc::OutboundCall::SetTimedOut() /home/todd/git/kudu/build/tsan/../../src/kudu/rpc/outbound_call.cc:262:9 (libkrpc.so+0x00000009d825)

  Mutex M11532 acquired here while holding mutex M10486 in thread T132:
    #3 kudu::lock_guard<kudu::simple_spinlock>::lock_guard(kudu::simple_spinlock*) /home/todd/git/kudu/build/tsan/../../src/kudu/util/locks.h:241:5 (libtserver.so+0x000000130339)
    #4 kudu::rpc::OutboundCall::IsFinished() const /home/todd/git/kudu/build/tsan/../../src/kudu/rpc/outbound_call.cc:274:31 (libkrpc.so+0x00000009c785)
    #5 kudu::rpc::RpcController::finished() const /home/todd/git/kudu/build/tsan/../../src/kudu/rpc/rpc_controller.cc:57:12 (libkrpc.so+0x0000000d41d1)

This would be impossible to trigger in real life, since the only context
in which we call RpcController::finished() is when CHECK()ing that
it is the case. Thus, it wouldn't ever race with a SetTimedOut() call.

However, fixing this is good hygiene, because it seems to be the only lock
order inversion reported by TSAN. With it fixed, I was able to enable
'detect_deadlocks'. As a quick test of whether the deadlock detector is
stable, I submitted 200 copies of linked_list-test:
http://dist-test.cloudera.org//job?job_id=todd.1455851252.15158

Change-Id: I882016e42462da07013a80ba87716773046cd25a
---
M build-support/run-test.sh
M src/kudu/rpc/outbound_call.cc
2 files changed, 4 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/2240/1
-- 
To view, visit http://gerrit.cloudera.org:8080/2240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I882016e42462da07013a80ba87716773046cd25a
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans