You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Henry Robinson (Code Review)" <ge...@cloudera.org> on 2017/06/14 00:31:36 UTC

[kudu-CR] KUDU-2041: Fix negotiation deadlock

Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7177

Change subject: KUDU-2041: Fix negotiation deadlock
......................................................................

KUDU-2041: Fix negotiation deadlock

With N threads in the negotiation threadpool, N or more concurrent
client negotiation attempts could starve any incoming server negotiation
tasks which used the same threadpool.

If the set of negotiation attempts forms a graph with a N cycles, the
negotiation could deadlock (at least until the negotiation timeout
expires) as all nodes in the system wait for a server request to
complete, but all nodes have dedicated all their resources to client
requests.

Fix: split the server and client tasks into two separate pools.

Testing: add a unit test which reproduces the issue, and passes with the
fix applied.

Change-Id: I38379eeaf7516d432708c2a2a285839f96c86d4f
---
M src/kudu/rpc/messenger.cc
M src/kudu/rpc/messenger.h
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/rpc-test-base.h
M src/kudu/rpc/rpc-test.cc
5 files changed, 61 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/77/7177/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I38379eeaf7516d432708c2a2a285839f96c86d4f
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>

[kudu-CR] KUDU-2041: Fix negotiation deadlock

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-2041: Fix negotiation deadlock
......................................................................


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7177/1/src/kudu/rpc/messenger.cc
File src/kudu/rpc/messenger.cc:

PS1, Line 420:   CHECK_OK(ThreadPoolBuilder("client-negotiator")
             :               .set_min_threads(bld.min_negotiation_threads_)
             :               .set_max_threads(bld.max_negotiation_threads_)
             :               .Build(&client_negotiation_pool_));
             :   CHECK_OK(ThreadPoolBuilder("server-negotiator")
             :       .set_min_threads(bld.min_negotiation_threads_)
             :       .set_max_threads(bld.max_negotiation_threads_)
             :       .Build(&server_negotiation_pool_));
nit: mind making the indentation consistent between these two? we aren't super consistent about it cross the codebase, but at least for these two "parallel" calls it would be good to avoid the "huh? what's different here?" reaction that I had on first glance


PS1, Line 493:   if (dir == Connection::CLIENT) return client_negotiation_pool_.get();
             :   if (dir == Connection::SERVER) return server_negotiation_pool_.get();
             :  
I believe if you use a switch here with no 'default' case, then we'd get a warning if we didn't cover all the cases.


http://gerrit.cloudera.org:8080/#/c/7177/1/src/kudu/rpc/rpc-test-base.h
File src/kudu/rpc/rpc-test-base.h:

Line 530:     DoStartTestServer<GenericCalculatorService>(server_addr, enable_ssl, messenger);
> warning: parameter 'messenger' is passed by value and only copied once; con
yea, that or make it a const ref?


PS1, Line 562: messenger
std::move it?


-- 
To view, visit http://gerrit.cloudera.org:8080/7177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38379eeaf7516d432708c2a2a285839f96c86d4f
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2041: Fix negotiation deadlock

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7177

to look at the new patch set (#2).

Change subject: KUDU-2041: Fix negotiation deadlock
......................................................................

KUDU-2041: Fix negotiation deadlock

With N threads in the negotiation threadpool, N or more concurrent
client negotiation attempts could starve any incoming server negotiation
tasks which used the same threadpool.

If the set of negotiation attempts forms a graph with a N cycles, the
negotiation could deadlock (at least until the negotiation timeout
expires) as all nodes in the system wait for a server request to
complete, but all nodes have dedicated all their resources to client
requests.

Fix: split the server and client tasks into two separate pools.

Testing: add a unit test which reproduces the issue, and passes with the
fix applied.

Change-Id: I38379eeaf7516d432708c2a2a285839f96c86d4f
---
M src/kudu/rpc/messenger.cc
M src/kudu/rpc/messenger.h
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/rpc-test-base.h
M src/kudu/rpc/rpc-test.cc
5 files changed, 65 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/77/7177/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I38379eeaf7516d432708c2a2a285839f96c86d4f
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2041: Fix negotiation deadlock

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-2041: Fix negotiation deadlock
......................................................................


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38379eeaf7516d432708c2a2a285839f96c86d4f
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-2041: Fix negotiation deadlock

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-2041: Fix negotiation deadlock
......................................................................


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38379eeaf7516d432708c2a2a285839f96c86d4f
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-2041: Fix negotiation deadlock

Posted by "Henry Robinson (Code Review)" <ge...@cloudera.org>.
Henry Robinson has posted comments on this change.

Change subject: KUDU-2041: Fix negotiation deadlock
......................................................................


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7177/1/src/kudu/rpc/messenger.cc
File src/kudu/rpc/messenger.cc:

PS1, Line 420:   CHECK_OK(ThreadPoolBuilder("client-negotiator")
             :               .set_min_threads(bld.min_negotiation_threads_)
             :               .set_max_threads(bld.max_negotiation_threads_)
             :               .Build(&client_negotiation_pool_));
             :   CHECK_OK(ThreadPoolBuilder("server-negotiator")
             :       .set_min_threads(bld.min_negotiation_threads_)
             :       .set_max_threads(bld.max_negotiation_threads_)
             :       .Build(&server_negotiation_pool_));
> nit: mind making the indentation consistent between these two? we aren't su
Done


PS1, Line 493:   if (dir == Connection::CLIENT) return client_negotiation_pool_.get();
             :   if (dir == Connection::SERVER) return server_negotiation_pool_.get();
             :  
> I believe if you use a switch here with no 'default' case, then we'd get a 
Done. The DCHECK is still useful - I think? - because a) it's only a warning and b) gcc still warns even if all switch cases return a value that we need to return something.


http://gerrit.cloudera.org:8080/#/c/7177/1/src/kudu/rpc/rpc-test-base.h
File src/kudu/rpc/rpc-test-base.h:

Line 530:     DoStartTestServer<GenericCalculatorService>(server_addr, enable_ssl, messenger);
> yea, that or make it a const ref?
Done


Line 557:       std::shared_ptr<Messenger> messenger = nullptr) {
> warning: the parameter 'messenger' is copied for each invocation but only u
Done


PS1, Line 562: messenger
> std::move it?
Made argument a const ref, so should copy here.


-- 
To view, visit http://gerrit.cloudera.org:8080/7177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38379eeaf7516d432708c2a2a285839f96c86d4f
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2041: Fix negotiation deadlock

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-2041: Fix negotiation deadlock
......................................................................


KUDU-2041: Fix negotiation deadlock

With N threads in the negotiation threadpool, N or more concurrent
client negotiation attempts could starve any incoming server negotiation
tasks which used the same threadpool.

If the set of negotiation attempts forms a graph with a N cycles, the
negotiation could deadlock (at least until the negotiation timeout
expires) as all nodes in the system wait for a server request to
complete, but all nodes have dedicated all their resources to client
requests.

Fix: split the server and client tasks into two separate pools.

Testing: add a unit test which reproduces the issue, and passes with the
fix applied.

Change-Id: I38379eeaf7516d432708c2a2a285839f96c86d4f
Reviewed-on: http://gerrit.cloudera.org:8080/7177
Reviewed-by: Todd Lipcon <to...@apache.org>
Tested-by: Kudu Jenkins
---
M src/kudu/rpc/messenger.cc
M src/kudu/rpc/messenger.h
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/rpc-test-base.h
M src/kudu/rpc/rpc-test.cc
5 files changed, 65 insertions(+), 10 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I38379eeaf7516d432708c2a2a285839f96c86d4f
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>