Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2017/11/21 07:27:02 UTC

[kudu-CR] maintenance manager: fix a deadlock on shutdown

Hello Mike Percy, Andrew Wong,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/8616

to review the following change.


Change subject: maintenance_manager: fix a deadlock on shutdown
......................................................................

maintenance_manager: fix a deadlock on shutdown

The shutdown sequence of the tablet server first shuts down the maintenance
manager and then calls Unregister() on the registered ops.

This produced a potential hang on shutdown, since the 'Shutdown()' call could
run at the same time that some maintenance ops were waiting on the thread_pool_
queue. Those waiting functions would be removed from the queue silently. We
depend on the functions running to decrement the 'running_' count of the associated
op, so when they were removed silently, the 'Unregister()' call could block forever
waiting for the 'running_' count to go to 0.
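
The hang described above can be reproduced in miniature. The following is a
minimal sketch, not Kudu code: TinyPool, UnregisterCompletes() and the local
'running' counter are hypothetical stand-ins for thread_pool_, Unregister()
and 'running_'. The "drain before shutdown" path is one plausible way to avoid
the hang and is an assumption for illustration, not necessarily what this
patch does. A timed wait stands in for the wait that would otherwise block
forever.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical one-worker pool (not Kudu's ThreadPool). Shutdown() discards
// queued tasks without running them.
class TinyPool {
 public:
  TinyPool() : worker_([this] { Loop(); }) {}
  ~TinyPool() { Shutdown(); }

  void Submit(std::function<void()> task) {
    std::lock_guard<std::mutex> l(mu_);
    queue_.push_back(std::move(task));
    work_cv_.notify_one();
  }

  // Blocks until the queue is empty and the worker is idle.
  void Wait() {
    std::unique_lock<std::mutex> l(mu_);
    idle_cv_.wait(l, [this] { return queue_.empty() && !busy_; });
  }

  // Silently drops queued tasks, then joins the worker.
  void Shutdown() {
    {
      std::lock_guard<std::mutex> l(mu_);
      stopped_ = true;
      queue_.clear();
    }
    work_cv_.notify_one();
    if (worker_.joinable()) worker_.join();
  }

  bool stopped() {
    std::lock_guard<std::mutex> l(mu_);
    return stopped_;
  }

 private:
  void Loop() {
    std::unique_lock<std::mutex> l(mu_);
    while (true) {
      work_cv_.wait(l, [this] { return stopped_ || !queue_.empty(); });
      if (stopped_) return;
      std::function<void()> task = std::move(queue_.front());
      queue_.pop_front();
      busy_ = true;
      l.unlock();
      task();
      l.lock();
      busy_ = false;
      idle_cv_.notify_all();
    }
  }

  std::mutex mu_;
  std::condition_variable work_cv_, idle_cv_;
  std::deque<std::function<void()>> queue_;
  bool stopped_ = false;
  bool busy_ = false;
  std::thread worker_;  // Declared last: starts after the other members exist.
};

// Returns true if an Unregister()-style wait for the count to reach 0 would
// complete, false if it would hang (detected with a timeout in this sketch).
bool UnregisterCompletes(bool drain_before_shutdown) {
  std::mutex mu;
  std::condition_variable cv;
  int running = 0;  // Models the op's 'running_' count.
  std::atomic<bool> release{false};

  TinyPool pool;
  // Occupy the only worker so that the next task stays on the queue.
  pool.Submit([&] {
    while (!release.load() && !pool.stopped()) {
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
  });
  // Schedule the op: the count goes up at enqueue time, and only the task
  // body brings it back down.
  {
    std::lock_guard<std::mutex> l(mu);
    ++running;
  }
  pool.Submit([&] {
    std::lock_guard<std::mutex> l(mu);
    --running;
    cv.notify_all();
  });

  if (drain_before_shutdown) {
    release = true;
    pool.Wait();  // Every queued task runs before the pool is torn down.
    assert(running == 0);
  }
  pool.Shutdown();  // Without the drain, the queued op is dropped here.

  std::unique_lock<std::mutex> l(mu);
  return cv.wait_for(l, std::chrono::milliseconds(200),
                     [&] { return running == 0; });
}
```

In this sketch, UnregisterCompletes(false) models the reported hang: the
queued task is discarded, the count stays at 1, and the wait times out.
UnregisterCompletes(true) lets the queued task run first, so the count
reaches 0 and the wait returns immediately.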

This caused about 0.5% of runs of the new stop-tablet-itest
'TestShutdownWhileWriting' test case to time out. With this fix, no runs time out.

Change-Id: Icaf864299bfd43212bc9655f48128851b9c1d59b
---
M src/kudu/util/maintenance_manager.cc
1 file changed, 5 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/16/8616/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8616
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Icaf864299bfd43212bc9655f48128851b9c1d59b
Gerrit-Change-Number: 8616
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] maintenance manager: fix a deadlock on shutdown

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8616 )

Change subject: maintenance_manager: fix a deadlock on shutdown
......................................................................

maintenance_manager: fix a deadlock on shutdown

The shutdown sequence of the tablet server first shuts down the maintenance
manager and then calls Unregister() on the registered ops.

This produced a potential hang on shutdown, since the 'Shutdown()' call could
run at the same time that some maintenance ops were waiting on the thread_pool_
queue. Those waiting functions would be removed from the queue silently. We
depend on the functions running to decrement the 'running_' count of the associated
op, so when they were removed silently, the 'Unregister()' call could block forever
waiting for the 'running_' count to go to 0.

This caused about 0.5% of runs of the new stop-tablet-itest
'TestShutdownWhileWriting' test case to time out. With this fix, no runs time out.

Change-Id: Icaf864299bfd43212bc9655f48128851b9c1d59b
Reviewed-on: http://gerrit.cloudera.org:8080/8616
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy <mp...@apache.org>
---
M src/kudu/util/maintenance_manager.cc
1 file changed, 5 insertions(+), 0 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Mike Percy: Looks good to me, approved


Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Icaf864299bfd43212bc9655f48128851b9c1d59b
Gerrit-Change-Number: 8616
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] maintenance manager: fix a deadlock on shutdown

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8616 )

Change subject: maintenance_manager: fix a deadlock on shutdown
......................................................................


Patch Set 1: Code-Review+2



Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icaf864299bfd43212bc9655f48128851b9c1d59b
Gerrit-Change-Number: 8616
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 08:09:22 +0000
Gerrit-HasComments: No