You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2017/11/23 00:16:48 UTC

[kudu-CR] make registration of maintenance ops thread-safe

Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8635


Change subject: make registration of maintenance ops thread-safe
......................................................................

make registration of maintenance ops thread-safe

Before, access to tablet replicas' maintenance ops was not thread-safe.
This was OK because the TSTabletManager externally synchronizes the
initialization and shutdown of replicas, enabling relatively lax locking
for the ops.

In the future, there will be calls that access the maintenance ops
outside of initialization and shutdown, e.g. from some external thread
to handle disk failures by canceling the affected tablets' maintenance
ops. To do so, more stringent locking is needed.

This patch uses TabletReplica's lock_ to synchronize access to the ops.

Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
---
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
2 files changed, 24 insertions(+), 11 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/8635/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
Gerrit-Change-Number: 8635
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>

[kudu-CR] make registration of maintenance ops thread-safe

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8635 )

Change subject: make registration of maintenance ops thread-safe
......................................................................

make registration of maintenance ops thread-safe

Before, access to tablet replicas' maintenance ops was not thread-safe.
This was OK because the TSTabletManager externally synchronizes the
initialization and shutdown of replicas, enabling relatively lax locking
for the ops.

In the future, there will be calls that access the maintenance ops
outside of initialization and shutdown, e.g. from some external thread
to handle disk failures by canceling the affected tablets' maintenance
ops. To do so, more stringent locking is needed.

This patch uses TabletReplica's lock_ to synchronize access to the ops.

Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
Reviewed-on: http://gerrit.cloudera.org:8080/8635
Reviewed-by: Andrew Wong <aw...@cloudera.com>
Tested-by: Kudu Jenkins
---
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
2 files changed, 24 insertions(+), 13 deletions(-)

Approvals:
  Andrew Wong: Looks good to me, approved
  Kudu Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/8635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
Gerrit-Change-Number: 8635
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] make registration of maintenance ops thread-safe

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8635

to look at the new patch set (#3).

Change subject: make registration of maintenance ops thread-safe
......................................................................

make registration of maintenance ops thread-safe

Before, access to tablet replicas' maintenance ops was not thread-safe.
This was OK because the TSTabletManager externally synchronizes the
initialization and shutdown of replicas, enabling relatively lax locking
for the ops.

In the future, there will be calls that access the maintenance ops
outside of initialization and shutdown, e.g. from some external thread
to handle disk failures by canceling the affected tablets' maintenance
ops. To do so, more stringent locking is needed.

This patch uses TabletReplica's lock_ to synchronize access to the ops.

Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
---
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
2 files changed, 23 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/8635/3
-- 
To view, visit http://gerrit.cloudera.org:8080/8635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
Gerrit-Change-Number: 8635
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] make registration of maintenance ops thread-safe

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8635 )

Change subject: make registration of maintenance ops thread-safe
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8635/1/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/8635/1/src/kudu/tablet/tablet_replica.cc@693
PS1, Line 693: if (tablet)
why "if" ? Is it possible for RegisterMaintenanceOps() to be called after TabletReplica:;Stop() now?


http://gerrit.cloudera.org:8080/#/c/8635/1/src/kudu/tablet/tablet_replica.cc@703
PS1, Line 703: UnregisterMaintenanceOps
is it possible for multiple threads to call this function simultaneously? if so, then we could have a use-after-free



-- 
To view, visit http://gerrit.cloudera.org:8080/8635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
Gerrit-Change-Number: 8635
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 23 Nov 2017 00:38:23 +0000
Gerrit-HasComments: Yes

[kudu-CR] make registration of maintenance ops thread-safe

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Mike Percy, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8635

to look at the new patch set (#4).

Change subject: make registration of maintenance ops thread-safe
......................................................................

make registration of maintenance ops thread-safe

Before, access to tablet replicas' maintenance ops was not thread-safe.
This was OK because the TSTabletManager externally synchronizes the
initialization and shutdown of replicas, enabling relatively lax locking
for the ops.

In the future, there will be calls that access the maintenance ops
outside of initialization and shutdown, e.g. from some external thread
to handle disk failures by canceling the affected tablets' maintenance
ops. To do so, more stringent locking is needed.

This patch uses TabletReplica's lock_ to synchronize access to the ops.

Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
---
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
2 files changed, 24 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/8635/4
-- 
To view, visit http://gerrit.cloudera.org:8080/8635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
Gerrit-Change-Number: 8635
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] make registration of maintenance ops thread-safe

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8635

to look at the new patch set (#2).

Change subject: make registration of maintenance ops thread-safe
......................................................................

make registration of maintenance ops thread-safe

Before, access to tablet replicas' maintenance ops was not thread-safe.
This was OK because the TSTabletManager externally synchronizes the
initialization and shutdown of replicas, enabling relatively lax locking
for the ops.

In the future, there will be calls that access the maintenance ops
outside of initialization and shutdown, e.g. from some external thread
to handle disk failures by canceling the affected tablets' maintenance
ops. To do so, more stringent locking is needed.

This patch uses TabletReplica's lock_ to synchronize access to the ops.

Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
---
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
2 files changed, 24 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/35/8635/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
Gerrit-Change-Number: 8635
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] make registration of maintenance ops thread-safe

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8635 )

Change subject: make registration of maintenance ops thread-safe
......................................................................


Patch Set 3: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/8635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
Gerrit-Change-Number: 8635
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 23 Nov 2017 01:20:00 +0000
Gerrit-HasComments: No

[kudu-CR] make registration of maintenance ops thread-safe

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8635 )

Change subject: make registration of maintenance ops thread-safe
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8635/1/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/8635/1/src/kudu/tablet/tablet_replica.cc@693
PS1, Line 693: if (tablet)
> why "if" ? Is it possible for RegisterMaintenanceOps() to be called after T
Hrm, fair point, habit. Will remove.


http://gerrit.cloudera.org:8080/#/c/8635/1/src/kudu/tablet/tablet_replica.cc@703
PS1, Line 703: UnregisterMaintenanceOps
> is it possible for multiple threads to call this function simultaneously? i
It's not (since Stop() is single-threaded).

Just FYI, this is where else this is important: https://gerrit.cloudera.org/c/7442/12/src/kudu/tablet/tablet_replica.cc#746



-- 
To view, visit http://gerrit.cloudera.org:8080/8635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
Gerrit-Change-Number: 8635
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 23 Nov 2017 00:48:01 +0000
Gerrit-HasComments: Yes

[kudu-CR] make registration of maintenance ops thread-safe

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8635 )

Change subject: make registration of maintenance ops thread-safe
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8635/2/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/8635/2/src/kudu/tablet/tablet_replica.cc@705
PS2, Line 705:   vector<MaintenanceOp*> maintenance_ops;
             :   {
             :     std::lock_guard<simple_spinlock> l(lock_);
             :     maintenance_ops = maintenance_ops_;
             :   }
             :   for (MaintenanceOp* op : maintenance_ops) {
             :     op->Unregister();
             :   }
             : 
             :   std::lock_guard<simple_spinlock> l(lock_);
             :   STLDeleteElements(&maintenance_ops_);
> in order to avoid taking the lock twice, and worrying about some kind of AB
Oooh! Clever! Done



-- 
To view, visit http://gerrit.cloudera.org:8080/8635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
Gerrit-Change-Number: 8635
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 23 Nov 2017 01:08:33 +0000
Gerrit-HasComments: Yes

[kudu-CR] make registration of maintenance ops thread-safe

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8635 )

Change subject: make registration of maintenance ops thread-safe
......................................................................


Patch Set 4: Code-Review+2

fixed tidy warning. no other changes, carrying over Mike's +2


-- 
To view, visit http://gerrit.cloudera.org:8080/8635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
Gerrit-Change-Number: 8635
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 23 Nov 2017 02:49:03 +0000
Gerrit-HasComments: No

[kudu-CR] make registration of maintenance ops thread-safe

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8635 )

Change subject: make registration of maintenance ops thread-safe
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8635/2/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/8635/2/src/kudu/tablet/tablet_replica.cc@705
PS2, Line 705:   vector<MaintenanceOp*> maintenance_ops;
             :   {
             :     std::lock_guard<simple_spinlock> l(lock_);
             :     maintenance_ops = maintenance_ops_;
             :   }
             :   for (MaintenanceOp* op : maintenance_ops) {
             :     op->Unregister();
             :   }
             : 
             :   std::lock_guard<simple_spinlock> l(lock_);
             :   STLDeleteElements(&maintenance_ops_);
in order to avoid taking the lock twice, and worrying about some kind of ABA issue, how about moving ownership of the pointers to the local scope when taking the lock for the first time:

  vector<MaintenanceOp*> maintenance_ops;
  {
    std::lock_guard<simple_spinlock> l(lock_);
    maintenance_ops_.swap(maintenance_ops);
  }
  for (MaintenanceOp* op : maintenance_ops) {
    op->Unregister();
  }
  STLDeleteElements(&maintenance_ops);



-- 
To view, visit http://gerrit.cloudera.org:8080/8635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia13051fab85d0f678bd3efdcb69766d66a657cdd
Gerrit-Change-Number: 8635
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 23 Nov 2017 01:04:08 +0000
Gerrit-HasComments: Yes