You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Will Berkeley (Code Review)" <ge...@cloudera.org> on 2017/09/15 17:12:33 UTC

[kudu-CR] Fix tablet state metrics race

Will Berkeley has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8082

Change subject: Fix tablet state metrics race
......................................................................

Fix tablet state metrics race

6fef9ca5c9d1dcf571795c514cb9fecc493d0714 added tablet state summary
metrics by adding a struct with cached counts that was periodically
refreshed by walking the tablet manager, but also added a race
between reading the cached counts and refreshing them. This fixes the
race by centralizing reading and writing the counts in the tablet
manager and protecting them with the tablet manager's lock.

Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
---
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/ts_tablet_manager_metrics.cc
M src/kudu/tserver/ts_tablet_manager_metrics.h
4 files changed, 141 insertions(+), 151 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/8082/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>

[kudu-CR] Fix tablet state metrics race

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/8082

to look at the new patch set (#4).

Change subject: Fix tablet state metrics race
......................................................................

Fix tablet state metrics race

6fef9ca5c9d1dcf571795c514cb9fecc493d0714 added tablet state summary
metrics by adding a struct with cached counts that was periodically
refreshed by walking the tablet manager, but also added a race
between reading the cached counts and refreshing them. This fixes the
race by centralizing reading and writing the counts in the tablet
manager and protecting them with the tablet manager's lock.

Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
---
M src/kudu/tserver/CMakeLists.txt
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
D src/kudu/tserver/ts_tablet_manager_metrics.cc
D src/kudu/tserver/ts_tablet_manager_metrics.h
5 files changed, 106 insertions(+), 273 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/8082/4
-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] Fix tablet state metrics race

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: Fix tablet state metrics race
......................................................................


Patch Set 4:

> lgtm, but is there a test you can reuse/easily add to?

The original patch (https://gerrit.cloudera.org/#/c/7980/) has coverage in tablet_copy-itest.

-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] Fix tablet state metrics race

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: Fix tablet state metrics race
......................................................................


Patch Set 4: Code-Review+2

sgtm then, thanks

-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] Fix tablet state metrics race

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: Fix tablet state metrics race
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8082/3/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 202:     tablet_state_counts_(),
> warning: initializer for member 'tablet_state_counts_' is redundant [readab
Done


http://gerrit.cloudera.org:8080/#/c/8082/3/src/kudu/tserver/ts_tablet_manager.h
File src/kudu/tserver/ts_tablet_manager.h:

Line 20: #include <functional>
> warning: #includes are not sorted properly [llvm-include-order]
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] Fix tablet state metrics race

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/8082

to look at the new patch set (#2).

Change subject: Fix tablet state metrics race
......................................................................

Fix tablet state metrics race

6fef9ca5c9d1dcf571795c514cb9fecc493d0714 added tablet state summary
metrics by adding a struct with cached counts that was periodically
refreshed by walking the tablet manager, but also added a race
between reading the cached counts and refreshing them. This fixes the
race by centralizing reading and writing the counts in the tablet
manager and protecting them with the tablet manager's lock.

Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
---
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/ts_tablet_manager_metrics.cc
M src/kudu/tserver/ts_tablet_manager_metrics.h
4 files changed, 194 insertions(+), 195 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/8082/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] Fix tablet state metrics race

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has submitted this change and it was merged.

Change subject: Fix tablet state metrics race
......................................................................


Fix tablet state metrics race

6fef9ca5c9d1dcf571795c514cb9fecc493d0714 added tablet state summary
metrics by adding a struct with cached counts that was periodically
refreshed by walking the tablet manager, but also added a race
between reading the cached counts and refreshing them. This fixes the
race by centralizing reading and writing the counts in the tablet
manager and protecting them with the tablet manager's lock.

Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Reviewed-on: http://gerrit.cloudera.org:8080/8082
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <to...@apache.org>
Reviewed-by: David Ribeiro Alves <da...@gmail.com>
---
M src/kudu/tserver/CMakeLists.txt
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
D src/kudu/tserver/ts_tablet_manager_metrics.cc
D src/kudu/tserver/ts_tablet_manager_metrics.h
5 files changed, 106 insertions(+), 273 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Todd Lipcon: Looks good to me, but someone else must approve
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] Fix tablet state metrics race

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: Fix tablet state metrics race
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8082/2/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 1246: int TSTabletManager::NumNotInitializedTablets() {
instead of all these separate functions, why not Bind directly to RefreshTabletStateCacheAndReturnCount?

ie attach the metric to Bind(&TSTabletManager::RefreshTabletStateCacheAndReturnCount, Unretained(this), tablet::NOT_INITIALIZED)


PS2, Line 1287: num_not_initialized
would this code be simpler to just use an unordered_map<TabletStatePB, int> instead of separate fields for each?


http://gerrit.cloudera.org:8080/#/c/8082/2/src/kudu/tserver/ts_tablet_manager.h
File src/kudu/tserver/ts_tablet_manager.h:

Line 313:   // The caller should hold lock_.
it doesn't seem like the caller holds lock


-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] Fix tablet state metrics race

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: Fix tablet state metrics race
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8082/1/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 1247: int TSTabletManager::NumNotInitializedTablets() {
> Can you borrow from an earlier approach where RefreshTabletStateCacheUnlock
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] Fix tablet state metrics race

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: Fix tablet state metrics race
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8082/2/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 1246:   DataDirManager* dd_manager = fs_manager_->dd_manager();
> instead of all these separate functions, why not Bind directly to RefreshTa
Done


PS2, Line 1287: 
> would this code be simpler to just use an unordered_map<TabletStatePB, int>
Done


http://gerrit.cloudera.org:8080/#/c/8082/2/src/kudu/tserver/ts_tablet_manager.h
File src/kudu/tserver/ts_tablet_manager.h:

Line 313:   typedef std::unordered_map<std::string, scoped_refptr<tablet::TabletReplica> > TabletMap;
> it doesn't seem like the caller holds lock
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] Fix tablet state metrics race

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: Fix tablet state metrics race
......................................................................


Patch Set 4:

lgtm, but is there a test you can reuse/easily add to?

-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] Fix tablet state metrics race

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: Fix tablet state metrics race
......................................................................


Patch Set 4: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] Fix tablet state metrics race

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/8082

to look at the new patch set (#3).

Change subject: Fix tablet state metrics race
......................................................................

Fix tablet state metrics race

6fef9ca5c9d1dcf571795c514cb9fecc493d0714 added tablet state summary
metrics by adding a struct with cached counts that was periodically
refreshed by walking the tablet manager, but also added a race
between reading the cached counts and refreshing them. This fixes the
race by centralizing reading and writing the counts in the tablet
manager and protecting them with the tablet manager's lock.

Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
---
M src/kudu/tserver/CMakeLists.txt
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
D src/kudu/tserver/ts_tablet_manager_metrics.cc
D src/kudu/tserver/ts_tablet_manager_metrics.h
5 files changed, 106 insertions(+), 272 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/8082/3
-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] Fix tablet state metrics race

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: Fix tablet state metrics race
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8082/1/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 1247: int TSTabletManager::NumNotInitializedTablets() {
Can you borrow from an earlier approach where RefreshTabletStateCacheUnlocked:
1) Took the lock, and
2) Returned the specific metric asked for.

That way each of these methods can be just one line.


-- 
To view, visit http://gerrit.cloudera.org:8080/8082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I88585c04692a6dc168b3750be9888769240cf3f6
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes