Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2016/08/31 22:47:24 UTC

[kudu-CR] compaction policy: avoid O(n^2) calls to EstimateOnDiskSize

Hello David Ribeiro Alves,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/4191

to review the following change.

Change subject: compaction_policy: avoid O(n^2) calls to EstimateOnDiskSize
......................................................................

compaction_policy: avoid O(n^2) calls to EstimateOnDiskSize

In a cluster workload with a 130GB+ tablet, I found that the maintenance
manager scheduler thread was spending tens of seconds inside
RowSetInfo::CollectOrdered(), mostly inside calls to
EstimateOnDiskSize(). While no individual call is exceedingly slow,
each one involves many virtual function calls and potential CPU cache
misses, so the cost adds up.

Change-Id: Ic2949218d7f5fd822571a7b14d1d0b4430aeee1d
---
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
2 files changed, 13 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/4191/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic2949218d7f5fd822571a7b14d1d0b4430aeee1d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>

[kudu-CR] compaction policy: avoid O(n^2) calls to EstimateOnDiskSize

Posted by "Kudu Jenkins (Code Review)" <ge...@cloudera.org>.
Kudu Jenkins has posted comments on this change.

Change subject: compaction_policy: avoid O(n^2) calls to EstimateOnDiskSize
......................................................................


Patch Set 2:

Build Started http://104.196.14.100/job/kudu-gerrit/3244/


Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2949218d7f5fd822571a7b14d1d0b4430aeee1d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] compaction policy: avoid O(n^2) calls to EstimateOnDiskSize

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: compaction_policy: avoid O(n^2) calls to EstimateOnDiskSize
......................................................................


Patch Set 1:

LGTM, but could you add the numbers you obtained to the commit message?


Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2949218d7f5fd822571a7b14d1d0b4430aeee1d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] compaction policy: avoid O(n^2) calls to EstimateOnDiskSize

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: compaction_policy: avoid O(n^2) calls to EstimateOnDiskSize
......................................................................


compaction_policy: avoid O(n^2) calls to EstimateOnDiskSize

In a cluster workload with a 130GB+ tablet, I found that the maintenance
manager scheduler thread was spending tens of seconds inside
RowSetInfo::CollectOrdered(), mostly inside calls to
EstimateOnDiskSize(). While no individual call is exceedingly slow,
each one involves many virtual function calls and potential CPU cache
misses, so the cost adds up.

I deployed this patch on the cluster and found that the
MaintenanceManager 'FindBestOp' call went from ~16 seconds to ~350ms.

Change-Id: Ic2949218d7f5fd822571a7b14d1d0b4430aeee1d
Reviewed-on: http://gerrit.cloudera.org:8080/4191
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves <dr...@apache.org>
---
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
2 files changed, 13 insertions(+), 6 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Kudu Jenkins: Verified




Gerrit-MessageType: merged
Gerrit-Change-Id: Ic2949218d7f5fd822571a7b14d1d0b4430aeee1d
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] compaction policy: avoid O(n^2) calls to EstimateOnDiskSize

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/4191

to look at the new patch set (#2).

Change subject: compaction_policy: avoid O(n^2) calls to EstimateOnDiskSize
......................................................................

compaction_policy: avoid O(n^2) calls to EstimateOnDiskSize

In a cluster workload with a 130GB+ tablet, I found that the maintenance
manager scheduler thread was spending tens of seconds inside
RowSetInfo::CollectOrdered(), mostly inside calls to
EstimateOnDiskSize(). While no individual call is exceedingly slow,
each one involves many virtual function calls and potential CPU cache
misses, so the cost adds up.

I deployed this patch on the cluster and found that the
MaintenanceManager 'FindBestOp' call went from ~16 seconds to ~350ms.

Change-Id: Ic2949218d7f5fd822571a7b14d1d0b4430aeee1d
---
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
2 files changed, 13 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/4191/2

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic2949218d7f5fd822571a7b14d1d0b4430aeee1d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] compaction policy: avoid O(n^2) calls to EstimateOnDiskSize

Posted by "Kudu Jenkins (Code Review)" <ge...@cloudera.org>.
Kudu Jenkins has posted comments on this change.

Change subject: compaction_policy: avoid O(n^2) calls to EstimateOnDiskSize
......................................................................


Patch Set 1:

Build Started http://104.196.14.100/job/kudu-gerrit/3172/


Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2949218d7f5fd822571a7b14d1d0b4430aeee1d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No

[kudu-CR] compaction policy: avoid O(n^2) calls to EstimateOnDiskSize

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: compaction_policy: avoid O(n^2) calls to EstimateOnDiskSize
......................................................................


Patch Set 1:

I put this on the test cluster in question, and the FindBestOp call went from about 16 seconds to about 350ms.


Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2949218d7f5fd822571a7b14d1d0b4430aeee1d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] compaction policy: avoid O(n^2) calls to EstimateOnDiskSize

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: compaction_policy: avoid O(n^2) calls to EstimateOnDiskSize
......................................................................


Patch Set 2: Code-Review+2


Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2949218d7f5fd822571a7b14d1d0b4430aeee1d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No