You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2019/04/05 22:07:20 UTC

[kudu-CR] generic iterators: pass rowset bounds into grouping iterators

Hello Mike Percy, Todd Lipcon,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/12946

to review the following change.


Change subject: generic_iterators: pass rowset bounds into grouping iterators
......................................................................

generic_iterators: pass rowset bounds into grouping iterators

The rowset bounds will be used to reduce MergeIterator memory consumption by
restricting the "eager RowBlock fetching" behavior at Init time to only
those sub-iterators whose rowsets could conceivably participate in the
beginning of the merge.

Normally the rowset bounds are encoded, but they need to be decoded upon
arrival because they're going to be compared against decoded rows returned
from sub-iterators, and it's far more efficient to decode the bounds once
when initializing the MergeIterator than to encode the rows repeatedly
during the merge.

Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
---
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/generic_iterators.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
7 files changed, 174 insertions(+), 84 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/46/12946/1
-- 
To view, visit http://gerrit.cloudera.org:8080/12946
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
Gerrit-Change-Number: 12946
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] generic iterators: pass rowset bounds into grouping iterators

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12946 )

Change subject: generic_iterators: pass rowset bounds into grouping iterators
......................................................................


Patch Set 2: Verified+1

Overriding Jenkins, super flaky Java test failed three times.


-- 
To view, visit http://gerrit.cloudera.org:8080/12946
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
Gerrit-Change-Number: 12946
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 08 Apr 2019 05:15:33 +0000
Gerrit-HasComments: No

[kudu-CR] generic iterators: pass rowset bounds into grouping iterators

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12946

to look at the new patch set (#4).

Change subject: generic_iterators: pass rowset bounds into grouping iterators
......................................................................

generic_iterators: pass rowset bounds into grouping iterators

The rowset bounds will be used to reduce MergeIterator memory consumption by
restricting the "eager RowBlock fetching" behavior at Init time to only
those sub-iterators whose rowsets could conceivably participate in the
beginning of the merge.

Normally the rowset bounds are encoded, but they need to be decoded upon
arrival because they're going to be compared against decoded rows returned
from sub-iterators, and it's far more efficient to decode the bounds once
when initializing the MergeIterator than to encode the rows repeatedly
during the merge.

Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
---
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/generic_iterators.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
7 files changed, 172 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/46/12946/4
-- 
To view, visit http://gerrit.cloudera.org:8080/12946
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
Gerrit-Change-Number: 12946
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] generic iterators: pass rowset bounds into grouping iterators

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has removed Kudu Jenkins from this change.  ( http://gerrit.cloudera.org:8080/12946 )

Change subject: generic_iterators: pass rowset bounds into grouping iterators
......................................................................


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/12946
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
Gerrit-Change-Number: 12946
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] generic iterators: pass rowset bounds into grouping iterators

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/12946 )

Change subject: generic_iterators: pass rowset bounds into grouping iterators
......................................................................


Patch Set 7: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/12946
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
Gerrit-Change-Number: 12946
Gerrit-PatchSet: 7
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Sat, 13 Apr 2019 04:27:34 +0000
Gerrit-HasComments: No

[kudu-CR] generic iterators: pass rowset bounds into grouping iterators

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12946

to look at the new patch set (#7).

Change subject: generic_iterators: pass rowset bounds into grouping iterators
......................................................................

generic_iterators: pass rowset bounds into grouping iterators

The rowset bounds will be used to reduce MergeIterator memory consumption by
restricting the "eager RowBlock fetching" behavior at Init time to only
those sub-iterators whose rowsets could conceivably participate in the
beginning of the merge.

Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
---
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/generic_iterators.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
7 files changed, 171 insertions(+), 83 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/46/12946/7
-- 
To view, visit http://gerrit.cloudera.org:8080/12946
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
Gerrit-Change-Number: 12946
Gerrit-PatchSet: 7
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] generic iterators: pass rowset bounds into grouping iterators

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12946

to look at the new patch set (#2).

Change subject: generic_iterators: pass rowset bounds into grouping iterators
......................................................................

generic_iterators: pass rowset bounds into grouping iterators

The rowset bounds will be used to reduce MergeIterator memory consumption by
restricting the "eager RowBlock fetching" behavior at Init time to only
those sub-iterators whose rowsets could conceivably participate in the
beginning of the merge.

Normally the rowset bounds are encoded, but they need to be decoded upon
arrival because they're going to be compared against decoded rows returned
from sub-iterators, and it's far more efficient to decode the bounds once
when initializing the MergeIterator than to encode the rows repeatedly
during the merge.

Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
---
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/generic_iterators.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
7 files changed, 172 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/46/12946/2
-- 
To view, visit http://gerrit.cloudera.org:8080/12946
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
Gerrit-Change-Number: 12946
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] generic iterators: pass rowset bounds into grouping iterators

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12946 )

Change subject: generic_iterators: pass rowset bounds into grouping iterators
......................................................................

generic_iterators: pass rowset bounds into grouping iterators

The rowset bounds will be used to reduce MergeIterator memory consumption by
restricting the "eager RowBlock fetching" behavior at Init time to only
those sub-iterators whose rowsets could conceivably participate in the
beginning of the merge.

Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
Reviewed-on: http://gerrit.cloudera.org:8080/12946
Reviewed-by: Mike Percy <mp...@apache.org>
Tested-by: Kudu Jenkins
---
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/generic_iterators.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
7 files changed, 171 insertions(+), 83 deletions(-)

Approvals:
  Mike Percy: Looks good to me, approved
  Kudu Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/12946
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
Gerrit-Change-Number: 12946
Gerrit-PatchSet: 8
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] generic iterators: pass rowset bounds into grouping iterators

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/12946 )

Change subject: generic_iterators: pass rowset bounds into grouping iterators
......................................................................


Patch Set 6: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12946/6/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

http://gerrit.cloudera.org:8080/#/c/12946/6/src/kudu/tablet/rowset.h@153
PS6, Line 153:   virtual Status NewRowIteratorWithBounds(const RowIteratorOptions& opts,
I'm curious: why did you put the implementation in the header file? NewRowIterator() is not inlined.


http://gerrit.cloudera.org:8080/#/c/12946/6/src/kudu/tablet/rowset.h@167
PS6, Line 167: std::move(encoded_bounds);
nit: would be more terse to say:

  out->encoded_bounds = std::make_pair(std::move(lower), std::move(upper));



-- 
To view, visit http://gerrit.cloudera.org:8080/12946
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
Gerrit-Change-Number: 12946
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Sat, 13 Apr 2019 00:45:20 +0000
Gerrit-HasComments: Yes

[kudu-CR] generic iterators: pass rowset bounds into grouping iterators

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12946 )

Change subject: generic_iterators: pass rowset bounds into grouping iterators
......................................................................


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12946/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12946/6//COMMIT_MSG@14
PS6, Line 14: Normally the rowset bounds are encoded, but they need to be decoded upon
> I'm confused about this part. Is this comment still relevant to this patch,
I was trying to draw attention to the difference between the bounds retrieved from RowSet::GetBounds, which are encoded, and the rows retrieved from sub-iterators, which are decoded. I'm trying to say that it's more efficient to decode the encoded bounds once and store them that way than it is to encode the retrieved rows during the merge.

You're right that some of this explanation is not relevant to this patch because, since writing this, I moved the decoding into a different patch. I'll update it.


http://gerrit.cloudera.org:8080/#/c/12946/6/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

http://gerrit.cloudera.org:8080/#/c/12946/6/src/kudu/tablet/rowset.h@153
PS6, Line 153:   virtual Status NewRowIteratorWithBounds(const RowIteratorOptions& opts,
> I'm curious: why did you put the implementation in the header file? NewRowI
If there was a good reason, I can't remember it. I'll move it out. Not even sure why it's a virtual function.


http://gerrit.cloudera.org:8080/#/c/12946/6/src/kudu/tablet/rowset.h@167
PS6, Line 167: std::move(encoded_bounds);
> nit: would be more terse to say:
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12946
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
Gerrit-Change-Number: 12946
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Sat, 13 Apr 2019 04:12:24 +0000
Gerrit-HasComments: Yes

[kudu-CR] generic iterators: pass rowset bounds into grouping iterators

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/12946 )

Change subject: generic_iterators: pass rowset bounds into grouping iterators
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12946/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12946/6//COMMIT_MSG@14
PS6, Line 14: Normally the rowset bounds are encoded, but they need to be decoded upon
I'm confused about this part. Is this comment still relevant to this patch, where we are passing the encoded bounds into the IterWithBounds?



-- 
To view, visit http://gerrit.cloudera.org:8080/12946
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id1aebd84507f87c869d781e117225c37c8f15969
Gerrit-Change-Number: 12946
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Sat, 13 Apr 2019 00:49:15 +0000
Gerrit-HasComments: Yes