You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2019/04/13 04:12:16 UTC

[kudu-CR] generic iterators: MergeIterator whole block copy optimization

Hello Mike Percy, Todd Lipcon,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/13011

to review the following change.


Change subject: generic_iterators: MergeIterator whole block copy optimization
......................................................................

generic_iterators: MergeIterator whole block copy optimization

This patch adds the "whole block copy" optimization to the MergeIterator.
The idea is simple: whenever there's only one sub-iterator in the merge
window, we can copy the entire block out of that sub-iterator instead of
copying row-by-row.

The challenging aspect was properly handling the client's SelectionVector:
- When copying row-by-row, the MergeIterator must skip deselected rows in
  order to always return the next smallest row. The MergeIterState
  bookkeeping helped enforce this invariant.
- When copying block-by-block, skipping deselected rows is harder (and
  potentially less performant) than the simple bitmap copy we currently do.
  Plus it's not necessary for correctness; the scan endpoint will skip
  deselected rows when serializing the RowBlock.

So I opted to retain deselected rows in the block-by-block case, and updated
the MergeIterState bookkeeping to cope.

I also changed the default RowBlock sizes in various merge-related paths to
be a power of 2. This should increase the likelihood of hitting the
BitmapCopy "fast path" during a RowBlock copy, though it doesn't guarantee
it (e.g. consider a merge where two sub-iterators overlap for 63/128 rows,
then one sub-iterator has an additional 65 rows: the BitmapCopy operations
in the block copy could not be byte-aligned). It yielded a slight
improvement in microbenchmarks, though not in the macrobenchmark.

Finally, I tweaked the heuristic for whole-block-copying such that it only
happens when there's more than one row to copy. I don't totally buy the
rationale, but it did yield an improvement in both the micro and macro
benchmarks, so I must be onto something.

Below are the micro and macrobenchmark results. The microbenchmark was
generic_iterators-test's overlapping and non-overlapping MergeIterator tests
with 10 iterations, 1000 lists, and 10000 rows per list, collecting the
wallclock time to scan and averaging it across the runs. The macrobenchmark
was running 'kudu perf tablet_scan' on a representative 40GB tablet six
times, dropping the first time (to reduce cache effects), and averaging the
remaining wallclock times.

no whole-block-copy:
- non-overlapping: 0.9437
- overlapping: 9.5113
- representative tablet: 786.732

whole-block-copy, no power-of-2 RowBlocks
- non-overlapping: 0.6301
- overlapping: 9.7518
- representative tablet: 751.297

whole-block-copy, power-of-2 RowBlocks
- non-overlapping: 0.5316
- overlapping: 9.5718
- representative tablet: 754.961

whole-block-copy, power-of-2 RowBlocks, only copy when >1 rows left:
- non-overlapping: 0.5247
- overlapping: 9.1287
- representative tablet: 720.534

Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
---
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/rowblock.h
M src/kudu/tserver/tablet_service.cc
4 files changed, 126 insertions(+), 62 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/13011/1
-- 
To view, visit http://gerrit.cloudera.org:8080/13011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
Gerrit-Change-Number: 13011
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] generic iterators: MergeIterator whole block copy optimization

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13011 )

Change subject: generic_iterators: MergeIterator whole block copy optimization
......................................................................

generic_iterators: MergeIterator whole block copy optimization

This patch adds the "whole block copy" optimization to the MergeIterator.
The idea is simple: whenever there's only one sub-iterator in the merge
window, we can copy the entire block out of that sub-iterator instead of
copying row-by-row.

The challenging aspect was properly handling the client's SelectionVector:
- When copying row-by-row, the MergeIterator must skip deselected rows in
  order to always return the next smallest row. The MergeIterState
  bookkeeping helped enforce this invariant.
- When copying block-by-block, skipping deselected rows is harder (and
  potentially less performant) than the simple bitmap copy we currently do.
  Plus it's not necessary for correctness; the scan endpoint will skip
  deselected rows when serializing the RowBlock.

So I opted to retain deselected rows in the block-by-block case, and updated
the MergeIterState bookkeeping to cope.

I also changed the default RowBlock sizes in various merge-related paths to
be a power of 2. This should increase the likelihood of hitting the
BitmapCopy "fast path" during a RowBlock copy, though it doesn't guarantee
it (e.g. consider a merge where two sub-iterators overlap for 63/128 rows,
then one sub-iterator has an additional 65 rows: the BitmapCopy operations
in the block copy could not be byte-aligned). It yielded a slight
improvement in microbenchmarks, though not in the macrobenchmark.

Finally, I tweaked the heuristic for whole-block-copying such that it only
happens when there's more than one row to copy. I don't totally buy the
rationale, but it did yield an improvement in both the micro and macro
benchmarks, so I must be onto something.

Below are the micro and macrobenchmark results. The microbenchmark was
generic_iterators-test's overlapping and non-overlapping MergeIterator tests
with 10 iterations, 1000 lists, and 10000 rows per list, collecting the
wallclock time to scan and averaging it across the runs. The macrobenchmark
was running 'kudu perf tablet_scan' on a representative 40GB tablet six
times, dropping the first time (to reduce cache effects), and averaging the
remaining wallclock times.

no whole-block-copy:
- non-overlapping: 0.9437
- overlapping: 9.5113
- representative tablet: 786.732

whole-block-copy, no power-of-2 RowBlocks
- non-overlapping: 0.6301
- overlapping: 9.7518
- representative tablet: 751.297

whole-block-copy, power-of-2 RowBlocks
- non-overlapping: 0.5316
- overlapping: 9.5718
- representative tablet: 754.961

whole-block-copy, power-of-2 RowBlocks, only copy when >1 rows left:
- non-overlapping: 0.5247
- overlapping: 9.1287
- representative tablet: 720.534

Todd and I discussed another possible optimization which I want to note here
for posterity: rather than copying blocks, we could "attach" the data to the
client's RowbBlock and serialize it to the wire directly. The attachment
could be RowBlock-based by allowing RowBlocks to work like a refcounted
"iovec", or it could be done more deeply to enable direct serialization of
cached decoder data. Either way, anything that helps us avoid copying data
is likely to be a performance win.

Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
Reviewed-on: http://gerrit.cloudera.org:8080/13011
Reviewed-by: Mike Percy <mp...@apache.org>
Tested-by: Adar Dembo <ad...@cloudera.com>
---
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/rowblock.h
M src/kudu/tserver/tablet_service.cc
4 files changed, 129 insertions(+), 62 deletions(-)

Approvals:
  Mike Percy: Looks good to me, approved
  Adar Dembo: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/13011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
Gerrit-Change-Number: 13011
Gerrit-PatchSet: 7
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] generic iterators: MergeIterator whole block copy optimization

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/13011 )

Change subject: generic_iterators: MergeIterator whole block copy optimization
......................................................................


Patch Set 5: Code-Review+1

(3 comments)

Nice work

http://gerrit.cloudera.org:8080/#/c/13011/5/src/kudu/common/generic_iterators.cc
File src/kudu/common/generic_iterators.cc:

http://gerrit.cloudera.org:8080/#/c/13011/5/src/kudu/common/generic_iterators.cc@262
PS5, Line 262:       read_block_->selection_vector()->FindFirstRowSelected(next_row_idx_, &idx)) {
Is this the additional bookkeeping you are referring to in the commit message?


http://gerrit.cloudera.org:8080/#/c/13011/5/src/kudu/common/generic_iterators.cc@792
PS5, Line 792: row
block


http://gerrit.cloudera.org:8080/#/c/13011/5/src/kudu/common/generic_iterators.cc@897
PS5, Line 897:  
nit: /*num_rows_to_advance=*/



-- 
To view, visit http://gerrit.cloudera.org:8080/13011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
Gerrit-Change-Number: 13011
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 17 Apr 2019 20:45:42 +0000
Gerrit-HasComments: Yes

[kudu-CR] generic iterators: MergeIterator whole block copy optimization

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13011 )

Change subject: generic_iterators: MergeIterator whole block copy optimization
......................................................................


Patch Set 6: Verified+1

Overriding Jenkins, unrelated test failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/13011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
Gerrit-Change-Number: 13011
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 17 Apr 2019 21:46:13 +0000
Gerrit-HasComments: No

[kudu-CR] generic iterators: MergeIterator whole block copy optimization

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/13011 )

Change subject: generic_iterators: MergeIterator whole block copy optimization
......................................................................


Patch Set 6: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13011/5/src/kudu/common/generic_iterators.cc
File src/kudu/common/generic_iterators.cc:

http://gerrit.cloudera.org:8080/#/c/13011/5/src/kudu/common/generic_iterators.cc@262
PS5, Line 262:       read_block_->selection_vector()->FindFirstRowSelected(next_row_idx_, &idx)) {
> No, this is a more efficient implementation of the existing code from L257-
Got it, thanks for clarifying.



-- 
To view, visit http://gerrit.cloudera.org:8080/13011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
Gerrit-Change-Number: 13011
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 17 Apr 2019 21:31:04 +0000
Gerrit-HasComments: Yes

[kudu-CR] generic iterators: MergeIterator whole block copy optimization

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13011 )

Change subject: generic_iterators: MergeIterator whole block copy optimization
......................................................................


Patch Set 4: Verified+1

Overriding Jenkins, unrelated (probably flaky?) test failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/13011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
Gerrit-Change-Number: 13011
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 15 Apr 2019 23:21:33 +0000
Gerrit-HasComments: No

[kudu-CR] generic iterators: MergeIterator whole block copy optimization

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13011

to look at the new patch set (#6).

Change subject: generic_iterators: MergeIterator whole block copy optimization
......................................................................

generic_iterators: MergeIterator whole block copy optimization

This patch adds the "whole block copy" optimization to the MergeIterator.
The idea is simple: whenever there's only one sub-iterator in the merge
window, we can copy the entire block out of that sub-iterator instead of
copying row-by-row.

The challenging aspect was properly handling the client's SelectionVector:
- When copying row-by-row, the MergeIterator must skip deselected rows in
  order to always return the next smallest row. The MergeIterState
  bookkeeping helped enforce this invariant.
- When copying block-by-block, skipping deselected rows is harder (and
  potentially less performant) than the simple bitmap copy we currently do.
  Plus it's not necessary for correctness; the scan endpoint will skip
  deselected rows when serializing the RowBlock.

So I opted to retain deselected rows in the block-by-block case, and updated
the MergeIterState bookkeeping to cope.

I also changed the default RowBlock sizes in various merge-related paths to
be a power of 2. This should increase the likelihood of hitting the
BitmapCopy "fast path" during a RowBlock copy, though it doesn't guarantee
it (e.g. consider a merge where two sub-iterators overlap for 63/128 rows,
then one sub-iterator has an additional 65 rows: the BitmapCopy operations
in the block copy could not be byte-aligned). It yielded a slight
improvement in microbenchmarks, though not in the macrobenchmark.

Finally, I tweaked the heuristic for whole-block-copying such that it only
happens when there's more than one row to copy. I don't totally buy the
rationale, but it did yield an improvement in both the micro and macro
benchmarks, so I must be onto something.

Below are the micro and macrobenchmark results. The microbenchmark was
generic_iterators-test's overlapping and non-overlapping MergeIterator tests
with 10 iterations, 1000 lists, and 10000 rows per list, collecting the
wallclock time to scan and averaging it across the runs. The macrobenchmark
was running 'kudu perf tablet_scan' on a representative 40GB tablet six
times, dropping the first time (to reduce cache effects), and averaging the
remaining wallclock times.

no whole-block-copy:
- non-overlapping: 0.9437
- overlapping: 9.5113
- representative tablet: 786.732

whole-block-copy, no power-of-2 RowBlocks
- non-overlapping: 0.6301
- overlapping: 9.7518
- representative tablet: 751.297

whole-block-copy, power-of-2 RowBlocks
- non-overlapping: 0.5316
- overlapping: 9.5718
- representative tablet: 754.961

whole-block-copy, power-of-2 RowBlocks, only copy when >1 rows left:
- non-overlapping: 0.5247
- overlapping: 9.1287
- representative tablet: 720.534

Todd and I discussed another possible optimization which I want to note here
for posterity: rather than copying blocks, we could "attach" the data to the
client's RowbBlock and serialize it to the wire directly. The attachment
could be RowBlock-based by allowing RowBlocks to work like a refcounted
"iovec", or it could be done more deeply to enable direct serialization of
cached decoder data. Either way, anything that helps us avoid copying data
is likely to be a performance win.

Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
---
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/rowblock.h
M src/kudu/tserver/tablet_service.cc
4 files changed, 129 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/13011/6
-- 
To view, visit http://gerrit.cloudera.org:8080/13011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
Gerrit-Change-Number: 13011
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] generic iterators: MergeIterator whole block copy optimization

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has removed Kudu Jenkins from this change.  ( http://gerrit.cloudera.org:8080/13011 )

Change subject: generic_iterators: MergeIterator whole block copy optimization
......................................................................


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/13011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
Gerrit-Change-Number: 13011
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] generic iterators: MergeIterator whole block copy optimization

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13011 )

Change subject: generic_iterators: MergeIterator whole block copy optimization
......................................................................


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13011/5/src/kudu/common/generic_iterators.cc
File src/kudu/common/generic_iterators.cc:

http://gerrit.cloudera.org:8080/#/c/13011/5/src/kudu/common/generic_iterators.cc@262
PS5, Line 262:       read_block_->selection_vector()->FindFirstRowSelected(next_row_idx_, &idx)) {
> Is this the additional bookkeeping you are referring to in the commit messa
No, this is a more efficient implementation of the existing code from L257-264.

The bookkeeping change I alluded to was the removal of rows_advanced_ and rows_valid_, because both operated in terms of selected rows only, but now we don't care as much about distinguishing between selected and deselected rows, so I removed them in favor of using next_row_idx_ for "where are we?" and read_block_->nrows() for "how many rows exist?".


http://gerrit.cloudera.org:8080/#/c/13011/5/src/kudu/common/generic_iterators.cc@792
PS5, Line 792: blo
> block
Done


http://gerrit.cloudera.org:8080/#/c/13011/5/src/kudu/common/generic_iterators.cc@897
PS5, Line 897:  
> nit: /*num_rows_to_advance=*/
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/13011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
Gerrit-Change-Number: 13011
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 17 Apr 2019 21:10:34 +0000
Gerrit-HasComments: Yes

[kudu-CR] generic iterators: MergeIterator whole block copy optimization

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13011

to look at the new patch set (#4).

Change subject: generic_iterators: MergeIterator whole block copy optimization
......................................................................

generic_iterators: MergeIterator whole block copy optimization

This patch adds the "whole block copy" optimization to the MergeIterator.
The idea is simple: whenever there's only one sub-iterator in the merge
window, we can copy the entire block out of that sub-iterator instead of
copying row-by-row.

The challenging aspect was properly handling the client's SelectionVector:
- When copying row-by-row, the MergeIterator must skip deselected rows in
  order to always return the next smallest row. The MergeIterState
  bookkeeping helped enforce this invariant.
- When copying block-by-block, skipping deselected rows is harder (and
  potentially less performant) than the simple bitmap copy we currently do.
  Plus it's not necessary for correctness; the scan endpoint will skip
  deselected rows when serializing the RowBlock.

So I opted to retain deselected rows in the block-by-block case, and updated
the MergeIterState bookkeeping to cope.

I also changed the default RowBlock sizes in various merge-related paths to
be a power of 2. This should increase the likelihood of hitting the
BitmapCopy "fast path" during a RowBlock copy, though it doesn't guarantee
it (e.g. consider a merge where two sub-iterators overlap for 63/128 rows,
then one sub-iterator has an additional 65 rows: the BitmapCopy operations
in the block copy could not be byte-aligned). It yielded a slight
improvement in microbenchmarks, though not in the macrobenchmark.

Finally, I tweaked the heuristic for whole-block-copying such that it only
happens when there's more than one row to copy. I don't totally buy the
rationale, but it did yield an improvement in both the micro and macro
benchmarks, so I must be onto something.

Below are the micro and macrobenchmark results. The microbenchmark was
generic_iterators-test's overlapping and non-overlapping MergeIterator tests
with 10 iterations, 1000 lists, and 10000 rows per list, collecting the
wallclock time to scan and averaging it across the runs. The macrobenchmark
was running 'kudu perf tablet_scan' on a representative 40GB tablet six
times, dropping the first time (to reduce cache effects), and averaging the
remaining wallclock times.

no whole-block-copy:
- non-overlapping: 0.9437
- overlapping: 9.5113
- representative tablet: 786.732

whole-block-copy, no power-of-2 RowBlocks
- non-overlapping: 0.6301
- overlapping: 9.7518
- representative tablet: 751.297

whole-block-copy, power-of-2 RowBlocks
- non-overlapping: 0.5316
- overlapping: 9.5718
- representative tablet: 754.961

whole-block-copy, power-of-2 RowBlocks, only copy when >1 rows left:
- non-overlapping: 0.5247
- overlapping: 9.1287
- representative tablet: 720.534

Todd and I discussed another possible optimization which I want to note here
for posterity: rather than copying blocks, we could "attach" the data to the
client's RowbBlock and serialize it to the wire directly. The attachment
could be RowBlock-based by allowing RowBlocks to work like a refcounted
"iovec", or it could be done more deeply to enable direct serialization of
cached decoder data. Either way, anything that helps us avoid copying data
is likely to be a performance win.

Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
---
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/rowblock.h
M src/kudu/tserver/tablet_service.cc
4 files changed, 129 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/13011/4
-- 
To view, visit http://gerrit.cloudera.org:8080/13011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
Gerrit-Change-Number: 13011
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] generic iterators: MergeIterator whole block copy optimization

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has removed Kudu Jenkins from this change.  ( http://gerrit.cloudera.org:8080/13011 )

Change subject: generic_iterators: MergeIterator whole block copy optimization
......................................................................


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/13011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I3a4b56169644f35f1aa8aef27d4af0ce4ec0792d
Gerrit-Change-Number: 13011
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>