You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2017/11/15 20:07:41 UTC

[kudu-CR] wire protocol: optimize RewriteRowBlockPointers

Hello Andrew Wong,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/8554

to review the following change.


Change subject: wire_protocol: optimize RewriteRowBlockPointers
......................................................................

wire_protocol: optimize RewriteRowBlockPointers

I was looking at why our tpch1 benchmark performance seems to have been
creeping up over recent months and started profiling it a bit. I didn't
see any reason for a regression but did notice that
RewriteRowBlockPointers ends up being memory-bandwidth-bound, so took a
few minutes to optimize it.

The old code did one pass through the data for each column. In the case
that the rowblock was larger than CPU cache, this meant streaming a lot
of data from L2, L3, or even memory, which bottlenecked it. The new code
is branchier but does only a single pass through the data.

This reduced the client side CPU usage of the query in tpch_real_world
from about 70ms/iteration to about 40ms/iteration.

Change-Id: Ie428e595e9f564762bbdbd09dc6a5f312abe9aec
---
M src/kudu/common/wire_protocol.cc
1 file changed, 48 insertions(+), 29 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/54/8554/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8554
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie428e595e9f564762bbdbd09dc6a5f312abe9aec
Gerrit-Change-Number: 8554
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>

[kudu-CR] wire protocol: optimize RewriteRowBlockPointers

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8554 )

Change subject: wire_protocol: optimize RewriteRowBlockPointers
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8554/1/src/kudu/common/wire_protocol.cc
File src/kudu/common/wire_protocol.cc:

http://gerrit.cloudera.org:8080/#/c/8554/1/src/kudu/common/wire_protocol.cc@563
PS1, Line 563: size_t row_stride =
nit: const?


http://gerrit.cloudera.org:8080/#/c/8554/1/src/kudu/common/wire_protocol.cc@582
PS1, Line 582: , making the
             :   // lower loop more efficient.
nit: More efficient than what? not sure this comment quite makes sense without looking at a multi-pass implementation.


http://gerrit.cloudera.org:8080/#/c/8554/1/src/kudu/common/wire_protocol.cc@598
PS1, Line 598: col.type_info()->physical_type() != BINARY
nit: can you switch this in for the ==BINARY case? Then there's no need to think about the negative


http://gerrit.cloudera.org:8080/#/c/8554/1/src/kudu/common/wire_protocol.cc@608
PS1, Line 608: buffers are
Not sure I'm following this comment. Which buffers are these?


http://gerrit.cloudera.org:8080/#/c/8554/1/src/kudu/common/wire_protocol.cc@617
PS1, Line 617: !t.nullable || !BitmapTest(row_ptr + null_bitmap_offset, t.col_idx)
nit: maybe consider changing to something like

 // We don't need to rewrite null values.
 if (t.nullable && BitmapTest(row_ptr + null_bitmap_offset, t.col_idx) continue;

 ... body of the current if-block...


http://gerrit.cloudera.org:8080/#/c/8554/1/src/kudu/common/wire_protocol.cc@629
PS1, Line 629: Substitute("Row #$0 contained bad indirect slice for column $1: ($2, $3)",
nit: not your fault, but spacing



-- 
To view, visit http://gerrit.cloudera.org:8080/8554
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie428e595e9f564762bbdbd09dc6a5f312abe9aec
Gerrit-Change-Number: 8554
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 17 Nov 2017 01:28:27 +0000
Gerrit-HasComments: Yes

[kudu-CR] wire protocol: optimize RewriteRowBlockPointers

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Andrew Wong, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8554

to look at the new patch set (#2).

Change subject: wire_protocol: optimize RewriteRowBlockPointers
......................................................................

wire_protocol: optimize RewriteRowBlockPointers

I was looking at why our tpch1 benchmark performance seems to have been
creeping up over recent months and started profiling it a bit. I didn't
see any reason for a regression but did notice that
RewriteRowBlockPointers ends up being memory-bandwidth-bound, so took a
few minutes to optimize it.

The old code did one pass through the data for each column. In the case
that the rowblock was larger than CPU cache, this meant streaming a lot
of data from L2, L3, or even memory, which bottlenecked it. The new code
is branchier but does only a single pass through the data.

This reduced the client side CPU usage of the query in tpch_real_world
from about 70ms/iteration to about 40ms/iteration.

Change-Id: Ie428e595e9f564762bbdbd09dc6a5f312abe9aec
---
M src/kudu/common/wire_protocol.cc
1 file changed, 67 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/54/8554/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8554
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie428e595e9f564762bbdbd09dc6a5f312abe9aec
Gerrit-Change-Number: 8554
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] wire protocol: optimize RewriteRowBlockPointers

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8554 )

Change subject: wire_protocol: optimize RewriteRowBlockPointers
......................................................................

wire_protocol: optimize RewriteRowBlockPointers

I was looking at why our tpch1 benchmark performance seems to have been
creeping up over recent months and started profiling it a bit. I didn't
see any reason for a regression but did notice that
RewriteRowBlockPointers ends up being memory-bandwidth-bound, so took a
few minutes to optimize it.

The old code did one pass through the data for each column. In the case
that the rowblock was larger than CPU cache, this meant streaming a lot
of data from L2, L3, or even memory, which bottlenecked it. The new code
is branchier but does only a single pass through the data.

This reduced the client side CPU usage of the query in tpch_real_world
from about 70ms/iteration to about 40ms/iteration.

Change-Id: Ie428e595e9f564762bbdbd09dc6a5f312abe9aec
Reviewed-on: http://gerrit.cloudera.org:8080/8554
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
M src/kudu/common/wire_protocol.cc
1 file changed, 67 insertions(+), 46 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Andrew Wong: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/8554
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie428e595e9f564762bbdbd09dc6a5f312abe9aec
Gerrit-Change-Number: 8554
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] wire protocol: optimize RewriteRowBlockPointers

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8554 )

Change subject: wire_protocol: optimize RewriteRowBlockPointers
......................................................................


Patch Set 2: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/8554
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie428e595e9f564762bbdbd09dc6a5f312abe9aec
Gerrit-Change-Number: 8554
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Mon, 20 Nov 2017 20:39:31 +0000
Gerrit-HasComments: No