You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org> on 2019/10/07 15:35:11 UTC

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Bankim Bhavsar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14380


Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................

[cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

This change adds optimization that evaluates the predicate for each run
instead of materializing each cell and then applying the predicate
for integer datatype RLE decoder.

Tests:
- Extended all_types-scan-correctness-test with repeating
data instead of sequential to better test data encoded
with RLE.
- Verified all unit tests pass.
- TODO: Run benchmark to confirm/measure performance gains.

Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
---
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/rle_block.h
M src/kudu/tablet/all_types-scan-correctness-test.cc
M src/kudu/util/rle-encoding.h
4 files changed, 257 insertions(+), 53 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/14380/1
-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 1
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 5:

(5 comments)

Just passing through with some style comments.

http://gerrit.cloudera.org:8080/#/c/14380/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14380/5//COMMIT_MSG@18
PS5, Line 18: Tests:
Although this level of detail is welcome, we're not super rigid about specifying tests in the commit message given that the code review speaks for itself. Looking at what you've written here, I'd drop everything except the benchmark bit, and promote that into an (informal) paragraph after "Added a utility method...".


http://gerrit.cloudera.org:8080/#/c/14380/5//COMMIT_MSG@24
PS5, Line 24:   - large number of rows 100k to 1M with run lengths of 1k to 10k
Perhaps clearer as "100k to 1M rows with run lengths ranging from 1k to 10k"?


http://gerrit.cloudera.org:8080/#/c/14380/5//COMMIT_MSG@28
PS5, Line 28: materializing_iterator_decoder_eval
We generally prepend '--' to gflag names to emphasize that they're gflags and not something else (like a standard variable name).


http://gerrit.cloudera.org:8080/#/c/14380/5/src/kudu/common/rowblock.h
File src/kudu/common/rowblock.h:

http://gerrit.cloudera.org:8080/#/c/14380/5/src/kudu/common/rowblock.h@221
PS5, Line 221:         false);
Nit: this should be aligned with sel_vec->mutable_bitmap() from L220.


http://gerrit.cloudera.org:8080/#/c/14380/5/src/kudu/tablet/all_types-scan-correctness-test.cc
File src/kudu/tablet/all_types-scan-correctness-test.cc:

http://gerrit.cloudera.org:8080/#/c/14380/5/src/kudu/tablet/all_types-scan-correctness-test.cc@228
PS5, Line 228:     int upper_val) {
Nit: should be aligned with "int nrows". See https://google.github.io/styleguide/cppguide.html#Function_Calls for details.



-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 5
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 11 Oct 2019 05:10:17 +0000
Gerrit-HasComments: Yes

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 4:

(1 comment)

> Patch Set 3:
> 
> (1 comment)

http://gerrit.cloudera.org:8080/#/c/14380/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14380/2//COMMIT_MSG@18
PS2, Line 18: Tests:
> Interesting, I'd recommend:
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 4
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 10 Oct 2019 21:01:11 +0000
Gerrit-HasComments: Yes

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 2:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/14380/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14380/2//COMMIT_MSG@18
PS2, Line 18: - TODO: Run benchmark to confirm/measure performance gains.
A simple benchmark that would be very targeted might be to run some RLE-tests with --materializing_iterator_decoder_eval=false and --materializing_iterator_decoder_eval=true and comparing the results.


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/cfile/rle_block.h
File src/kudu/cfile/rle_block.h:

http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/cfile/rle_block.h@429
PS2, Line 429: OVERRIDE
nit: for new code, we prefer the newer style of all lowercase. Not your fault, but rather than using this older style, could you update the usages in this file to 'override'? That could be done in a separate patch if you prefer.


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/cfile/rle_block.h@454
PS2, Line 454: evaluation
nit: we're skipping the copy, not the evaluation


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/cfile/rle_block.h@458
PS2, Line 458:           auto *data_val_ptr = reinterpret_cast<CppType *>(data_ptr);
             :           *data_val_ptr = val;
nit: might be more succinct as:

 *(reinterpret_cast<CppType*>(data_ptr)) = val;


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/cfile/rle_block.h@466
PS2, Line 466: data_ptr += num_read * kCppTypeSize;
I don't quite recall -- do we still need to increment this even though we're not materializing these rows?


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/tablet/all_types-scan-correctness-test.cc
File src/kudu/tablet/all_types-scan-correctness-test.cc:

http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/tablet/all_types-scan-correctness-test.cc@222
PS2, Line 222: cardinality
nit: could you wrap this in ``s or ''s so it's clear you're referring to the variable? Same below


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/tablet/all_types-scan-correctness-test.cc@233
PS2, Line 233: (cardinality * cardinality);
I'm kind of surprised by this. Doesn't 'cardinality' represent the number of rows in each stride? In which case, the number of strides is just nrows/cardinality?

I think I'm missing what a stride is. Could you add a comment example here of how 'nrows', 'cardinality', 'null_upper', and and the 'val's are used, and how they related to these strides?


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/tablet/all_types-scan-correctness-test.cc@235
PS2, Line 235:   int matching_rows_per_chunk = cardinality - null_upper;
This is only true for the strides that fall within [lower_val, upper_val), right?


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/tablet/all_types-scan-correctness-test.cc@306
PS2, Line 306: cardinality
nit: I tend to think of "cardinality" as the number of unique values. Maybe call this "stride_length" or something? Same in ExpectedCountRepeating()

Maybe we should do that everywhere, since `stride_length` still seems like an appropriate name elsewhere.


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/util/rle-encoding.h
File src/kudu/util/rle-encoding.h:

http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/util/rle-encoding.h@330
PS2, Line 330:         if (PREDICT_FALSE(!result)) {
             :           return ret;
             :         }
Why was this change necessary? Did it cause one of your tests to fail? If so, could you make note of this in the commit message? Also might be worth adding a test to util/rle-test.cc for this too.



-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 2
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 07 Oct 2019 20:51:11 +0000
Gerrit-HasComments: Yes

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Attila Bukor, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14380

to look at the new patch set (#5).

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................

[cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

This change adds optimization that evaluates the predicate for each run
instead of materializing each cell and then applying the predicate
for integer datatype RLE decoder.

Added a utility method in SelectionVectorView to clear bits
from caller specified offset. This helps clear batch of rows
from a caller maintained offset without advancing the internal
row_offset in SelectionVectorView.

Tests:
- Extended all_types-scan-correctness-test with repeating
data instead of sequential to better test data encoded
with RLE.
- Verified all unit tests pass.
- To benchmark, adjusted all_types-scan-correctness-test with
  - large number of rows 100k to 1M with run lengths of 1k to 10k
  - predicate that selects subset of rows
  - only RLE encoded integer datatypes
  Observed 40% to 60% perf improvement in scan compared to not doing decoder
level evaluation (i.e. materializing_iterator_decoder_eval = false)

Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
---
M src/kudu/cfile/rle_block.h
M src/kudu/common/rowblock.h
M src/kudu/tablet/all_types-scan-correctness-test.cc
3 files changed, 276 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/14380/5
-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 5
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14380

to look at the new patch set (#3).

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................

[cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

This change adds optimization that evaluates the predicate for each run
instead of materializing each cell and then applying the predicate
for integer datatype RLE decoder.

Tests:
- Extended all_types-scan-correctness-test with repeating
data instead of sequential to better test data encoded
with RLE.
- Verified all unit tests pass.
- TODO: Run benchmark to confirm/measure performance gains.

Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
---
M src/kudu/cfile/rle_block.h
M src/kudu/tablet/all_types-scan-correctness-test.cc
2 files changed, 264 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/14380/3
-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 3
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Attila Bukor, Kudu Jenkins, Andrew Wong, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14380

to look at the new patch set (#6).

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................

[cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

This change adds optimization that evaluates the predicate for each run
instead of materializing each cell and then applying the predicate
for integer datatype RLE decoder.

Added a utility method in SelectionVectorView to clear bits
from caller specified offset. This helps clear batch of rows
from a caller maintained offset without advancing the internal
row_offset in SelectionVectorView.

Tests:

To benchmark, adjusted the all_types-scan-correctness-test and
tested with 1M rows and run-length of 10k repeating
integer values on a release build.

Following are results with different predicate values where 1st line
is scan time duration with decoder level evaluation and 2nd line
is scan time duration with decoder level evaluation turned off
(i.e. --materializing_iterator_decoder_eval = false)

Small subset: [5 10)
3-5ms
9-12ms

Large subset ~50%: [2000 - 7000)
3-5ms
8-9ms

Select All: [1, 10001)
12-15ms
18-22ms

Select None: [10001 10003)
3-5ms
9-12ms

Biggest improvement of around %60 is seen in cases where small subset or
no rows are selected. As more rows are selected the improvement reduces to
40-50%.

Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
---
M src/kudu/cfile/rle_block.h
M src/kudu/common/rowblock.h
M src/kudu/tablet/all_types-scan-correctness-test.cc
3 files changed, 274 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/14380/6
-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 6
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 7:

> Patch Set 6: Verified-1
> 
> Build Failed 
> 
> http://jenkins.kudu.apache.org/job/kudu-gerrit/19105/ : FAILURE

Looks like unrelated test failure. Rebased to retrigger the verification job.


-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 7
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 14 Oct 2019 22:26:30 +0000
Gerrit-HasComments: No

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 2:

(2 comments)

I just took a quick first look.

http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/cfile/rle_block.h
File src/kudu/cfile/rle_block.h:

http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/cfile/rle_block.h@426
PS4, Line 426: virtual
nit: drop virtual since it's already an override


http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/cfile/rle_block.h@450
PS4, Line 450: void *
nit: void*

(stick the asterisk to the type)



-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 2
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 10 Oct 2019 21:31:57 +0000
Gerrit-HasComments: Yes

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14380

to look at the new patch set (#4).

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................

[cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

This change adds optimization that evaluates the predicate for each run
instead of materializing each cell and then applying the predicate
for integer datatype RLE decoder.

Added a utility method in SelectionVectorView to clear bits
from caller specified offset. This helps clear batch of rows
from a caller maintained offset without advancing the internal
row_offset in SelectionVectorView.

Tests:
- Extended all_types-scan-correctness-test with repeating
data instead of sequential to better test data encoded
with RLE.
- Verified all unit tests pass.
- To benchmark, adjusted all_types-scan-correctness-test with
  - large number of rows 100k to 1M with run lengths of 1k to 10k
  - predicate that selects subset of rows
  - only RLE encoded integer datatypes
  Observed 40% to 60% perf improvement in scan compared to not doing decoder
level evaluation (i.e. materializing_iterator_decoder_eval = false)

Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
---
M src/kudu/cfile/rle_block.h
M src/kudu/common/rowblock.h
M src/kudu/tablet/all_types-scan-correctness-test.cc
3 files changed, 278 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/14380/4
-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 4
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 2:

(10 comments)

> Patch Set 2:
> 
> (10 comments)

http://gerrit.cloudera.org:8080/#/c/14380/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14380/2//COMMIT_MSG@18
PS2, Line 18: - TODO: Run benchmark to confirm/measure performance gains.
> A simple benchmark that would be very targeted might be to run some RLE-tes
Ack. Preliminary results don't show improvement.
Working on further analyzing....


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/cfile/rle_block.h
File src/kudu/cfile/rle_block.h:

http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/cfile/rle_block.h@429
PS2, Line 429: OVERRIDE
> nit: for new code, we prefer the newer style of all lowercase. Not your fau
Fixed here. Will make a separate change for updating others.


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/cfile/rle_block.h@454
PS2, Line 454: evaluation
> nit: we're skipping the copy, not the evaluation
Done


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/cfile/rle_block.h@458
PS2, Line 458:           auto *data_val_ptr = reinterpret_cast<CppType *>(data_ptr);
             :           *data_val_ptr = val;
> nit: might be more succinct as:
Done


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/cfile/rle_block.h@466
PS2, Line 466: data_ptr += num_read * kCppTypeSize;
> I don't quite recall -- do we still need to increment this even though we'r
I looked at the implementation in BinaryDictBlockDecoder::CopyNextAndEval() and the out destination data ptr is incremented regardless of whether the row is selected. Moreover caller in CFileIterator::Scan() also advances the dst ptr based on rows read. If dst ptr were to be incremented only when materializing a row then I'd expect perhaps the in/out variable "n" to be incremented only for the number of values materialized which is not the case.

That being said, it'd be good to get second pair of eyes verify whether dst data ptr needs to be incremented when row is not materialized.


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/tablet/all_types-scan-correctness-test.cc
File src/kudu/tablet/all_types-scan-correctness-test.cc:

http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/tablet/all_types-scan-correctness-test.cc@222
PS2, Line 222: cardinality
> nit: could you wrap this in ``s or ''s so it's clear you're referring to th
Done


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/tablet/all_types-scan-correctness-test.cc@233
PS2, Line 233: (cardinality * cardinality);
> I'm kind of surprised by this. Doesn't 'cardinality' represent the number o
I've added an example to explain the math for the variables below as they are computed.


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/tablet/all_types-scan-correctness-test.cc@235
PS2, Line 235:   int matching_rows_per_chunk = cardinality - null_upper;
> This is only true for the strides that fall within [lower_val, upper_val), 
Yes. Added comment with example.


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/tablet/all_types-scan-correctness-test.cc@306
PS2, Line 306: cardinality
> nit: I tend to think of "cardinality" as the number of unique values. Maybe
"cardinality" in this test is number of rows following a certain pattern whether sequential or repeating. "stride_length"  wouldn't apply to sequential pattern. So I'd like to keep using cardinality variable as is.


http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/util/rle-encoding.h
File src/kudu/util/rle-encoding.h:

http://gerrit.cloudera.org:8080/#/c/14380/2/src/kudu/util/rle-encoding.h@330
PS2, Line 330:         if (PREDICT_FALSE(!result)) {
             :           return ret;
             :         }
> Why was this change necessary? Did it cause one of your tests to fail? If s
Yes, it causes a bloom filter test to fail. I've filed a separate JIRA and will fix it separately https://issues.apache.org/jira/browse/KUDU-2968



-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 2
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 09 Oct 2019 20:51:27 +0000
Gerrit-HasComments: Yes

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14380

to look at the new patch set (#2).

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................

[cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

This change adds optimization that evaluates the predicate for each run
instead of materializing each cell and then applying the predicate
for integer datatype RLE decoder.

Tests:
- Extended all_types-scan-correctness-test with repeating
data instead of sequential to better test data encoded
with RLE.
- Verified all unit tests pass.
- TODO: Run benchmark to confirm/measure performance gains.

Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
---
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/rle_block.h
M src/kudu/tablet/all_types-scan-correctness-test.cc
M src/kudu/util/rle-encoding.h
4 files changed, 258 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/80/14380/2
-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 2
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/14380/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14380/5//COMMIT_MSG@18
PS5, Line 18: Tests:
> Although this level of detail is welcome, we're not super rigid about speci
Done


http://gerrit.cloudera.org:8080/#/c/14380/5//COMMIT_MSG@24
PS5, Line 24:   - large number of rows 100k to 1M with run lengths of 1k to 10k
> Perhaps clearer as "100k to 1M rows with run lengths ranging from 1k to 10k
Done


http://gerrit.cloudera.org:8080/#/c/14380/5//COMMIT_MSG@28
PS5, Line 28: materializing_iterator_decoder_eval
> We generally prepend '--' to gflag names to emphasize that they're gflags a
Done


http://gerrit.cloudera.org:8080/#/c/14380/5/src/kudu/common/rowblock.h
File src/kudu/common/rowblock.h:

http://gerrit.cloudera.org:8080/#/c/14380/5/src/kudu/common/rowblock.h@221
PS5, Line 221:         false);
> Nit: this should be aligned with sel_vec->mutable_bitmap() from L220.
Done


http://gerrit.cloudera.org:8080/#/c/14380/5/src/kudu/tablet/all_types-scan-correctness-test.cc
File src/kudu/tablet/all_types-scan-correctness-test.cc:

http://gerrit.cloudera.org:8080/#/c/14380/5/src/kudu/tablet/all_types-scan-correctness-test.cc@228
PS5, Line 228:     int upper_val) {
> Nit: should be aligned with "int nrows". See https://google.github.io/style
Done here and other places in this file.



-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 5
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 14 Oct 2019 21:24:22 +0000
Gerrit-HasComments: Yes

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14380/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14380/2//COMMIT_MSG@18
PS2, Line 18: - TODO: Run benchmark to confirm/measure performance gains.
> Ack. Preliminary results don't show improvement.
Interesting, I'd recommend:
- running in release mode
- maybe only timing the scanning iteration (e.g. in case you're timing the entire test, which might have a bunch of setup/teardown involved, especially in debug mode). You might find `LOG_TIMING()` useful for that.



-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 3
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 09 Oct 2019 21:21:40 +0000
Gerrit-HasComments: Yes

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/14380/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14380/4//COMMIT_MSG@23
PS4, Line 23: - To benchmark, adjusted all_types-scan-correctness-test with
            :   - large number of rows 100k to 1M with run lengths of 1k to 10k
            :   - predicate that selects subset of rows
            :   - only RLE encoded integer datatypes
            :   Observed 40% to 60% perf improvement in scan compared to not doing decoder
            : level evaluation (i.e. materializing_iterator_decoder_eval = false)
Nice! Would be great if you could post the numbers describing what cases correspond to what times, if you have them available. That'd give some idea for how the improvement varies per selectivity, cardinality, etc.


http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/common/rowblock.h
File src/kudu/common/rowblock.h:

http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/common/rowblock.h@217
PS4, Line 217:   // Clear "nrows" bits from the supplied "offset" in the current view.
             :   void ClearBits(size_t offset, size_t nrows) {
             :     DCHECK_LE(offset + nrows, sel_vec_->nrows() - row_offset_);
             :     BitmapChangeBits(sel_vec_->mutable_bitmap(), row_offset_ + offset, nrows,
             :         false);
             :   }
nit: if you reorder the args, you could set the default offset to 0 and get rid of the below method, right?


http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/tablet/all_types-scan-correctness-test.cc
File src/kudu/tablet/all_types-scan-correctness-test.cc:

http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/tablet/all_types-scan-correctness-test.cc@237
PS4, Line 237:  
nit: extra space


http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/tablet/all_types-scan-correctness-test.cc@239
PS4, Line 239: full
nit: I found this confusing because I thought there'd be 5 "full" chunks (by definition of a "chunk"). Maybe reword this as 3 "matching" chunks?


http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/tablet/all_types-scan-correctness-test.cc@262
PS4, Line 262:        matching_chunks_in_last_stride * matching_rows_per_chunk +
             :          remainder_matching_rows;
nit: seems to be a couple too many spaces



-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 4
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 10 Oct 2019 21:32:22 +0000
Gerrit-HasComments: Yes

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 7: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 7
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 14 Oct 2019 23:45:59 +0000
Gerrit-HasComments: No

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................

[cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

This change adds optimization that evaluates the predicate for each run
instead of materializing each cell and then applying the predicate
for integer datatype RLE decoder.

Added a utility method in SelectionVectorView to clear bits
from caller specified offset. This helps clear batch of rows
from a caller maintained offset without advancing the internal
row_offset in SelectionVectorView.

Tests:

To benchmark, adjusted the all_types-scan-correctness-test and
tested with 1M rows and run-length of 10k repeating
integer values on a release build.

Following are results with different predicate values where 1st line
is scan time duration with decoder level evaluation and 2nd line
is scan time duration with decoder level evaluation turned off
(i.e. --materializing_iterator_decoder_eval = false)

Small subset: [5 10)
3-5ms
9-12ms

Large subset ~50%: [2000 - 7000)
3-5ms
8-9ms

Select All: [1, 10001)
12-15ms
18-22ms

Select None: [10001 10003)
3-5ms
9-12ms

Biggest improvement of around %60 is seen in cases where small subset or
no rows are selected. As more rows are selected the improvement reduces to
40-50%.

Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Reviewed-on: http://gerrit.cloudera.org:8080/14380
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
M src/kudu/cfile/rle_block.h
M src/kudu/common/rowblock.h
M src/kudu/tablet/all_types-scan-correctness-test.cc
3 files changed, 274 insertions(+), 53 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Andrew Wong: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 8
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/14380/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14380/4//COMMIT_MSG@23
PS4, Line 23: - To benchmark, adjusted all_types-scan-correctness-test with
            :   - large number of rows 100k to 1M with run lengths of 1k to 10k
            :   - predicate that selects subset of rows
            :   - only RLE encoded integer datatypes
            :   Observed 40% to 60% perf improvement in scan compared to not doing decoder
            : level evaluation (i.e. materializing_iterator_decoder_eval = false)
> Nice! Would be great if you could post the numbers describing what cases co
I basically took one AllNonNull test case with default 3 and 11 as lower and upper range. Switching from a loop that cleared individual bit to clearing bits in a batch by adding a utility method in SelectionVectorView made the difference.

Let me run more tests and add details.


http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/cfile/rle_block.h
File src/kudu/cfile/rle_block.h:

http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/cfile/rle_block.h@426
PS4, Line 426: virtual
> nit: drop virtual since it's already an override
Done


http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/cfile/rle_block.h@450
PS4, Line 450: void *
> nit: void*
Done


http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/common/rowblock.h
File src/kudu/common/rowblock.h:

http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/common/rowblock.h@217
PS4, Line 217:   // Clear "nrows" bits from the supplied "offset" in the current view.
             :   void ClearBits(size_t offset, size_t nrows) {
             :     DCHECK_LE(offset + nrows, sel_vec_->nrows() - row_offset_);
             :     BitmapChangeBits(sel_vec_->mutable_bitmap(), row_offset_ + offset, nrows,
             :         false);
             :   }
> nit: if you reorder the args, you could set the default offset to 0 and get
Good point. Done.


http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/tablet/all_types-scan-correctness-test.cc
File src/kudu/tablet/all_types-scan-correctness-test.cc:

http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/tablet/all_types-scan-correctness-test.cc@237
PS4, Line 237:  
> nit: extra space
Done


http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/tablet/all_types-scan-correctness-test.cc@239
PS4, Line 239: full
> nit: I found this confusing because I thought there'd be 5 "full" chunks (b
Done


http://gerrit.cloudera.org:8080/#/c/14380/4/src/kudu/tablet/all_types-scan-correctness-test.cc@262
PS4, Line 262:        matching_chunks_in_last_stride * matching_rows_per_chunk +
             :          remainder_matching_rows;
> nit: seems to be a couple too many spaces
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 4
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 11 Oct 2019 00:10:50 +0000
Gerrit-HasComments: Yes

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 5: Code-Review+1

Holding off on +2ing in case you wanted to add more benchmarking results.


-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 5
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 11 Oct 2019 00:26:21 +0000
Gerrit-HasComments: No

[kudu-CR] [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14380 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 5: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/14380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14380
Gerrit-PatchSet: 5
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 11 Oct 2019 00:25:42 +0000
Gerrit-HasComments: No