Posted to reviews@kudu.apache.org by "Grant Henke (Code Review)" <ge...@cloudera.org> on 2019/10/15 14:58:03 UTC

[kudu-CR](branch-1.11.x) [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Grant Henke has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14453 )


Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................

[cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

This change adds an optimization to the integer RLE decoder: the
predicate is evaluated once per run instead of materializing each
cell and then applying the predicate to it.
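
As a rough illustration of per-run evaluation, here is a minimal sketch.
It does not use Kudu's actual classes: Run, RangePredicate and
EvaluatePerRun are hypothetical names made up for this example, and the
selection vector is modeled as a plain std::vector<bool>.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Kudu types.
struct Run {
  int32_t value;  // the repeated value
  size_t length;  // number of consecutive rows holding that value
};

struct RangePredicate {
  int32_t lower;  // inclusive
  int32_t upper;  // exclusive
  bool Matches(int32_t v) const { return v >= lower && v < upper; }
};

// Evaluates the predicate once per run rather than once per row.
// 'selected' has one entry per row and is expected to start out all true;
// rows that belong to a non-matching run are cleared.
// Returns the number of predicate evaluations performed.
size_t EvaluatePerRun(const std::vector<Run>& runs,
                      const RangePredicate& pred,
                      std::vector<bool>* selected) {
  size_t row = 0;
  size_t evaluations = 0;
  for (const Run& run : runs) {
    assert(row + run.length <= selected->size());
    ++evaluations;
    if (!pred.Matches(run.value)) {
      // The whole run fails the predicate: clear all of its rows at once.
      for (size_t i = 0; i < run.length; ++i) {
        (*selected)[row + i] = false;
      }
    }
    row += run.length;
  }
  return evaluations;
}
```

With long runs, the number of predicate evaluations in this sketch is the
number of runs rather than the number of rows.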

Added a utility method in SelectionVectorView to clear bits starting
at a caller-specified offset. This lets the caller clear a batch of
rows at an offset it maintains itself, without advancing the internal
row_offset in SelectionVectorView.
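
The idea can be sketched roughly as follows. SelectionView below is a
simplified, hypothetical class written for this example; its method names
and signatures are assumptions and may not match the real
SelectionVectorView in src/kudu/common/rowblock.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified, hypothetical view over a selection bitmap; not Kudu's class.
class SelectionView {
 public:
  explicit SelectionView(std::vector<bool>* bits)
      : bits_(bits), row_offset_(0) {}

  // Moves the view forward by 'skip' rows.
  void Advance(size_t skip) {
    assert(row_offset_ + skip <= bits_->size());
    row_offset_ += skip;
  }

  // Clears 'nrows' bits starting 'offset' rows past the view's current
  // position. The view's own row_offset_ is left unchanged.
  void ClearBits(size_t nrows, size_t offset) {
    assert(row_offset_ + offset + nrows <= bits_->size());
    for (size_t i = 0; i < nrows; ++i) {
      (*bits_)[row_offset_ + offset + i] = false;
    }
  }

  size_t row_offset() const { return row_offset_; }

 private:
  std::vector<bool>* bits_;  // not owned
  size_t row_offset_;
};
```

In this sketch a decoder could track its own position within the batch
and pass it as 'offset', so the view only needs to be advanced once the
whole batch has been processed.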

Tests:

To benchmark, adjusted the all_types-scan-correctness-test and
tested with 1M rows and run-length of 10k repeating
integer values on a release build.

The following are results for different predicate ranges. In each
pair, the 1st line is the scan time with decoder-level evaluation and
the 2nd line is the scan time with decoder-level evaluation turned off
(i.e. --materializing_iterator_decoder_eval=false).

Small subset: [5, 10)
3-5ms
9-12ms

Large subset ~50%: [2000, 7000)
3-5ms
8-9ms

Select All: [1, 10001)
12-15ms
18-22ms

Select None: [10001, 10003)
3-5ms
9-12ms

The biggest improvement, around 60%, is seen when a small subset or
no rows are selected. As more rows are selected, the improvement drops
to 40-50%.

Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Reviewed-on: http://gerrit.cloudera.org:8080/14380
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>
(cherry picked from commit 384a535a0079ec634d57b55593962cae4cb6f19a)
---
M src/kudu/cfile/rle_block.h
M src/kudu/common/rowblock.h
M src/kudu/tablet/all_types-scan-correctness-test.cc
3 files changed, 274 insertions(+), 53 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/14453/1
-- 
To view, visit http://gerrit.cloudera.org:8080/14453
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.11.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14453
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>

[kudu-CR](branch-1.11.x) [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14453 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................

[cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Reviewed-on: http://gerrit.cloudera.org:8080/14380
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>
(cherry picked from commit 384a535a0079ec634d57b55593962cae4cb6f19a)
Reviewed-on: http://gerrit.cloudera.org:8080/14453
Reviewed-by: Alexey Serbin <as...@cloudera.com>
---
M src/kudu/cfile/rle_block.h
M src/kudu/common/rowblock.h
M src/kudu/tablet/all_types-scan-correctness-test.cc
3 files changed, 274 insertions(+), 53 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified

Gerrit-Project: kudu
Gerrit-Branch: branch-1.11.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14453
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR](branch-1.11.x) [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14453 )

Change subject: [cfile] KUDU-2852 Push predicate evaluation for int type RLE decoder
......................................................................


Patch Set 1: Code-Review+2


Gerrit-Project: kudu
Gerrit-Branch: branch-1.11.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I6e05775ec1301d3d0b0365a7704b8e962a20455e
Gerrit-Change-Number: 14453
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 15 Oct 2019 16:47:38 +0000
Gerrit-HasComments: No