You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Quanlong Huang (Code Review)" <ge...@cloudera.org> on 2022/03/30 06:40:09 UTC

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18366


Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................

IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

There are some checks in OrcStringColumnReader::ReadValue() that we can
determine outside the scope of this method. They should be optimized
since this is a critical method that will be executed for each row (and
for each string column). With these checks, the method is too complex to
be inlined in OrcBatchedReader::ReadValueBatch() by the compiler.

This patch templates OrcStringColumnReader::ReadValue() with two
parameters, one for the target slot type (i.e. STRING/CHAR/VARCHAR),
ther other for whether the column is dictionary encoded. Also adds an
ALWAYS_INLINE marker to force inlining it.

OrcStringColumnReader::ReadValueBatch() will call a template version of
ReadValue() based on the slot type and the orc batch encoded state.

Ran a single node perf test on TPCH(30) on my dev box using 3 impalad
instances. There are some improvements and no significant regressions:
+----------+--------+-------------+------------+
| Query    | Avg(s) | Base Avg(s) | Delta(Avg) |
+----------+--------+-------------+------------+
| TPCH-Q19 | 5.62   | 6.07        | I -7.41%   |
| TPCH-Q6  | 2.56   | 2.78        | I -7.77%   |
| TPCH-Q4  | 3.85   | 4.25        | I -9.42%   |
| TPCH-Q12 | 4.25   | 4.99        | I -14.78%  |
+----------+--------+-------------+------------+
Base commit: ff21728
File Format: orc/snap/block
Iterations: 30

Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
---
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
2 files changed, 62 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/18366/1
-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 4: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18366/4/be/src/exec/orc-column-readers.cc
File be/src/exec/orc-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/18366/4/be/src/exec/orc-column-readers.cc@184
PS4, Line 184: READ_STRING_BATCH
If it doesn't turn out to be too complicated, I would prefer to use a templated function instead.


http://gerrit.cloudera.org:8080/#/c/18366/4/be/src/exec/orc-column-readers.cc@228
PS4, Line 228: slot_type
I don't know whether we have a rule for this, but I would prefer SLOT_TYPE to avoid looking like a simple variable.



-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Wed, 30 Mar 2022 16:29:57 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 6: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 Apr 2022 12:43:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Riza Suminto, Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18366

to look at the new patch set (#7).

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................

IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

There are some checks in OrcStringColumnReader::ReadValue() that we can
determine outside the scope of this method. They should be optimized
since this is a critical method that will be executed for each row (and
for each string column). With these checks, the method is too complex to
be inlined in OrcBatchedReader::ReadValueBatch() by the compiler.

This patch templates OrcStringColumnReader::ReadValue() with two
parameters, one for the target slot type (i.e. STRING/CHAR/VARCHAR),
and the other one for whether the column is dictionary encoded. Also
adds an ALWAYS_INLINE marker to force inlining it.

OrcStringColumnReader::ReadValueBatch() will call a template version of
ReadValue() based on the slot type and the orc batch encoded state.

Ran a single node perf test on TPCH(30) on my dev box using 3 impalad
instances. There are some improvements and no significant regressions:
+----------+--------+-------------+------------+
| Query    | Avg(s) | Base Avg(s) | Delta(Avg) |
+----------+--------+-------------+------------+
| TPCH-Q19 | 5.62   | 6.07        | I -7.41%   |
| TPCH-Q6  | 2.56   | 2.78        | I -7.77%   |
| TPCH-Q4  | 3.85   | 4.25        | I -9.42%   |
| TPCH-Q12 | 4.25   | 4.99        | I -14.78%  |
+----------+--------+-------------+------------+
Base commit: ff21728
File Format: orc/snap/block
Iterations: 30

Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
---
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
2 files changed, 95 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/18366/7
-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8011/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Sun, 03 Apr 2022 09:16:59 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 10: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Apr 2022 04:02:00 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8021/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Apr 2022 23:09:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 6:

Merging this. Thanks for your reviews!


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 Apr 2022 14:16:37 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18366/4/be/src/exec/orc-column-readers.cc
File be/src/exec/orc-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/18366/4/be/src/exec/orc-column-readers.cc@184
PS4, Line 184: <PrimitiveType SL
> If it doesn't turn out to be too complicated, I would prefer to use a templ
Done


http://gerrit.cloudera.org:8080/#/c/18366/4/be/src/exec/orc-column-readers.cc@228
PS4, Line 228: 
> I don't know whether we have a rule for this, but I would prefer SLOT_TYPE 
I can't find a style guide about this in https://google.github.io/styleguide/cppguide.html

I fount that we have both cases:
https://github.com/apache/impala/blob/b1c1be12f3cceed48e93eddae8b9512737e3e0d2/be/src/exec/parquet/parquet-column-readers.cc#L56
https://github.com/apache/impala/blob/b1c1be12f3cceed48e93eddae8b9512737e3e0d2/be/src/exprs/date-functions.h#L132

I'm ok to use uppercase.



-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Thu, 31 Mar 2022 00:56:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 1:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/10356/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 30 Mar 2022 06:59:43 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8022/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Apr 2022 04:02:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10368/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Thu, 31 Mar 2022 01:16:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8016/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Apr 2022 12:37:48 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10376/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 Apr 2022 01:59:32 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................

IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

There are some checks in OrcStringColumnReader::ReadValue() that we can
determine outside the scope of this method. They should be optimized
since this is a critical method that will be executed for each row (and
for each string column). With these checks, the method is too complex to
be inlined in OrcBatchedReader::ReadValueBatch() by the compiler.

This patch templates OrcStringColumnReader::ReadValue() with two
parameters, one for the target slot type (i.e. STRING/CHAR/VARCHAR),
and the other one for whether the column is dictionary encoded. Also
adds an ALWAYS_INLINE marker to force inlining it.

OrcStringColumnReader::ReadValueBatch() will call a template version of
ReadValue() based on the slot type and the orc batch encoded state.

Ran a single node perf test on TPCH(30) on my dev box using 3 impalad
instances. There are some improvements and no significant regressions:
+----------+--------+-------------+------------+
| Query    | Avg(s) | Base Avg(s) | Delta(Avg) |
+----------+--------+-------------+------------+
| TPCH-Q19 | 5.62   | 6.07        | I -7.41%   |
| TPCH-Q6  | 2.56   | 2.78        | I -7.77%   |
| TPCH-Q4  | 3.85   | 4.25        | I -9.42%   |
| TPCH-Q12 | 4.25   | 4.99        | I -14.78%  |
+----------+--------+-------------+------------+
Base commit: ff21728
File Format: orc/snap/block
Iterations: 30

Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Reviewed-on: http://gerrit.cloudera.org:8080/18366
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
2 files changed, 95 insertions(+), 4 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 11
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18366

to look at the new patch set (#4).

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................

IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

There are some checks in OrcStringColumnReader::ReadValue() that we can
determine outside the scope of this method. They should be optimized
since this is a critical method that will be executed for each row (and
for each string column). With these checks, the method is too complex to
be inlined in OrcBatchedReader::ReadValueBatch() by the compiler.

This patch templates OrcStringColumnReader::ReadValue() with two
parameters, one for the target slot type (i.e. STRING/CHAR/VARCHAR),
ther other for whether the column is dictionary encoded. Also adds an
ALWAYS_INLINE marker to force inlining it.

OrcStringColumnReader::ReadValueBatch() will call a template version of
ReadValue() based on the slot type and the orc batch encoded state.

Ran a single node perf test on TPCH(30) on my dev box using 3 impalad
instances. There are some improvements and no significant regressions:
+----------+--------+-------------+------------+
| Query    | Avg(s) | Base Avg(s) | Delta(Avg) |
+----------+--------+-------------+------------+
| TPCH-Q19 | 5.62   | 6.07        | I -7.41%   |
| TPCH-Q6  | 2.56   | 2.78        | I -7.77%   |
| TPCH-Q4  | 3.85   | 4.25        | I -9.42%   |
| TPCH-Q12 | 4.25   | 4.99        | I -14.78%  |
+----------+--------+-------------+------------+
Base commit: ff21728
File Format: orc/snap/block
Iterations: 30

Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
---
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
2 files changed, 66 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/18366/4
-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18366

to look at the new patch set (#3).

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................

IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

There are some checks in OrcStringColumnReader::ReadValue() that we can
determine outside the scope of this method. They should be optimized
since this is a critical method that will be executed for each row (and
for each string column). With these checks, the method is too complex to
be inlined in OrcBatchedReader::ReadValueBatch() by the compiler.

This patch templates OrcStringColumnReader::ReadValue() with two
parameters, one for the target slot type (i.e. STRING/CHAR/VARCHAR),
ther other for whether the column is dictionary encoded. Also adds an
ALWAYS_INLINE marker to force inlining it.

OrcStringColumnReader::ReadValueBatch() will call a template version of
ReadValue() based on the slot type and the orc batch encoded state.

Ran a single node perf test on TPCH(30) on my dev box using 3 impalad
instances. There are some improvements and no significant regressions:
+----------+--------+-------------+------------+
| Query    | Avg(s) | Base Avg(s) | Delta(Avg) |
+----------+--------+-------------+------------+
| TPCH-Q19 | 5.62   | 6.07        | I -7.41%   |
| TPCH-Q6  | 2.56   | 2.78        | I -7.77%   |
| TPCH-Q4  | 3.85   | 4.25        | I -9.42%   |
| TPCH-Q12 | 4.25   | 4.99        | I -14.78%  |
+----------+--------+-------------+------------+
Base commit: ff21728
File Format: orc/snap/block
Iterations: 30

Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
---
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
2 files changed, 66 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/18366/3
-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Riza Suminto, Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18366

to look at the new patch set (#5).

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................

IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

There are some checks in OrcStringColumnReader::ReadValue() that we can
determine outside the scope of this method. They should be optimized
since this is a critical method that will be executed for each row (and
for each string column). With these checks, the method is too complex to
be inlined in OrcBatchedReader::ReadValueBatch() by the compiler.

This patch templates OrcStringColumnReader::ReadValue() with two
parameters, one for the target slot type (i.e. STRING/CHAR/VARCHAR),
ther other for whether the column is dictionary encoded. Also adds an
ALWAYS_INLINE marker to force inlining it.

OrcStringColumnReader::ReadValueBatch() will call a template version of
ReadValue() based on the slot type and the orc batch encoded state.

Ran a single node perf test on TPCH(30) on my dev box using 3 impalad
instances. There are some improvements and no significant regressions:
+----------+--------+-------------+------------+
| Query    | Avg(s) | Base Avg(s) | Delta(Avg) |
+----------+--------+-------------+------------+
| TPCH-Q19 | 5.62   | 6.07        | I -7.41%   |
| TPCH-Q6  | 2.56   | 2.78        | I -7.77%   |
| TPCH-Q4  | 3.85   | 4.25        | I -9.42%   |
| TPCH-Q12 | 4.25   | 4.99        | I -14.78%  |
+----------+--------+-------------+------------+
Base commit: ff21728
File Format: orc/snap/block
Iterations: 30

Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
---
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
2 files changed, 77 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/18366/5
-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 7: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Sun, 03 Apr 2022 13:43:17 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18366/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18366/5//COMMIT_MSG@17
PS5, Line 17: and the ot
> nit: and the other one
Done


http://gerrit.cloudera.org:8080/#/c/18366/5/be/src/exec/orc-column-readers.h
File be/src/exec/orc-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/18366/5/be/src/exec/orc-column-readers.h@353
PS5, Line 353: we still need to
> nit: we still need to override this.
Done


http://gerrit.cloudera.org:8080/#/c/18366/5/be/src/exec/orc-column-readers.cc
File be/src/exec/orc-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/18366/5/be/src/exec/orc-column-readers.cc@241
PS5, Line 241:   if (ENCODED) {
> Patch set 5 still have the branches in similar places as the unpatched vers
Yeah, for instance, OrcStringColumnReader::ReadValueInternal<TYPE_STRING, true> will have this as

  if (true) {
    ...
  else {
    ...
  }

Dead codes will be removed by the compiler.



-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 Apr 2022 01:39:04 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10387/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Sun, 03 Apr 2022 09:13:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Riza Suminto (Code Review)" <ge...@cloudera.org>.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 5: Code-Review+1

(3 comments)

The code looks good to me.
Just have couple nits and questions.

http://gerrit.cloudera.org:8080/#/c/18366/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18366/5//COMMIT_MSG@17
PS5, Line 17: ther other
nit: and the other one


http://gerrit.cloudera.org:8080/#/c/18366/5/be/src/exec/orc-column-readers.h
File be/src/exec/orc-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/18366/5/be/src/exec/orc-column-readers.h@353
PS5, Line 353: we still need this.
nit: we still need to override this.


http://gerrit.cloudera.org:8080/#/c/18366/5/be/src/exec/orc-column-readers.cc
File be/src/exec/orc-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/18366/5/be/src/exec/orc-column-readers.cc@241
PS5, Line 241:   if (ENCODED) {
Patch set 5 still have the branches in similar places as the unpatched version of OrcStringColumnReader::ReadValue().
Am I correct to assume that compiler will remove these branches upon inlining?



-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Thu, 31 Mar 2022 19:16:17 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8015/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Apr 2022 07:03:11 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 8: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Apr 2022 07:03:10 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 10: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Apr 2022 08:38:52 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10358/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 30 Mar 2022 07:34:24 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 3:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/10357/ : Initial code review checks failed. See linked job for details on the failure.


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 30 Mar 2022 07:33:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Hello Riza Suminto, Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18366

to look at the new patch set (#6).

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................

IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

There are some checks in OrcStringColumnReader::ReadValue() that we can
determine outside the scope of this method. They should be optimized
since this is a critical method that will be executed for each row (and
for each string column). With these checks, the method is too complex to
be inlined in OrcBatchedReader::ReadValueBatch() by the compiler.

This patch templates OrcStringColumnReader::ReadValue() with two
parameters, one for the target slot type (i.e. STRING/CHAR/VARCHAR),
and the other one for whether the column is dictionary encoded. Also
adds an ALWAYS_INLINE marker to force inlining it.

OrcStringColumnReader::ReadValueBatch() will call a template version of
ReadValue() based on the slot type and the orc batch encoded state.

Ran a single node perf test on TPCH(30) on my dev box using 3 impalad
instances. There are some improvements and no significant regressions:
+----------+--------+-------------+------------+
| Query    | Avg(s) | Base Avg(s) | Delta(Avg) |
+----------+--------+-------------+------------+
| TPCH-Q19 | 5.62   | 6.07        | I -7.41%   |
| TPCH-Q6  | 2.56   | 2.78        | I -7.77%   |
| TPCH-Q4  | 3.85   | 4.25        | I -9.42%   |
| TPCH-Q12 | 4.25   | 4.99        | I -14.78%  |
+----------+--------+-------------+------------+
Base commit: ff21728
File Format: orc/snap/block
Iterations: 30

Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
---
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
2 files changed, 78 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/18366/6
-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8004/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 Apr 2022 14:17:18 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8004/


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 Apr 2022 18:35:59 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 8: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8015/


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Apr 2022 11:34:58 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 8:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8016/


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 8
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Apr 2022 17:04:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 9:

> Patch Set 9: Verified-1
> 
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8021/

The failure is IMPALA-11161.


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Apr 2022 04:01:39 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 9: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8021/


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Apr 2022 03:43:15 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Quanlong Huang (Code Review)" <ge...@cloudera.org>.
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18366/6/be/src/exec/orc-column-readers.h
File be/src/exec/orc-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/18366/6/be/src/exec/orc-column-readers.h@355
PS6, Line 355:   Status ReadValue(int row_idx, Tuple* tuple, MemPool* pool) override WARN_UNUSED_RESULT {
Ah, this is still used by reading collections (i.e. array, map): 
https://github.com/apache/impala/blob/8e755e757101a85788f8c9dc8f292b471ba0752d/be/src/exec/orc-column-readers.cc#L716
https://github.com/apache/impala/blob/8e755e757101a85788f8c9dc8f292b471ba0752d/be/src/exec/orc-column-readers.cc#L814



-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 Apr 2022 23:51:19 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 7: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Apr 2022 06:58:49 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18366 )

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................


Patch Set 9: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Apr 2022 23:09:48 +0000
Gerrit-HasComments: No