You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Quanlong Huang (Code Review)" <ge...@cloudera.org> on 2022/04/01 01:38:41 UTC

[Impala-ASF-CR] IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

Hello Riza Suminto, Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18366

to look at the new patch set (#6).

Change subject: IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue
......................................................................

IMPALA-11204: Template implementation for OrcStringColumnReader::ReadValue

There are some checks in OrcStringColumnReader::ReadValue() that we can
determine outside the scope of this method. They should be optimized
since this is a critical method that will be executed for each row (and
for each string column). With these checks, the method is too complex to
be inlined in OrcBatchedReader::ReadValueBatch() by the compiler.

This patch templates OrcStringColumnReader::ReadValue() with two
parameters, one for the target slot type (i.e. STRING/CHAR/VARCHAR),
and the other one for whether the column is dictionary encoded. Also
adds an ALWAYS_INLINE marker to force inlining it.

OrcStringColumnReader::ReadValueBatch() will call a template version of
ReadValue() based on the slot type and the orc batch encoded state.

Ran a single node perf test on TPCH(30) on my dev box using 3 impalad
instances. There are some improvements and no significant regressions:
+----------+--------+-------------+------------+
| Query    | Avg(s) | Base Avg(s) | Delta(Avg) |
+----------+--------+-------------+------------+
| TPCH-Q19 | 5.62   | 6.07        | I -7.41%   |
| TPCH-Q6  | 2.56   | 2.78        | I -7.77%   |
| TPCH-Q4  | 3.85   | 4.25        | I -9.42%   |
| TPCH-Q12 | 4.25   | 4.99        | I -14.78%  |
+----------+--------+-------------+------------+
Base commit: ff21728
File Format: orc/snap/block
Iterations: 30

Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
---
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
2 files changed, 78 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/18366/6
-- 
To view, visit http://gerrit.cloudera.org:8080/18366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5e5f88c28059fb3d3ac1172e6d383d06ee3bedd5
Gerrit-Change-Number: 18366
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>