Posted to reviews@impala.apache.org by "Tim Armstrong (Code Review)" <ge...@cloudera.org> on 2019/09/03 06:27:10 UTC

[Impala-ASF-CR] IMPALA-5802: use mt scan node for all formats

Tim Armstrong has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/14171 )

Change subject: IMPALA-5802: use mt scan node for all formats
......................................................................

IMPALA-5802: use mt scan node for all formats

This removes the special-casing of the sequence-based
file formats (Avro, RC, Seq), which used the
legacy scan node instead of the multi-threaded
scan node when mt_dop was enabled.

There was no particular reason to do this: the
code paths are all already in use and should be more
resource-efficient with multithreading enabled.

Testing:
Updated planner tests to reflect that the MT scan
node is used. Removed PARALLELPLANS for the Hive
3 Avro test because it does not provide
important coverage and would otherwise have
required updating.

Performance:
Some targeted benchmarks showed no difference in
performance.

Query:
set mt_dop=4; select min(l_orderkey), min(l_comment) from lineitem;

tpch_avro Before: 0.51 0.41 0.51 0.41 0.51
tpch_avro After: 0.41 0.41 0.41 0.41 0.41
tpch_rc Before: 0.31 0.31 0.31 0.31 0.31
tpch_rc After: 0.31 0.31 0.31 0.31 0.31
tpch_seq_gzip Before: 2.32 2.22 2.22 2.22 2.32
tpch_seq_gzip After: 2.22 2.22 2.22 2.32 2.32

Query:
unset mt_dop; compute stats lineitem;

tpch_avro Before: 1.21 1.21 1.21 1.21 1.21
tpch_avro After: 1.21 1.31 1.21 1.31 1.21
tpch_rc Before: 1.31 1.41 1.31 1.31 1.31
tpch_rc After: 1.31 1.41 1.31 1.31 1.31
tpch_seq_gzip Before: 2.82 2.72 2.71 2.92 2.71
tpch_seq_gzip After: 2.82 2.82 2.81 2.71 2.92
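
For readers who want to compare the runs above at a
glance, the following is a minimal sketch that
averages the five timings per table and reports the
relative change. It is not part of the patch; the
names MT_DOP_4, COMPUTE_STATS and summarize are
hypothetical, and the numbers are copied from the
runs listed above, in the units reported there.

```python
from statistics import mean

# Timings copied verbatim from the benchmark runs above.
# Each entry maps a table to (before_runs, after_runs).
MT_DOP_4 = {
    "tpch_avro": ([0.51, 0.41, 0.51, 0.41, 0.51],
                  [0.41, 0.41, 0.41, 0.41, 0.41]),
    "tpch_rc": ([0.31, 0.31, 0.31, 0.31, 0.31],
                [0.31, 0.31, 0.31, 0.31, 0.31]),
    "tpch_seq_gzip": ([2.32, 2.22, 2.22, 2.22, 2.32],
                      [2.22, 2.22, 2.22, 2.32, 2.32]),
}

COMPUTE_STATS = {
    "tpch_avro": ([1.21, 1.21, 1.21, 1.21, 1.21],
                  [1.21, 1.31, 1.21, 1.31, 1.21]),
    "tpch_rc": ([1.31, 1.41, 1.31, 1.31, 1.31],
                [1.31, 1.41, 1.31, 1.31, 1.31]),
    "tpch_seq_gzip": ([2.82, 2.72, 2.71, 2.92, 2.71],
                      [2.82, 2.82, 2.81, 2.71, 2.92]),
}


def summarize(runs):
    """Return {table: (mean_before, mean_after, relative_change)}."""
    summary = {}
    for table, (before, after) in runs.items():
        mean_before = mean(before)
        mean_after = mean(after)
        # Negative values mean the "after" runs were faster on average.
        relative_change = (mean_after - mean_before) / mean_before
        summary[table] = (mean_before, mean_after, relative_change)
    return summary


if __name__ == "__main__":
    for label, runs in (("mt_dop=4", MT_DOP_4),
                        ("compute stats", COMPUTE_STATS)):
        for table, (mb, ma, change) in summarize(runs).items():
            print(f"{label:14s} {table:14s} "
                  f"before={mb:.3f} after={ma:.3f} change={change:+.1%}")
```

The individual timings appear to move in steps of
roughly 0.1, so a difference of one step between
runs is probably within measurement noise.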

Change-Id: I8a91d2e5c2ebb617b7643cd676cb3490c190a68a
---
M be/src/exec/exec-node.cc
M be/src/exec/hdfs-scan-node-mt.cc
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
5 files changed, 17 insertions(+), 74 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/14171/3
-- 
To view, visit http://gerrit.cloudera.org:8080/14171

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8a91d2e5c2ebb617b7643cd676cb3490c190a68a
Gerrit-Change-Number: 14171
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong <ta...@cloudera.com>