Posted to reviews@impala.apache.org by "Amogh Margoor (Code Review)" <ge...@cloudera.org> on 2021/11/01 17:51:22 UTC
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 )
Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
......................................................................
Patch Set 19:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/17860/12//COMMIT_MSG
Commit Message:
http://gerrit.cloudera.org:8080/#/c/17860/12//COMMIT_MSG@24
PS12, Line 24: TPCH scale 42
> I think it would be good to execute the whole benchmark with bin/single_nod
Hi Zoltan,
Sorry for the delay with the benchmark. I ran the entire TPCH benchmark at scale 42. This is the summary of the report (Delta is the change relative to the baseline).
Report Generated on 2021-10-28
Run Description: "78ce235db6d5b720f3e3319ff571a2da054a2602 vs c46d765dccd5739c848d8c1c82043e72394b8397"
Cluster Name: UNKNOWN
Lab Run Info: UNKNOWN
Impala Version: impalad version 4.1.0-SNAPSHOT RELEASE (2021-10-28)
Baseline Impala Version: impalad version 4.1.0-SNAPSHOT RELEASE (2021-10-27)
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / none / none | 12.83   | -1.54%     | 8.26       | -1.48%         |
+----------+-----------------------+---------+------------+------------+----------------+
There is a very slight improvement overall and major improvements in these 2 queries:
(I) Improvement: TPCH(42) TPCH-Q6 [parquet / none / none] (1.85s -> 1.72s [-7.30%])
+--------------+------------+-------+----------+------------+-----------+-------+----------+------------+--------+-------+-------+-----------+
| Operator     | % of Query | Avg   | Base Avg | Delta(Avg) | StdDev(%) | Max   | Base Max | Delta(Max) | #Hosts | #Inst | #Rows | Est #Rows |
+--------------+------------+-------+----------+------------+-----------+-------+----------+------------+--------+-------+-------+-----------+
| 00:SCAN HDFS | 94.83%     | 1.50s | 1.62s    | -7.75%     | 2.07%     | 1.56s | 1.73s    | -9.58%     | 1      | 1     | 4.79M | 29.96M    |
+--------------+------------+-------+----------+------------+-----------+-------+----------+------------+--------+-------+-------+-----------+
(I) Improvement: TPCH(42) TPCH-Q19 [parquet / none / none] (4.73s -> 4.18s [-11.72%])
+--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+--------+-----------+
| Operator     | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%) | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows  | Est #Rows |
+--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+--------+-----------+
| 01:SCAN HDFS | 22.68%     | 729.91ms | 736.69ms | -0.92%     | 1.61%     | 751.55ms | 747.34ms | +0.56%     | 1      | 1     | 20.33K | 1.50M     |
| 00:SCAN HDFS | 74.84%     | 2.41s    | 2.97s    | -18.98%    | 0.67%     | 2.44s    | 3.00s    | -18.70%    | 1      | 1     | 13.07K | 29.96M    |
+--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+--------+-----------+
No regressions were reported as such, just these 2 improvements and a couple of queries with high variability in runtime (not related to our change).
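As a rough sanity check, the Delta figures can be approximated from the rounded timings printed in the report above. The snippet below is only a sketch: `delta_pct` is a hypothetical helper, not a function from Impala's benchmark scripts, and it assumes Delta means (new - base) / base. Because the inputs here are already rounded to two decimals, the results land close to, but not exactly on, the reported percentages, which are presumably computed from unrounded timings.

```python
def delta_pct(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent (negative means faster).

    Hypothetical helper for illustration; not part of Impala's tooling.
    """
    if base == 0:
        raise ValueError("baseline must be non-zero")
    return (new - base) / base * 100.0


# Rounded timings copied from the report above: (baseline seconds, new seconds).
timings = {
    "TPCH-Q6": (1.85, 1.72),
    "TPCH-Q19": (4.73, 4.18),
}

for query, (base, new) in timings.items():
    # Prints each query with the recomputed delta in the report's own layout.
    print(f"{query}: {base:.2f}s -> {new:.2f}s [{delta_pct(base, new):+.2f}%]")
```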
--
To view, visit http://gerrit.cloudera.org:8080/17860
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Gerrit-Change-Number: 17860
Gerrit-PatchSet: 19
Gerrit-Owner: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Amogh Margoor <am...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <bo...@cloudera.com>
Gerrit-Comment-Date: Mon, 01 Nov 2021 17:51:22 +0000
Gerrit-HasComments: Yes