Posted to reviews@impala.apache.org by "Riza Suminto (Code Review)" <ge...@cloudera.org> on 2022/04/01 03:20:46 UTC

[Impala-ASF-CR] IMPALA-11123: Optimize count(star) for ORC scans

Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/18327 )

Change subject: IMPALA-11123: Optimize count(star) for ORC scans
......................................................................


Patch Set 9:

(2 comments)

There are two places that I'm still unsure about:

http://gerrit.cloudera.org:8080/#/c/18327/9/be/src/exec/hdfs-columnar-scanner.cc
File be/src/exec/hdfs-columnar-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/18327/9/be/src/exec/hdfs-columnar-scanner.cc@319
PS9, Line 319:   COUNTER_ADD(scan_node_->rows_read_counter(), num_rows);
We might want to remove this counter increment to avoid confusion.
This counter implies that we're iterating over rows when in fact we're not.
The same applies to the one in GetNextWithTemplateTuple().


http://gerrit.cloudera.org:8080/#/c/18327/9/tests/query_test/test_aggregation.py
File tests/query_test/test_aggregation.py:

http://gerrit.cloudera.org:8080/#/c/18327/9/tests/query_test/test_aggregation.py@260
PS9, Line 260: test_parquet_count_star_optimization
I think I want to move this test, along with test_kudu_count_star_optimization, test_orc_count_star_optimization, and test_sampled_ndv, into a standalone class.
The reason is that they are really intended to run just once, and we currently enforce that by calling pytest.skip().
If I run TestAggregationQueries with exhaustive exploration, I see several lines of skipped tests. Some of them are also slow, I suspect because they need to create unique_database first, only to be skipped later.
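
To illustrate the difference, here is a rough, framework-free sketch in plain Python. It does not use Impala's test infrastructure or pytest; the dimension values (FILE_FORMATS, BATCH_SIZES) and both function names are hypothetical and only model the idea of "generate every combination and skip at run time" versus "generate only the one combination the test needs".

```python
from itertools import product

# Hypothetical dimensions; the real matrix for TestAggregationQueries differs.
FILE_FORMATS = ["text", "parquet", "orc", "kudu"]
BATCH_SIZES = [0, 1, 16]


def run_with_runtime_skip(wanted_format):
    """Model of the current approach: every combination is generated,
    and all but one are skipped at run time."""
    ran, skipped = [], []
    for fmt, batch_size in product(FILE_FORMATS, BATCH_SIZES):
        # Any per-test setup (suspected: creating unique_database) would
        # already have been paid for before this skip decision is reached.
        if fmt == wanted_format and batch_size == 0:
            ran.append((fmt, batch_size))
        else:
            skipped.append((fmt, batch_size))
    return ran, skipped


def run_standalone(wanted_format):
    """Model of the proposed approach: a standalone class generates only
    the single combination it needs, so nothing is skipped."""
    return [(wanted_format, 0)], []


if __name__ == "__main__":
    for name, runner in [("runtime skip", run_with_runtime_skip),
                         ("standalone", run_standalone)]:
        ran, skipped = runner("orc")
        print(f"{name}: {len(ran)} run, {len(skipped)} skipped")
```

In this toy model the same single test case runs either way; the standalone variant simply never produces the skipped entries or their setup cost.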



-- 
To view, visit http://gerrit.cloudera.org:8080/18327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fafa1182f97323aeb9ee39dd4e8ecd418fa6091
Gerrit-Change-Number: 18327
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto <ri...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Riza Suminto <ri...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 Apr 2022 03:20:46 +0000
Gerrit-HasComments: Yes