Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/06/28 19:02:00 UTC
[jira] [Commented] (IMPALA-11279) Optimize count(*) queries for Iceberg tables
Commit f38c53235f1797f91ff9a65bb734d3f38f1aadc9 in impala's branch
refs/heads/master from LPL
[ <https://gitbox.apache.org/repos/asf?p=impala.git;h=f38c53235> ]
[IMPALA-11279](https://issues.apache.org/jira/browse/IMPALA-11279 "Optimize
count\(*\) queries for Iceberg tables"): Optimize plain count(*) queries for
Iceberg tables
This commit optimizes plain count(*) queries for Iceberg tables.
When `org.apache.iceberg.SnapshotSummary#TOTAL_RECORDS_PROP` can be
retrieved from the current `org.apache.iceberg.BaseSnapshot#summary` of
the Iceberg table, the row count is taken from that property, so this kind
of query can be very fast. If the property cannot be retrieved, the query
aggregates the `num_rows` of the Parquet `file_metadata_` as usual.
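The lookup-with-fallback described above can be sketched roughly as follows. This is not the code from the commit: the snapshot summary is modeled as a plain `Map<String, String>`, the per-file `num_rows` values as a `List<Long>`, and the class and method names are hypothetical. The key string `"total-records"` is the value of `SnapshotSummary.TOTAL_RECORDS_PROP` in Iceberg. Treating a malformed value the same as a missing one is an assumption of this sketch.

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical helper, not part of Impala or Iceberg.
final class SnapshotRowCountSketch {
    // Same string as org.apache.iceberg.SnapshotSummary#TOTAL_RECORDS_PROP.
    static final String TOTAL_RECORDS_PROP = "total-records";

    /** Returns the row count stored in the snapshot summary, if present and numeric. */
    static OptionalLong totalRecords(Map<String, String> snapshotSummary) {
        String value = snapshotSummary.get(TOTAL_RECORDS_PROP);
        if (value == null) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(value));
        } catch (NumberFormatException e) {
            // Assumption: a malformed value is handled like a missing one.
            return OptionalLong.empty();
        }
    }

    /** Fast path from the summary; otherwise sum the per-file num_rows values. */
    static long countStar(Map<String, String> snapshotSummary, List<Long> perFileNumRows) {
        OptionalLong fromSummary = totalRecords(snapshotSummary);
        if (fromSummary.isPresent()) {
            return fromSummary.getAsLong();
        }
        long total = 0;
        for (long numRows : perFileNumRows) {
            total += numRows;
        }
        return total;
    }
}
```

With a summary containing `total-records`, `countStar` returns that value without looking at the per-file counts; with an empty summary it sums the list instead.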
Queries that can be optimized need to meet the following requirements:
* SelectStmt does not have a WHERE clause
* SelectStmt does not have a GROUP BY clause
* SelectStmt does not have a HAVING clause
* The TableRefs of the FROM clause contain only one BaseTableRef
* The table is an Iceberg table
* SelectList must contain 'count(*)' or 'count(constant)'
* SelectList can contain other aggregate functions, e.g. min, sum, etc.
* SelectList can contain constants
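As an illustration only, the requirements above could be expressed as a single predicate. The `QueryShape` class, the `SelectItemKind` enum, and the `isEligible` method below are hypothetical stand-ins for Impala's analyzer classes, which this sketch does not attempt to reproduce.

```java
import java.util.List;

// Hypothetical model of the eligibility rules; not Impala's actual classes.
final class CountStarEligibilitySketch {
    enum SelectItemKind { COUNT_STAR, COUNT_CONSTANT, OTHER_AGGREGATE, CONSTANT }

    /** Simplified description of a SELECT statement (all fields are assumptions). */
    static final class QueryShape {
        final boolean hasWhere;
        final boolean hasGroupBy;
        final boolean hasHaving;
        final int tableRefCount;          // number of TableRefs in the FROM clause
        final boolean onlyRefIsBaseTable; // the single TableRef is a BaseTableRef
        final boolean tableIsIceberg;
        final List<SelectItemKind> selectList;

        QueryShape(boolean hasWhere, boolean hasGroupBy, boolean hasHaving,
                   int tableRefCount, boolean onlyRefIsBaseTable,
                   boolean tableIsIceberg, List<SelectItemKind> selectList) {
            this.hasWhere = hasWhere;
            this.hasGroupBy = hasGroupBy;
            this.hasHaving = hasHaving;
            this.tableRefCount = tableRefCount;
            this.onlyRefIsBaseTable = onlyRefIsBaseTable;
            this.tableIsIceberg = tableIsIceberg;
            this.selectList = selectList;
        }
    }

    /** True when every listed requirement holds for the given query shape. */
    static boolean isEligible(QueryShape q) {
        if (q.hasWhere || q.hasGroupBy || q.hasHaving) {
            return false;
        }
        if (q.tableRefCount != 1 || !q.onlyRefIsBaseTable || !q.tableIsIceberg) {
            return false;
        }
        // The select list must contain count(*) or count(constant); other
        // aggregates and constants are allowed alongside it.
        return q.selectList.contains(SelectItemKind.COUNT_STAR)
                || q.selectList.contains(SelectItemKind.COUNT_CONSTANT);
    }
}
```

Under this model, a statement shaped like `select count(*), min(c) from iceberg_tbl` would qualify, while adding a WHERE clause or dropping the count item would not.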
Testing:
* Added end-to-end test
* Existing tests
* Tested on a real cluster
Change-Id: I8e9c48bbba7ab2320fa80915e7001ce54f1ef6d9
Reviewed-on: <http://gerrit.cloudera.org:8080/18574>
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>