Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/06/28 19:02:00 UTC

[jira] [Commented] (IMPALA-11279) Optimize count(*) queries for Iceberg tables

[ASF subversion and git services](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=jira-bot) **commented** on [IMPALA-11279](https://issues.apache.org/jira/browse/IMPALA-11279)

[Re: Optimize count(*) queries for Iceberg tables](https://issues.apache.org/jira/browse/IMPALA-11279)

Commit f38c53235f1797f91ff9a65bb734d3f38f1aadc9 in impala's branch
refs/heads/master from LPL  
[ <https://gitbox.apache.org/repos/asf?p=impala.git;h=f38c53235> ]

[IMPALA-11279](https://issues.apache.org/jira/browse/IMPALA-11279 "Optimize count(*) queries for Iceberg tables"): Optimize plain count(*) queries for Iceberg tables

This commit optimizes plain count(*) queries on Iceberg tables.
When `org.apache.iceberg.SnapshotSummary#TOTAL_RECORDS_PROP` can be
retrieved from the current `org.apache.iceberg.BaseSnapshot#summary` of
the Iceberg table, the count is taken from that property, which makes this
kind of query very fast. If the property cannot be retrieved, the query
aggregates the `num_rows` of the Parquet `file_metadata_` as usual.
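
The lookup-then-fallback behaviour described above can be sketched as follows. This is a minimal illustration, not the code from the commit: the class `IcebergCountSketch` and its methods are hypothetical, a plain `Map<String, String>` stands in for the snapshot summary, and a list of per-file row counts stands in for the Parquet `num_rows` values. The only detail taken from Iceberg itself is the property key `"total-records"`, which is the value of `SnapshotSummary.TOTAL_RECORDS_PROP`.

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical helper for illustration only; not part of Impala or Iceberg.
final class IcebergCountSketch {
  // Same string as org.apache.iceberg.SnapshotSummary#TOTAL_RECORDS_PROP.
  static final String TOTAL_RECORDS_PROP = "total-records";

  // Returns the row count recorded in the snapshot summary, or empty if the
  // property is missing or is not a non-negative integer.
  static OptionalLong totalRecordsFromSummary(Map<String, String> summary) {
    if (summary == null) return OptionalLong.empty();
    String value = summary.get(TOTAL_RECORDS_PROP);
    if (value == null) return OptionalLong.empty();
    try {
      long n = Long.parseLong(value.trim());
      return n >= 0 ? OptionalLong.of(n) : OptionalLong.empty();
    } catch (NumberFormatException e) {
      return OptionalLong.empty();
    }
  }

  // Fast path: use the summary value. Fallback: sum the per-file row counts
  // (assumed here to stand in for the num_rows of each Parquet file).
  static long countStar(Map<String, String> summary, List<Long> perFileNumRows) {
    OptionalLong fromSummary = totalRecordsFromSummary(summary);
    if (fromSummary.isPresent()) return fromSummary.getAsLong();
    long total = 0;
    for (long n : perFileNumRows) total += n;
    return total;
  }
}
```

In this sketch, a missing or malformed summary value is treated the same way: the caller falls back to summing file-level counts rather than failing the query.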

Queries that can be optimized need to meet the following requirements:

  * The SelectStmt does not have a WHERE clause
  * The SelectStmt does not have a GROUP BY clause
  * The SelectStmt does not have a HAVING clause
  * The TableRefs of the FROM clause contain only one BaseTableRef
  * The table is an Iceberg table
  * SelectList must contain 'count(*)' or 'count(constant)'
  * SelectList can contain other aggregate functions, e.g. min, sum
  * SelectList can contain constants
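
The requirements above can be read as a single predicate over the shape of the query. The sketch below is an assumption-laden illustration: `CountStarEligibilitySketch`, its `Query` class, and the `ItemKind` enum are hypothetical stand-ins and do not mirror Impala's actual `SelectStmt` or analyzer classes.

```java
import java.util.List;

// Hypothetical model for illustration only; not Impala's analyzer types.
final class CountStarEligibilitySketch {
  // The kinds of select-list items the requirements mention.
  enum ItemKind { COUNT_STAR, COUNT_CONSTANT, OTHER_AGG, CONSTANT }

  // Simplified description of a SELECT statement.
  static final class Query {
    boolean hasWhere;
    boolean hasGroupBy;
    boolean hasHaving;
    int tableRefCount = 1;              // number of TableRefs in FROM
    boolean soleRefIsBaseTable = true;  // the single TableRef is a BaseTableRef
    boolean isIcebergTable = true;
    List<ItemKind> selectList = List.of();
  }

  // True only when every listed requirement holds.
  static boolean isEligible(Query q) {
    if (q.hasWhere || q.hasGroupBy || q.hasHaving) return false;
    if (q.tableRefCount != 1 || !q.soleRefIsBaseTable) return false;
    if (!q.isIcebergTable) return false;
    // At least one count(*) or count(constant) must be present; other
    // aggregates and constants are allowed alongside it.
    for (ItemKind kind : q.selectList) {
      if (kind == ItemKind.COUNT_STAR || kind == ItemKind.COUNT_CONSTANT) {
        return true;
      }
    }
    return false;
  }
}
```

Under this model, a select list of `count(*)` plus `min(...)` with no WHERE clause would qualify, while the same query with a WHERE clause, or a select list holding only `sum(...)`, would not.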

Testing:

  * Added an end-to-end test
  * Ran existing tests
  * Tested in a real cluster

Change-Id: I8e9c48bbba7ab2320fa80915e7001ce54f1ef6d9  
Reviewed-on: <http://gerrit.cloudera.org:8080/18574>  
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>  
Tested-by: Impala Public Jenkins <im...@cloudera.com>  
  