Posted to issues@impala.apache.org by "Gergely Fürnstáhl (Jira)" <ji...@apache.org> on 2022/09/12 13:54:00 UTC

[jira] [Created] (IMPALA-11577) Optimize getting stored file types for Iceberg tables

Gergely Fürnstáhl created IMPALA-11577:
------------------------------------------

             Summary: Optimize getting stored file types for Iceberg tables
                 Key: IMPALA-11577
                 URL: https://issues.apache.org/jira/browse/IMPALA-11577
             Project: IMPALA
          Issue Type: Improvement
            Reporter: Gergely Fürnstáhl


Impala supports mixed file formats for Iceberg tables, which means each data file can have a different file format. Impala uses the set of file formats present in the table for planning purposes. Currently Impala goes through every file's metadata to aggregate this information, which can be slow if there are lots of data files.

We could optimize this by storing the aggregated information somewhere, e.g. in Iceberg's snapshot summary (yet to be implemented): https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/SnapshotSummary.java
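
A rough sketch of the idea, in Python for brevity. This is not Impala or Iceberg code: DataFile, Snapshot, the "file-formats" summary key and all function names are hypothetical and exist only for illustration. The one thing taken from Iceberg is that a snapshot summary is a string-to-string map, so the format set is stored as a comma-separated string.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for one data file's metadata."""
    path: str
    file_format: str  # e.g. "PARQUET", "ORC", "AVRO"


@dataclass
class Snapshot:
    """Hypothetical stand-in for a table snapshot with a string-to-string summary."""
    files: list = field(default_factory=list)
    summary: dict = field(default_factory=dict)


# Assumed key name; no such summary property exists in Iceberg today.
FORMATS_KEY = "file-formats"


def scan_file_formats(files):
    """Full scan: visits every file's metadata, O(number of files)."""
    return {f.file_format for f in files}


def append_files(snapshot, new_files):
    """Return a new snapshot whose summary carries the aggregated format set.

    Only the newly added files are inspected; the previous set is read
    back from the existing summary.
    """
    known = set(filter(None, snapshot.summary.get(FORMATS_KEY, "").split(",")))
    known |= scan_file_formats(new_files)
    summary = dict(snapshot.summary)
    summary[FORMATS_KEY] = ",".join(sorted(known))
    return Snapshot(files=snapshot.files + list(new_files), summary=summary)


def stored_file_formats(snapshot):
    """Read the format set from the summary; fall back to a full scan if absent."""
    raw = snapshot.summary.get(FORMATS_KEY)
    if raw is None:
        return scan_file_formats(snapshot.files)
    return set(filter(None, raw.split(",")))
```

With this shape, planning reads one summary entry instead of iterating over all data files. Note that the sketch only handles appends: if files are removed, the stored set may become a superset of the formats actually present unless per-format file counts are tracked as well.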



--
This message was sent by Atlassian Jira
(v8.20.10#820010)