Posted to issues@impala.apache.org by "Zoltán Borók-Nagy (Jira)" <ji...@apache.org> on 2022/04/12 12:37:00 UTC

[jira] [Created] (IMPALA-11238) Avoid the need for COMPUTE STATS for Iceberg tables

Zoltán Borók-Nagy created IMPALA-11238:
------------------------------------------

             Summary: Avoid the need for COMPUTE STATS for Iceberg tables
                 Key: IMPALA-11238
                 URL: https://issues.apache.org/jira/browse/IMPALA-11238
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
            Reporter: Zoltán Borók-Nagy


We still need to issue COMPUTE STATS for Iceberg tables to do proper planning.
The main reason is that Iceberg metadata lacks table-level NDV (number of distinct values) information for columns.

There are plans in Iceberg to store HyperLogLog arrays for data files; once those are available, we could use them to derive NDVs.

Until then, we could perhaps apply some heuristics to Iceberg metadata when no precise NDV is available.
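
As an illustration only, one possible heuristic is sketched below. It is not the approach Impala takes, and the issue does not specify one. It assumes an integer column and relies on the per-file value counts, null counts, and lower/upper bounds that Iceberg manifests can record. The class ColumnFileStats and the function estimate_ndv_upper_bound are hypothetical names made up for this sketch. The result is only an upper bound on the NDV: it can never exceed the number of non-null values, nor the width of the [min, max] range.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ColumnFileStats:
    """Hypothetical per-file stats for one integer column (not an Iceberg API)."""
    value_count: int             # total values in the file, including nulls
    null_count: int              # null values in the file
    lower_bound: Optional[int]   # smallest non-null value, if recorded
    upper_bound: Optional[int]   # largest non-null value, if recorded


def estimate_ndv_upper_bound(files: Iterable[ColumnFileStats]) -> int:
    """Return an upper bound on the column's NDV across all files."""
    non_null = 0
    lo: Optional[int] = None
    hi: Optional[int] = None
    range_known = True
    for f in files:
        file_non_null = max(f.value_count - f.null_count, 0)
        if file_non_null == 0:
            continue  # an all-null file adds no distinct values
        non_null += file_non_null
        if f.lower_bound is None or f.upper_bound is None:
            range_known = False  # cannot bound by range without full bounds
            continue
        lo = f.lower_bound if lo is None else min(lo, f.lower_bound)
        hi = f.upper_bound if hi is None else max(hi, f.upper_bound)
    if non_null == 0:
        return 0
    if range_known and lo is not None and hi is not None:
        # An integer column cannot hold more distinct values than its range.
        return min(non_null, hi - lo + 1)
    return non_null


files = [
    ColumnFileStats(value_count=100, null_count=0, lower_bound=1, upper_bound=10),
    ColumnFileStats(value_count=50, null_count=5, lower_bound=5, upper_bound=20),
]
estimate = estimate_ndv_upper_bound(files)
```

Such a bound would be loose for sparse or skewed columns, so it could at best serve as a fallback when no computed statistics exist.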



--
This message was sent by Atlassian Jira
(v8.20.1#820001)