Posted to issues@spark.apache.org by "Kazuyuki Tanimura (Jira)" <ji...@apache.org> on 2022/03/16 19:35:00 UTC

[jira] [Created] (SPARK-38573) Support Partition Level Statistics Collection

Kazuyuki Tanimura created SPARK-38573:
-----------------------------------------

             Summary: Support Partition Level Statistics Collection
                 Key: SPARK-38573
                 URL: https://issues.apache.org/jira/browse/SPARK-38573
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: Kazuyuki Tanimura


Currently, SPARK-21127 (https://issues.apache.org/jira/browse/SPARK-21127) supports storing aggregated stats at the table level for partitioned tables when the config spark.sql.statistics.size.autoUpdate.enabled is turned on.

Partition-level stats are useful for identifying outlier (skewed) partitions, and the query optimizer can make better decisions with partition-level stats when partition pruning applies.
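
To illustrate the two benefits, here is a minimal sketch in plain Python (not Spark code). It assumes per-partition sizes in bytes are already available. The partition names, the sizes, the helper names, and the "5x the median" skew threshold are all hypothetical and chosen only for illustration.

```python
from statistics import median


def find_skewed_partitions(size_by_partition, factor=5.0):
    """Return (partition, size) pairs larger than factor x the median size,
    largest first. The factor is an illustrative assumption."""
    if not size_by_partition:
        return []
    med = median(size_by_partition.values())
    skewed = [(p, s) for p, s in size_by_partition.items() if s > factor * med]
    return sorted(skewed, key=lambda ps: ps[1], reverse=True)


def estimated_scan_bytes(size_by_partition, selected_partitions):
    """Estimate bytes scanned after pruning down to the selected partitions."""
    return sum(size_by_partition[p] for p in selected_partitions)


# Hypothetical per-partition sizes in bytes (not real measurements).
partition_sizes = {
    "dt=2022-03-13": 120_000_000,
    "dt=2022-03-14": 110_000_000,
    "dt=2022-03-15": 2_400_000_000,
    "dt=2022-03-16": 130_000_000,
}

# Outlier detection: only possible when sizes are kept per partition.
skewed = find_skewed_partitions(partition_sizes)

# With table-level stats only, a pruned query still sees the aggregated size.
table_total = sum(partition_sizes.values())

# With partition-level stats, the estimate covers only the surviving partition.
pruned_estimate = estimated_scan_bytes(partition_sizes, ["dt=2022-03-16"])
```

In this sketch a query that prunes down to dt=2022-03-16 would be estimated at the size of that one partition rather than the table total, and the oversized dt=2022-03-15 partition is flagged as skewed. Neither result can be derived from the aggregated table-level number alone.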


--
This message was sent by Atlassian Jira
(v8.20.1#820001)
