Posted to issues@spark.apache.org by "Hu Fuwang (Jira)" <ji...@apache.org> on 2020/01/15 07:02:00 UTC

[jira] [Updated] (SPARK-30516) statistic estimation of FileScan should take partitionFilters and partition number into account

     [ https://issues.apache.org/jira/browse/SPARK-30516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hu Fuwang updated SPARK-30516:
------------------------------
    Summary: statistic estimation of FileScan should take partitionFilters and partition number into account (was: FileScan.estimateStatistics does not take partitionFilters and partition number into account)

> statistic estimation of FileScan should take partitionFilters and partition number into account
> -----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-30516
>                 URL: https://issues.apache.org/jira/browse/SPARK-30516
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Hu Fuwang
>            Priority: Major
>
> Currently, FileScan.estimateStatistics does not take partitionFilters or the number of partitions into account, which may lead to an overestimated sizeInBytes. It would be reasonable to change it to consider partitionFilters and the number of partitions when estimating the statistics.
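
The effect described above can be illustrated with a small toy model. This is only a sketch in Python, not Spark code: `PartitionDir`, `estimate_size_in_bytes`, and the sample sizes are hypothetical names and values invented for illustration, and they do not correspond to Spark's actual FileScan API. It shows how an estimate that sums every partition's files comes out larger than one that first applies the partition filter.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class PartitionDir:
    """Hypothetical stand-in for one partition directory and its files."""
    values: Dict[str, str]   # partition column -> value, e.g. {"dt": "2020-01-15"}
    file_sizes: List[int]    # size in bytes of each file in the partition


def estimate_size_in_bytes(
    partitions: List[PartitionDir],
    partition_filter: Optional[Callable[[Dict[str, str]], bool]] = None,
) -> int:
    """Sum file sizes, counting only partitions that pass the filter.

    With no filter, every partition is counted, which mirrors the
    overestimate described in the issue.
    """
    selected = [
        p for p in partitions
        if partition_filter is None or partition_filter(p.values)
    ]
    return sum(sum(p.file_sizes) for p in selected)


# Made-up sample data: three daily partitions.
partitions = [
    PartitionDir({"dt": "2020-01-13"}, [100, 200]),
    PartitionDir({"dt": "2020-01-14"}, [300]),
    PartitionDir({"dt": "2020-01-15"}, [400, 50]),
]

# Estimate that ignores the partition filter (all partitions counted).
unpruned = estimate_size_in_bytes(partitions)

# Estimate that applies a filter equivalent to dt = '2020-01-15'.
pruned = estimate_size_in_bytes(partitions, lambda v: v["dt"] == "2020-01-15")
```

In this toy data the filtered estimate covers one of three partitions, so it is well below the unfiltered one. A size estimate like this typically matters to a query optimizer when it compares sizeInBytes against thresholds, for example when choosing a join strategy.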


