Posted to issues@spark.apache.org by "Hu Fuwang (Jira)" <ji...@apache.org> on 2020/01/15 06:41:00 UTC

[jira] [Updated] (SPARK-30516) FileScan.estimateStatistics does not take partitionFilters and partition number into account

     [ https://issues.apache.org/jira/browse/SPARK-30516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hu Fuwang updated SPARK-30516:
------------------------------
    Description: Currently, FileScan.estimateStatistics does not take partitionFilters into account, which may lead to an overestimated sizeInBytes. It would be reasonable to change it to take partitionFilters and the number of partitions into account when estimating the statistics. (was: Currently, FileScan.estimateStatistics does not take partitionFilters into account, which may lead to an overestimated sizeInBytes.

It would be reasonable to change it to take partitionFilters and the number of partitions into account when estimating the statistics.)

> FileScan.estimateStatistics does not take partitionFilters and partition number into account
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-30516
>                 URL: https://issues.apache.org/jira/browse/SPARK-30516
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.0
>            Reporter: Hu Fuwang
>            Priority: Major
>
> Currently, FileScan.estimateStatistics does not take partitionFilters into account, which may lead to an overestimated sizeInBytes. It would be reasonable to change it to take partitionFilters and the number of partitions into account when estimating the statistics.
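
To illustrate the size difference the description refers to, here is a minimal, self-contained sketch in plain Python. It is not Spark code and not the proposed patch: PartitionDir, estimate_size_in_bytes, the "dt" partition column, and the file sizes are all hypothetical names and numbers chosen for the example. It only assumes that a size estimate is the sum of file sizes over the partitions that are actually scanned.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class PartitionDir:
    """Hypothetical stand-in for one partition directory of a file-based table."""
    values: Dict[str, str]  # partition column -> value, e.g. {"dt": "2020-01-15"}
    file_sizes: List[int]   # sizes in bytes of the files in this partition


# A partition filter is modeled as a predicate over the partition values.
PartitionFilter = Callable[[Dict[str, str]], bool]


def estimate_size_in_bytes(
    partitions: Sequence[PartitionDir],
    partition_filters: Sequence[PartitionFilter] = (),
) -> int:
    """Sum file sizes over the partitions that pass every filter.

    With no filters, every partition is counted, which mirrors an
    estimate that ignores partition pruning.
    """
    selected = [
        p for p in partitions
        if all(keep(p.values) for keep in partition_filters)
    ]
    return sum(sum(p.file_sizes) for p in selected)


# Made-up table layout: three daily partitions.
partitions = [
    PartitionDir({"dt": "2020-01-13"}, [100, 200]),
    PartitionDir({"dt": "2020-01-14"}, [300]),
    PartitionDir({"dt": "2020-01-15"}, [400, 500]),
]

# Estimate that ignores partition filters: all files are counted.
unfiltered = estimate_size_in_bytes(partitions)

# Estimate that applies the filter dt = '2020-01-15' before summing.
filtered = estimate_size_in_bytes(
    partitions, [lambda values: values["dt"] == "2020-01-15"]
)

print(unfiltered, filtered)  # prints "1500 900"
```

In this toy layout the unfiltered estimate is 1500 bytes while only 900 bytes would actually be read, which is the kind of gap the issue proposes to close.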


