Posted to issues@spark.apache.org by "Jackey Lee (Jira)" <ji...@apache.org> on 2023/02/20 02:02:00 UTC

[jira] (SPARK-38427) DataFilter pushed down with PartitionFilter for Orc

    [ https://issues.apache.org/jira/browse/SPARK-38427 ]


    Jackey Lee deleted comment on SPARK-38427:
    ------------------------------------

was (Author: jackey lee):
[~LuciferYang] 

> DataFilter pushed down with PartitionFilter for Orc
> ---------------------------------------------------
>
>                 Key: SPARK-38427
>                 URL: https://issues.apache.org/jira/browse/SPARK-38427
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Jackey Lee
>            Priority: Major
>
> At present, for the ORC data source, a Filter is split into a DataFilter and a PartitionFilter when it is pushed down. Because the PartitionFilter is removed from the pushed-down Filter, every partition is scanned against all of the DataFilter conditions, which may cause a full data scan.
> Based on SPARK-38041, we can push down the DataFilter together with the PartitionFilter to ORC, and remove the PartitionFilter at runtime.
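
One reading of the description above can be sketched with a toy predicate model. This is not Spark or ORC code: the tuple encoding, the helper names (simplify, strip_partition, bind_partition), and the example columns p (partition) and a (data) are all hypothetical, chosen only to show why dropping the partition conditions up front can leave a weaker filter than resolving them per partition at runtime.

```python
# Toy model only: these tuples and helpers are hypothetical, not Spark or ORC APIs.
# A predicate is one of:
#   ("cmp", column, op, value)          with op in {"=", "<", ">"}
#   ("and", left, right) / ("or", left, right)
#   True / False                        (an already-decided predicate)
import operator

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def simplify(kind, left, right):
    """Fold boolean constants out of an AND/OR node."""
    if kind == "and":
        if left is False or right is False:
            return False
        if left is True:
            return right
        if right is True:
            return left
    else:  # "or"
        if left is True or right is True:
            return True
        if left is False:
            return right
        if right is False:
            return left
    return (kind, left, right)


def strip_partition(pred, part_cols):
    """Drop partition predicates up front by replacing them with True.

    This is conservative: the result never rejects a row the original accepts.
    """
    if isinstance(pred, bool):
        return pred
    if pred[0] == "cmp":
        return True if pred[1] in part_cols else pred
    kind, left, right = pred
    return simplify(kind,
                    strip_partition(left, part_cols),
                    strip_partition(right, part_cols))


def bind_partition(pred, part_values):
    """Resolve partition predicates against one partition's known values."""
    if isinstance(pred, bool):
        return pred
    if pred[0] == "cmp":
        _, col, op, value = pred
        return OPS[op](part_values[col], value) if col in part_values else pred
    kind, left, right = pred
    return simplify(kind,
                    bind_partition(left, part_values),
                    bind_partition(right, part_values))


# (p = 1 AND a > 10) OR (p = 2 AND a < 5)
full = ("or",
        ("and", ("cmp", "p", "=", 1), ("cmp", "a", ">", 10)),
        ("and", ("cmp", "p", "=", 2), ("cmp", "a", "<", 5)))

# Same data filter for every partition: (a > 10) OR (a < 5)
stripped = strip_partition(full, {"p"})

# Per-partition filters once the partition value is known
for_p1 = bind_partition(full, {"p": 1})  # only a > 10 remains
for_p3 = bind_partition(full, {"p": 3})  # nothing can match
```

In this sketch the stripped filter applies both data conditions to every partition, while the per-partition version keeps only the condition relevant to p = 1 and reduces to False for p = 3, so that partition's data would not need to be read at all.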



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org