Posted to issues@spark.apache.org by "Asif (Jira)" <ji...@apache.org> on 2021/09/28 17:14:00 UTC

[jira] [Created] (SPARK-36878) Optimization in PushDownPredicates to push all filters in a single iteration has broken some optimizations in PruneFilter rule

Asif created SPARK-36878:
----------------------------

             Summary: Optimization in PushDownPredicates to push all filters in a single iteration has broken some optimizations in PruneFilter rule
                 Key: SPARK-36878
                 URL: https://issues.apache.org/jira/browse/SPARK-36878
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.1.1
            Reporter: Asif


It appears that the optimization in the PushDownPredicates rule, which tries to push all filters down in a single pass to reduce iterations, has broken the PruneFilters rule's substitution of an EmptyRelation when the filter condition is composite and statically evaluates to false, either because one of the non-redundant predicates is Literal(false) or because all of the non-redundant predicates are null.

The new PushDownPredicates rule is created by chaining CombineFilters, PushPredicateThroughNonJoin and PushPredicateThroughJoin, so individual filters get combined into a single filter while being pushed down.
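As a rough illustration of the combining step, here is a hypothetical, heavily simplified sketch (not the real Catalyst API; Plan, Relation, Filter and combineFilters are invented names) of what CombineFilters effectively does while predicates are pushed down: two adjacent Filter nodes are merged into one whose condition is a conjunction.

```scala
// Toy plan model -- NOT Spark's actual LogicalPlan classes.
sealed trait Plan
case class Relation(name: String) extends Plan
case class Filter(cond: String, child: Plan) extends Plan

// Collapse stacked Filter nodes into a single Filter with an AND-ed condition,
// mimicking the effect CombineFilters has inside PushDownPredicates.
def combineFilters(plan: Plan): Plan = plan match {
  case Filter(outer, Filter(inner, child)) =>
    combineFilters(Filter(s"($inner) AND ($outer)", child))
  case other => other
}
```

For example, combineFilters(Filter("false", Filter("a > 1", Relation("t")))) yields a single Filter whose condition is the composite "(a > 1) AND (false)", rather than a bare Literal(false) at the top of the condition tree.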

But the PruneFilters rule does not substitute an empty relation when the filter condition is composite; it is coded to handle only a single predicate.
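To show why a single-predicate match misses the composite case, here is a hypothetical sketch with an invented mini expression type (Expr, Lit, And, Attr are not Spark classes): a matcher that only inspects the top-level condition fails on And(..., Literal(false)), while splitting the conjuncts first catches it.

```scala
// Toy expression model -- NOT Catalyst's Expression hierarchy.
sealed trait Expr
case class Lit(value: Option[Boolean]) extends Expr // Some(false) or None (null)
case class And(left: Expr, right: Expr) extends Expr
case class Attr(name: String) extends Expr

// PruneFilters-style check that only handles a single, top-level predicate.
def isStaticallyFalseNaive(cond: Expr): Boolean = cond match {
  case Lit(Some(false)) => true
  case Lit(None)        => true // a null predicate filters out every row
  case _                => false
}

// Splitting the conjunction first also catches composites
// produced by CombineFilters.
def splitConjuncts(cond: Expr): Seq[Expr] = cond match {
  case And(l, r) => splitConjuncts(l) ++ splitConjuncts(r)
  case other     => Seq(other)
}

def isStaticallyFalse(cond: Expr): Boolean =
  splitConjuncts(cond).exists(isStaticallyFalseNaive)
```

With the combined condition And(Attr("a"), Lit(Some(false))), the naive check returns false and the filter survives, whereas the conjunct-splitting check recognizes it as statically false.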

The existing test passes spuriously because it exercises PushPredicateThroughNonJoin in isolation, which does not combine filters, whereas the rule actually in effect also applies CombineFilters.

In fact, I believe all the other tests that exercise PushPredicateThroughNonJoin or PushPredicateThroughJoin individually should be corrected (perhaps to use the PushDownPredicates rule instead) and re-tested.

I will add a bug test and open a PR.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org