Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/11/16 09:08:07 UTC

[GitHub] [spark] viirya commented on a diff in pull request #38511: [SPARK-41017][SQL] Support column pruning with multiple nondeterministic Filters

viirya commented on code in PR #38511:
URL: https://github.com/apache/spark/pull/38511#discussion_r1023711232


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala:
##########
@@ -85,15 +72,25 @@ object PhysicalOperation extends AliasHelper with PredicateHelper {
         // projects. We need to meet the following conditions to do so:
         //   1) no Project collected so far or the collected Projects are all deterministic
         //   2) the collected filters and this filter are all deterministic, or this is the
-        //      first collected filter.
+        //      first collected filter. This condition can be relaxed if `canKeepMultipleFilters` is
+        //      true.
         //   3) this filter does not repeat any expensive expressions from the collected
         //      projects.
-        val canIncludeThisFilter = fields.forall(_.forall(_.deterministic)) && {
-          filters.isEmpty || (filters.forall(_.deterministic) && condition.deterministic)
-        } && canCollapseExpressions(Seq(condition), aliases, alwaysInline)
-        if (canIncludeThisFilter) {
-          val replaced = replaceAlias(condition, aliases)
-          (fields, filters ++ splitConjunctivePredicates(replaced), other, aliases)
+        val canPushFilterThroughProject = fields.forall(_.forall(_.deterministic)) &&
+          canCollapseExpressions(Seq(condition), aliases, alwaysInline)
+        if (canPushFilterThroughProject) {
+          val canIncludeThisFilter = filters.isEmpty || {
+            filters.length == 1 && filters.head.forall(_.deterministic) && condition.deterministic
+          }

Review Comment:
   Previously this was `filters.forall(_.deterministic)`. Why is it relaxed here too? I think it is not under the `canKeepMultipleFilters` condition below.
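
   To make the comparison concrete, here is a minimal sketch of the two checks side by side. It is not the Catalyst code: `Expr` is a hypothetical stand-in for `Expression`, and it assumes that `filters` holds one group of predicates per collected Filter in the new code, which the `filters.head.forall(...)` call suggests but the hunk does not show.

```scala
// Hypothetical stand-in for Catalyst's Expression; only the flag matters here.
final case class Expr(sql: String, deterministic: Boolean)

object FilterCheckSketch {
  // Old check: `filters` is a flat list of the predicates collected so far.
  def oldCanInclude(filters: Seq[Expr], condition: Expr): Boolean =
    filters.isEmpty || (filters.forall(_.deterministic) && condition.deterministic)

  // New check as written in the hunk. Assumption: one inner Seq per collected Filter.
  def newCanInclude(filters: Seq[Seq[Expr]], condition: Expr): Boolean =
    filters.isEmpty || {
      filters.length == 1 && filters.head.forall(_.deterministic) && condition.deterministic
    }

  def main(args: Array[String]): Unit = {
    val det = Expr("a > 1", deterministic = true)
    val nonDet = Expr("rand() > 0.5", deterministic = false)

    // One deterministic group collected: both checks accept a deterministic condition.
    println(oldCanInclude(Seq(det), det))
    println(newCanInclude(Seq(Seq(det)), det))

    // A nondeterministic predicate already collected: both checks reject.
    println(oldCanInclude(Seq(nonDet), det))
    println(newCanInclude(Seq(Seq(nonDet)), det))

    // Two groups collected: the new check only inspects the head and requires length == 1.
    println(newCanInclude(Seq(Seq(det), Seq(det)), det))
  }
}
```

   In this toy model the two checks agree whenever exactly one group has been collected. Whether more than one group can exist at this point depends on how the rest of the PR merges filters, which is outside this hunk.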



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org