Posted to reviews@spark.apache.org by "wangyum (via GitHub)" <gi...@apache.org> on 2023/05/10 01:57:15 UTC

[GitHub] [spark] wangyum commented on a diff in pull request #41088: [SPARK-43402][SQL] FileSourceScanExec supports push down data filter with scalar subquery

wangyum commented on code in PR #41088:
URL: https://github.com/apache/spark/pull/41088#discussion_r1189280141


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala:
##########
@@ -543,7 +561,7 @@ case class FileSourceScanExec(
         dataSchema = relation.dataSchema,
         partitionSchema = relation.partitionSchema,
         requiredSchema = requiredSchema,
-        filters = pushedDownFilters,
+        filters = dynamicallyPushedDownFilters,

Review Comment:
   Could we split `dynamicallyPushedDownFilters` into `pushedDownFilters` and `pushedDownRuntimeFilters`, and combine the two here? Something like this: https://github.com/apache/spark/pull/36128/files#diff-089285f1484c1598cb2839b86b6a9e65b98ab5b30462aedc210fe4bbf44cae78R374-R455
   The reason is that `pushedDownFilters` is also used in [`metadata`](https://github.com/apache/spark/blob/abd841bdcabc64d04e0743b0c16443cf0bb66558/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L416), while `pushedDownRuntimeFilters` can only be used here.
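
To make the suggested split concrete, here is a minimal, self-contained sketch. It is not the code from either PR. `Filter`, `EqualTo`, and `GreaterThan` are simplified stand-ins for the data source filter classes, and `SubqueryPredicate`, `ScanSketch`, and `readerFilters` are hypothetical names used only for illustration. It also assumes each scalar subquery predicate is an equality. The point is that `metadata` sees only the static filters, while the runtime filters are resolved and appended only where the reader is built.

```scala
// Simplified stand-ins for data source filters (assumption: not Spark's real classes).
sealed trait Filter
final case class EqualTo(attribute: String, value: Any) extends Filter
final case class GreaterThan(attribute: String, value: Any) extends Filter

// Hypothetical: an equality predicate whose literal comes from a scalar subquery,
// so the value is unknown until the subquery has been executed.
final case class SubqueryPredicate(attribute: String, result: () => Option[Any])

// Hypothetical scan node that keeps static and runtime filters apart.
final class ScanSketch(
    staticPredicates: Seq[Filter],
    subqueryPredicates: Seq[SubqueryPredicate]) {

  // Known at planning time, so safe to expose in metadata / explain output.
  lazy val pushedDownFilters: Seq[Filter] = staticPredicates

  // Resolved on demand; predicates whose subquery yields no value are skipped.
  def pushedDownRuntimeFilters: Seq[Filter] =
    subqueryPredicates.flatMap(p => p.result().map(v => EqualTo(p.attribute, v)))

  // Only the static filters appear here.
  def metadata: Map[String, String] =
    Map("PushedFilters" -> pushedDownFilters.mkString("[", ", ", "]"))

  // The combination is built only at the point where the reader is created.
  def readerFilters: Seq[Filter] = pushedDownFilters ++ pushedDownRuntimeFilters
}
```

With this shape, the `filters = ...` argument in the hunk above would receive the equivalent of `readerFilters`, and the metadata string would stay the same whether or not the subqueries have run.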
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

