Posted to reviews@spark.apache.org by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/08/03 14:19:14 UTC

[GitHub] [spark] cloud-fan commented on a diff in pull request #42318: [SPARK-44655][SQL] Make the code cleaner about static and dynamic data/partition filters

cloud-fan commented on code in PR #42318:
URL: https://github.com/apache/spark/pull/42318#discussion_r1283276807


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala:
##########
@@ -249,12 +249,18 @@ trait FileSourceScanLike extends DataSourceScanExec {
   private def isDynamicPruningFilter(e: Expression): Boolean =
     e.exists(_.isInstanceOf[PlanExpression[_]])
 
+
+  // This field will be accessed during planning (e.g., `outputPartitioning` relies on it), and can
+  // only use static filters.
   @transient lazy val selectedPartitions: Array[PartitionDirectory] = {

Review Comment:
   This is a public field of `FileSourceScanLike`, and some external Spark extensions may access it during planning. So I made it safe to evaluate during planning, rather than trying to ensure that it is only evaluated after planning.
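
   To illustrate the idea, here is a minimal, self-contained sketch. It does not use Spark's real classes: `Filter`, `StaticFilter`, `DynamicFilter`, and `ScanSketch` are hypothetical stand-ins, and partitions are modeled as plain strings. The point is only that the planning-time field applies static filters, while anything that needs a subquery result is deferred to a separate lazy field.

```scala
// Hypothetical stand-ins for illustration only; these are not Spark APIs.
sealed trait Filter
// A filter whose result is known at planning time.
final case class StaticFilter(keep: String => Boolean) extends Filter
// A filter whose allowed values come from a subquery that only runs at execution time.
final case class DynamicFilter(allowed: () => Set[String]) extends Filter

final class ScanSketch(allPartitions: Seq[String], filters: Seq[Filter]) {
  // Safe to access during planning: only static filters are evaluated here.
  lazy val selectedPartitions: Seq[String] = {
    val staticPreds = filters.collect { case StaticFilter(keep) => keep }
    allPartitions.filter(p => staticPreds.forall(keep => keep(p)))
  }

  // Intended for execution time: evaluates the dynamic filters on top of the
  // statically selected partitions.
  lazy val dynamicallySelectedPartitions: Seq[String] = {
    val allowedSets = filters.collect { case DynamicFilter(allowed) => allowed() }
    selectedPartitions.filter(p => allowedSets.forall(_.contains(p)))
  }
}
```

   With this split, a caller that reads `selectedPartitions` during planning never triggers the dynamic filter's subquery; that only happens when `dynamicallySelectedPartitions` is first accessed.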



