Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/03/07 18:34:49 UTC

[GitHub] [spark] LuciferYang commented on a change in pull request #35669: [SPARK-38041][SQL] DataFilter pushed down with PartitionFilter

LuciferYang commented on a change in pull request #35669:
URL: https://github.com/apache/spark/pull/35669#discussion_r820991651



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala
##########
@@ -630,4 +631,35 @@ object PartitioningUtils extends SQLConfHelper{
     case (DoubleType, LongType) | (LongType, DoubleType) => StringType
     case (t1, t2) => TypeCoercion.findWiderTypeForTwo(t1, t2).getOrElse(StringType)
   }
+
+  /**
+   * Evaluating partition filter for datasource scan at runtime.
+   * This is mainly used to evaluate partition conditions during task runtime and
+   * complete further pruning of push-down dataFilter, which can reduce datasource scan IO.
+   * @param partitionSchema schema of partition.
+   * @param partitionValues partition values.
+   * @param partitionFilter partition filter.
+   * @return the result of evaluate result.
+   */
+  def evaluatePartitionFilter(
+      partitionSchema: StructType,
+      partitionValues: Option[InternalRow],
+      partitionFilter: sources.Filter): Boolean = {
+    val predicate =
+      StructFilters.filterToExpression(partitionFilter, StructFilters.toRef(partitionSchema))
+    if (predicate.isDefined) {
+      val boundPredicate = Predicate.createInterpreted(predicate.get.transform {

Review comment:
       Maybe we can extract lines 651 ~ 655 into a new function and then return
   ```scala
   ...
   predicate.isDefined && partitionValues.exists(newFunc.eval)
   ``` 
   ?
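
   To make the suggested shape concrete, here is a minimal, self-contained sketch. It uses stand-in types rather than Spark's classes: `Row`, `BoundPredicate`, `bind`, and `evaluate` are hypothetical names chosen for illustration, with `bind` playing the role of the extracted `newFunc`. It is not the PR's actual code.

   ```scala
   // Stand-in for a row of partition values; NOT Spark's InternalRow.
   final case class Row(values: Map[String, Any])

   // Stand-in for a bound predicate that can be evaluated against a row.
   trait BoundPredicate {
     def eval(row: Row): Boolean
   }

   object PartitionFilterSketch {
     // Hypothetical helper standing in for the extracted lines:
     // wraps an unbound condition so it can be evaluated per row.
     def bind(condition: Row => Boolean): BoundPredicate = new BoundPredicate {
       def eval(row: Row): Boolean = condition(row)
     }

     // Same shape as the suggested return expression: false when the filter
     // could not be converted or when there are no partition values.
     def evaluate(
         predicate: Option[Row => Boolean],
         partitionValues: Option[Row]): Boolean =
       predicate.isDefined && partitionValues.exists(bind(predicate.get).eval)
   }
   ```

   In this sketch a missing predicate yields `false`, so the caller would need to confirm that this matches the intended behavior when the filter cannot be converted. An equivalent form that avoids `.get` would be `predicate.exists(p => partitionValues.exists(bind(p).eval))`.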




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org