Posted to commits@spark.apache.org by rx...@apache.org on 2015/04/03 01:15:36 UTC
spark git commit: [Minor] [SQL] Follow-up of PR #5210

Repository: spark
Updated Branches:
  refs/heads/master 251698fb7 -> d3944b6f2


[Minor] [SQL] Follow-up of PR #5210

This PR addresses rxin's comments in PR #5210.

Author: Cheng Lian <li...@databricks.com>

Closes #5219 from liancheng/spark-6554-followup and squashes the following commits:

41f3a09 [Cheng Lian] Addresses comments in #5210


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d3944b6f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d3944b6f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d3944b6f

Branch: refs/heads/master
Commit: d3944b6f2aeb36629bf89207629cc5e55d327241
Parents: 251698f
Author: Cheng Lian <li...@databricks.com>
Authored: Thu Apr 2 16:15:34 2015 -0700
Committer: Reynold Xin <rx...@databricks.com>
Committed: Thu Apr 2 16:15:34 2015 -0700

----------------------------------------------------------------------
 .../scala/org/apache/spark/sql/parquet/newParquet.scala     | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/d3944b6f/sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala b/sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala
index e125314..583bac4 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala
@@ -434,17 +434,18 @@ private[sql] case class ParquetRelation2(
       FileInputFormat.setInputPaths(job, selectedFiles.map(_.getPath): _*)
     }
 
-    // Push down filters when possible. Notice that not all filters can be converted to Parquet
-    // filter predicate. Here we try to convert each individual predicate and only collect those
-    // convertible ones.
+    // Try to push down filters when filter push-down is enabled.
     if (sqlContext.conf.parquetFilterPushDown) {
+      val partitionColNames = partitionColumns.map(_.name).toSet
       predicates
         // Don't push down predicates which reference partition columns
         .filter { pred =>
-          val partitionColNames = partitionColumns.map(_.name).toSet
           val referencedColNames = pred.references.map(_.name).toSet
           referencedColNames.intersect(partitionColNames).isEmpty
         }
+        // Collects all converted Parquet filter predicates. Notice that not all predicates can be
+        // converted (`ParquetFilters.createFilter` returns an `Option`). That's why a `flatMap`
+        // is used here.
         .flatMap(ParquetFilters.createFilter)
         .reduceOption(FilterApi.and)
         .foreach(ParquetInputFormat.setFilterPredicate(jobConf, _))
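
The chain in the hunk above (drop predicates that touch partition columns, convert what can be converted, AND the results together) can be sketched without Spark or Parquet on the classpath. This is only an illustration under stated assumptions: `Pred`, `PushDownSketch`, `createFilter`, `and`, and `pushDown` are hypothetical stand-ins for this note, not the real `ParquetFilters.createFilter` or `FilterApi.and`, and filters are modelled as plain strings.

```scala
// Hypothetical stand-in for a Catalyst predicate; not a Spark class.
final case class Pred(column: String, expr: String, convertible: Boolean)

object PushDownSketch {
  // Mimics a converter that returns None for predicates it cannot translate.
  def createFilter(p: Pred): Option[String] =
    if (p.convertible) Some(p.expr) else None

  // Combines two converted filters; plays the role of a logical AND.
  def and(left: String, right: String): String = s"($left AND $right)"

  def pushDown(predicates: Seq[Pred], partitionCols: Set[String]): Option[String] =
    predicates
      // Skip predicates that reference a partition column.
      .filter(p => !partitionCols.contains(p.column))
      // flatMap keeps only the Some(...) results and unwraps them.
      .flatMap(p => createFilter(p))
      // reduceOption yields None when nothing is left to combine.
      .reduceOption(and)

  def main(args: Array[String]): Unit = {
    val preds = Seq(
      Pred("a", "a > 1", convertible = true),
      Pred("part", "part = 3", convertible = true),
      Pred("b", "udf(b)", convertible = false),
      Pred("c", "c < 5", convertible = true))
    println(pushDown(preds, Set("part"))) // prints Some((a > 1 AND c < 5))
    println(pushDown(Nil, Set("part")))   // prints None
  }
}
```

Because `reduceOption` returns an `Option`, the trailing `foreach` in the real code sets a filter predicate only when at least one predicate survived both steps.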

