You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/02/21 11:07:38 UTC

[GitHub] nandorKollar commented on a change in pull request #23855: [SPARK-26930][SQL] Tests in ParquetFilterSuite don't verify filter class

nandorKollar commented on a change in pull request #23855: [SPARK-26930][SQL] Tests in ParquetFilterSuite don't verify filter class
URL: https://github.com/apache/spark/pull/23855#discussion_r258880928
 
 

 ##########
 File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala
 ##########
 @@ -114,12 +114,24 @@ class ParquetFilterSuite extends QueryTest with ParquetTest with SharedSQLContex
             new SparkToParquetSchemaConverter(conf).convert(df.schema), pred)
           assert(maybeFilter.isDefined, s"Couldn't generate filter predicate for $pred")
           // Doesn't bother checking type parameters here (e.g. `Eq[Integer]`)
-          maybeFilter.exists(_.getClass === filterClass)
+          assert(flattenPredicateTree(maybeFilter.get).exists(_.getClass === filterClass))
         }
         checker(stripSparkFilter(query), expected)
     }
   }
 
+  private def flattenPredicateTree(filterPredicate: FilterPredicate): Seq[Object] = {
 
 Review comment:
   Ok, we can do this too, I like this idea. Actually after flattening the tree, it is easy to check for the entire predicate being pushed to Parquet: the tests should pass all predicates as a flattened tree as expected value in a sequence, and compare this with the flattened actual predicate tree. Most of the cases we are interested in only one predicate, so null filter rule just produces some noise in this case, however in cases like [this](https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala#L182) it might provide some more value to check for the entire predicate tree, not just for the presence of 'or'.
   
   Anyway, this requires a lot more change though, I'm fine with your proposed solution!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org