Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/03/31 21:19:00 UTC

[GitHub] [spark] dbtsai commented on issue #27155: [SPARK-17636][SPARK-25557][SQL] Parquet and ORC predicate pushdown in nested fields

URL: https://github.com/apache/spark/pull/27155#issuecomment-606881584
 
 
   In this PR, you also use `dots` to build the source filter API, but this doesn't handle column names that themselves contain `dots`, since they are not quoted properly. As we have a proper, proven parser for multipart identifiers that is used everywhere, it's much easier and safer to use it than raw `dots` in the source filter APIs.
   
   The implementation in each data source can be different. I chose to use the key as a string containing `dots` in Parquet for simplicity, but you can always reconstruct the nested schema instead.
   
   ```scala
     private def translateLeafNodeFilter(predicate: Expression): Option[Filter] = {
       // Recursively try to find an attribute name from the top level that can be pushed down.
       def attrName(e: Expression): Option[String] = e match {
         case a: Attribute if !a.dataType.isInstanceOf[StructType] =>
           Some(a.name)
         case s: GetStructField if !s.childSchema(s.ordinal).dataType.isInstanceOf[StructType] =>
           attrName(s.child).map(_ + s".${s.childSchema(s.ordinal).name}")
         case _ =>
           None
       }
   ```
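
   To make the dot-joining (and the ambiguity it creates) concrete, here is a minimal self-contained sketch. It does not use Spark's real `Expression`/`GetStructField` classes; `Attr` and `Field` are hypothetical stand-ins that mirror the recursion in `attrName` above.

   ```scala
   // Toy expression tree mirroring the attrName recursion above
   // (hypothetical types, not Spark's actual Expression hierarchy).
   sealed trait Expr
   case class Attr(name: String) extends Expr               // top-level column
   case class Field(child: Expr, name: String) extends Expr // nested struct field

   def attrName(e: Expr): Option[String] = e match {
     case Attr(n)         => Some(n)
     case Field(child, n) => attrName(child).map(_ + "." + n)
   }

   // Nested field a.b.c becomes the dotted string "a.b.c".
   assert(attrName(Field(Field(Attr("a"), "b"), "c")) == Some("a.b.c"))

   // Ambiguity: a top-level column literally named "a.b" produces the same
   // string as nested field b under struct column a -- which is why quoting
   // (or a parsed multipart identifier) is needed to tell them apart.
   assert(attrName(Attr("a.b")) == attrName(Field(Attr("a"), "b")))
   ```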

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
