Posted to reviews@spark.apache.org by ghoto <gi...@git.apache.org> on 2018/05/16 00:24:02 UTC

[GitHub] spark pull request #21086: [SPARK-24002] [SQL] Task not serializable caused ...

Github user ghoto commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21086#discussion_r188473831
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala ---
    @@ -351,12 +338,26 @@ class ParquetFileFormat
         val timestampConversion: Boolean =
           sparkSession.sessionState.conf.isParquetINT96TimestampConversion
         val capacity = sqlConf.parquetVectorizedReaderBatchSize
    +    val enableParquetFilterPushDown: Boolean =
    +      sparkSession.sessionState.conf.parquetFilterPushDown
         // Whole stage codegen (PhysicalRDD) is able to deal with batches directly
         val returningBatch = supportBatch(sparkSession, resultSchema)
     
         (file: PartitionedFile) => {
           assert(file.partitionValues.numFields == partitionSchema.size)
     
    +      // Try to push down filters when filter push-down is enabled.
    --- End diff --
    
    So this code is the same as before. How can it solve the bug described at the top of the conversation?
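
For context on why the hoisted val in the diff matters: Spark serializes the `(file: PartitionedFile) => { ... }` closure and ships it to executors, so anything the closure body references gets captured and serialized with it. Reading the flag into a local `val enableParquetFilterPushDown` before the closure means the closure captures only a `Boolean` rather than the non-serializable `SparkSession`/`SessionState`. Below is a minimal, self-contained sketch of that capture pattern; `FakeSessionState`, `ClosureCaptureSketch`, and `trySerialize` are hypothetical stand-ins for illustration, not Spark code:

    import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

    // Hypothetical stand-in for SparkSession/SessionState: it holds the
    // config flag but is deliberately NOT Serializable.
    class FakeSessionState {
      val parquetFilterPushDown: Boolean = true
    }

    object ClosureCaptureSketch {

      // Serialize a closure the same way Spark serializes task closures
      // (plain Java serialization) and report whether it worked.
      def trySerialize(label: String, f: String => Boolean): Unit = {
        val out = new ObjectOutputStream(new ByteArrayOutputStream())
        try {
          out.writeObject(f)
          println(s"$label: serialized OK")
        } catch {
          case e: NotSerializableException => println(s"$label: FAILED ($e)")
        } finally {
          out.close()
        }
      }

      def main(args: Array[String]): Unit = {
        val state = new FakeSessionState

        // Bad: the closure body reads the flag through `state`, so the
        // whole non-serializable object is captured with the closure.
        val capturesSession: String => Boolean =
          _ => state.parquetFilterPushDown

        // Good: hoist the flag into a local val first, as the diff above
        // does with enableParquetFilterPushDown; the closure then
        // captures only a Boolean.
        val enablePushDown = state.parquetFilterPushDown
        val capturesBoolean: String => Boolean =
          _ => enablePushDown

        trySerialize("capturesSession", capturesSession) // NotSerializableException
        trySerialize("capturesBoolean", capturesBoolean) // serializes fine
      }
    }

The same reasoning applies to any per-file closure in a `FileFormat` reader: extract primitives from session state on the driver, and let the closure capture only those.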


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org