Posted to reviews@spark.apache.org by ghoto <gi...@git.apache.org> on 2018/05/16 00:24:02 UTC
[GitHub] spark pull request #21086: [SPARK-24002] [SQL] Task not serializable caused ...
Github user ghoto commented on a diff in the pull request:
https://github.com/apache/spark/pull/21086#discussion_r188473831
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala ---
@@ -351,12 +338,26 @@ class ParquetFileFormat
       val timestampConversion: Boolean =
         sparkSession.sessionState.conf.isParquetINT96TimestampConversion
       val capacity = sqlConf.parquetVectorizedReaderBatchSize
+      val enableParquetFilterPushDown: Boolean =
+        sparkSession.sessionState.conf.parquetFilterPushDown
       // Whole stage codegen (PhysicalRDD) is able to deal with batches directly
       val returningBatch = supportBatch(sparkSession, resultSchema)
       (file: PartitionedFile) => {
         assert(file.partitionValues.numFields == partitionSchema.size)
+        // Try to push down filters when filter push-down is enabled.
--- End diff --
So this code is the same as before. How does this solve the bug described at the top of the conversation?
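For context on the pattern the diff applies: reading the config flag into a local `val` *before* constructing the per-file closure means the closure captures only a serializable `Boolean`, instead of reaching through `sparkSession.sessionState` (which is not serializable) at task execution time. A minimal, self-contained sketch of that idea follows; `NonSerializableConf` and `buildReader` are illustrative names, not Spark's API.

```scala
// Sketch only: demonstrates hoisting a value out of non-serializable state
// so a closure shipped to executors captures just a plain Boolean.
object PushDownSketch {
  // Stand-in for session state: note it does NOT extend Serializable.
  class NonSerializableConf {
    val filterPushDown: Boolean = true
  }

  def buildReader(conf: NonSerializableConf): String => Boolean = {
    // Hoist the flag into a local val here, on the driver side.
    val enableFilterPushDown = conf.filterPushDown
    // The returned closure references only `enableFilterPushDown`,
    // a Boolean, so `conf` itself is not captured.
    (file: String) => enableFilterPushDown && file.endsWith(".parquet")
  }

  def main(args: Array[String]): Unit = {
    val reader = buildReader(new NonSerializableConf)
    println(reader("part-0000.parquet")) // true
  }
}
```

Had the closure body read `conf.filterPushDown` directly, `conf` would be pulled into the closure's environment, and serializing the task would fail with `Task not serializable`.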
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org