Posted to issues@spark.apache.org by "Haifeng Chen (Jira)" <ji...@apache.org> on 2020/01/06 03:07:00 UTC
[jira] [Created] (SPARK-30425) FileScan of Data Source V2 doesn't implement Partition Pruning
Haifeng Chen created SPARK-30425:
------------------------------------
Summary: FileScan of Data Source V2 doesn't implement Partition Pruning
Key: SPARK-30425
URL: https://issues.apache.org/jira/browse/SPARK-30425
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.0.0
Reporter: Haifeng Chen
I was trying to understand how Data Source V2 handles partition pruning, but I couldn't find any code in the current Data Source V2 implementation that filters out unnecessary files. For a file data source, the base class FileScan of Data Source V2 should probably handle this in its "partitions" method. However, the current implementation looks like the following:
protected def partitions: Seq[FilePartition] = {
  val selectedPartitions = fileIndex.listFiles(Seq.empty, Seq.empty)
listFiles is called with empty sequences for both of its filter arguments, so no files are filtered out by partition filters.
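
To illustrate the effect described above, here is a minimal, self-contained sketch that does not use Spark at all. PartitionDir, ToyFileIndex and PruningSketch are hypothetical stand-ins invented for this example, and the partition column "year" and the file paths are made up. It only shows why passing an empty filter list returns every partition, and what passing a partition filter would change.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's classes.
final case class PartitionDir(values: Map[String, String], files: Seq[String])

final class ToyFileIndex(partitions: Seq[PartitionDir]) {
  // Keep only the partitions whose values satisfy every predicate.
  // With an empty filter list, forall is vacuously true, so nothing is pruned.
  def listFiles(partitionFilters: Seq[Map[String, String] => Boolean]): Seq[PartitionDir] =
    partitions.filter(p => partitionFilters.forall(f => f(p.values)))
}

object PruningSketch {
  // Assumed layout: one directory per value of the made-up partition column "year".
  val index = new ToyFileIndex(Seq(
    PartitionDir(Map("year" -> "2019"), Seq("year=2019/part-0.parquet")),
    PartitionDir(Map("year" -> "2020"), Seq("year=2020/part-0.parquet"))
  ))

  // The situation in this issue: no filters are passed, so every partition is listed.
  val unpruned: Seq[PartitionDir] = index.listFiles(Seq.empty)

  // With a partition filter passed in, only the matching partition is listed.
  val onlyYear2020: Map[String, String] => Boolean =
    (values: Map[String, String]) => values.get("year").contains("2020")
  val pruned: Seq[PartitionDir] = index.listFiles(Seq(onlyYear2020))
}
```

In this toy model, the fix would amount to passing the query's partition filters through to listFiles instead of an empty sequence. How that should be wired into FileScan is left open here.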