Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2016/07/17 22:21:20 UTC

[jira] [Commented] (SPARK-16596) Refactor DataSourceScanExec to do partition discovery at execution instead of planning time

    [ https://issues.apache.org/jira/browse/SPARK-16596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381553#comment-15381553 ] 

Apache Spark commented on SPARK-16596:
--------------------------------------

User 'ericl' has created a pull request for this issue:
https://github.com/apache/spark/pull/14241

> Refactor DataSourceScanExec to do partition discovery at execution instead of planning time
> -------------------------------------------------------------------------------------------
>
>                 Key: SPARK-16596
>                 URL: https://issues.apache.org/jira/browse/SPARK-16596
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Eric Liang
>            Priority: Minor
>
> Partition discovery is rather expensive, so we should do it at execution time instead of during physical planning. Right now there is not much benefit, since ListingFileCatalog still scans all partitions at planning time anyway, but this can be optimized in the future.
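
To illustrate the idea in the description above, here is a minimal sketch of deferring an expensive listing from plan construction to first execution. It is not Spark code: the class names EagerScan and LazyScan, the list_partitions callback, and the partition strings are all hypothetical, and the actual change to DataSourceScanExec may be structured quite differently.

```python
from functools import cached_property


class EagerScan:
    """Hypothetical node that lists partitions when the plan is built."""

    def __init__(self, list_partitions):
        # The expensive listing runs here, at "planning" time.
        self.partitions = list_partitions()


class LazyScan:
    """Hypothetical node that lists partitions only when first executed."""

    def __init__(self, list_partitions):
        # Only the callback is stored; nothing is listed yet.
        self._list_partitions = list_partitions

    @cached_property
    def partitions(self):
        # Runs on first access, then the result is cached on the instance.
        return self._list_partitions()

    def execute(self):
        return [f"scan {p}" for p in self.partitions]


calls = []


def list_partitions():
    # Stand-in for an expensive file system listing.
    calls.append("listed")
    return ["date=2016-07-16", "date=2016-07-17"]


plan = LazyScan(list_partitions)
planned_calls = len(calls)    # listings performed while building the plan
result = plan.execute()
plan.execute()                # second run reuses the cached listing
executed_calls = len(calls)   # listings performed after two executions
```

With the lazy variant, building the plan triggers no listing, and repeated executions share a single listing. As the description notes, the benefit only materializes once the underlying catalog stops scanning everything up front.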



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org