Posted to issues@spark.apache.org by "Reynold Xin (JIRA)" <ji...@apache.org> on 2016/10/11 00:58:20 UTC

[jira] [Created] (SPARK-17861) Push data source partitions into metastore for catalog tables

Reynold Xin created SPARK-17861:
-----------------------------------

             Summary: Push data source partitions into metastore for catalog tables
                 Key: SPARK-17861
                 URL: https://issues.apache.org/jira/browse/SPARK-17861
             Project: Spark
          Issue Type: Improvement
          Components: SQL
            Reporter: Reynold Xin


Currently, Spark SQL does not store any partition information in the catalog for data source tables, because it was initially designed to work with arbitrary files. This, however, causes a few issues for catalog tables:

1. Listing partitions for a large table (with millions of partitions) can be very slow during cold start.
2. Heterogeneous partition naming schemes are not supported.
3. Partition pruning cannot be pushed down into the metastore.

This ticket tracks the work required to push the tracking of partitions into the metastore.
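
To make the motivation concrete, here is a minimal toy sketch of what tracking partitions in a metastore could buy. It is not Spark's or Hive's API: ToyMetastore, add_partition, list_partitions, the "events" table, and all paths are hypothetical names invented for illustration. The assumption is simply that the catalog stores a mapping from partition values to a location, so a filter on partition columns can be answered from that mapping without listing files, and a partition's location need not follow any particular directory naming scheme.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical toy model for illustration only; not Spark's or Hive's API.
# A partition spec is stored as a sorted tuple of (column, value) pairs.
PartitionSpec = Tuple[Tuple[str, str], ...]


@dataclass
class ToyMetastore:
    # table name -> {partition spec -> storage location}
    partitions: Dict[str, Dict[PartitionSpec, str]] = field(default_factory=dict)

    def add_partition(self, table: str, spec: Dict[str, str], location: str) -> None:
        """Register a partition with an explicit location."""
        key = tuple(sorted(spec.items()))
        self.partitions.setdefault(table, {})[key] = location

    def list_partitions(
        self, table: str, predicate: Callable[[Dict[str, str]], bool]
    ) -> List[str]:
        """Return locations of partitions whose values satisfy the predicate.

        Only the stored metadata is consulted; no file system listing happens.
        """
        return [
            location
            for key, location in self.partitions.get(table, {}).items()
            if predicate(dict(key))
        ]


ms = ToyMetastore()
ms.add_partition("events", {"ds": "2016-10-09"}, "/warehouse/events/ds=2016-10-09")
# This location does not follow the ds=... directory convention.
ms.add_partition("events", {"ds": "2016-10-10"}, "/archive/2016/oct/10")

# Pruning is evaluated against the catalog entries, not against directories.
pruned = ms.list_partitions("events", lambda p: p["ds"] >= "2016-10-10")
print(pruned)
```

In this sketch the filter touches only the partitions recorded in the catalog, which is the property items 1 and 3 above are after, and the second partition shows the kind of non-uniform layout item 2 refers to.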





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
