Posted to issues@drill.apache.org by "Aman Sinha (JIRA)" <ji...@apache.org> on 2015/10/18 22:05:05 UTC
[jira] [Created] (DRILL-3948) Partitioning columns of a Parquet table should be made visible to end user
Aman Sinha created DRILL-3948:
---------------------------------
Summary: Partitioning columns of a Parquet table should be made visible to end user
Key: DRILL-3948
URL: https://issues.apache.org/jira/browse/DRILL-3948
Project: Apache Drill
Issue Type: Improvement
Components: Metadata, Query Planning & Optimization
Affects Versions: 1.2.0
Reporter: Aman Sinha
For Parquet files, Drill can do partition pruning for filter conditions on a column that satisfies the following criterion:
Each Parquet file contains a single value for that column. Drill examines the Parquet metadata for the column's min and max values, and if they are the same, the column is considered a partitioning column.
When CTAS auto-partition is used, this criterion is enforced, but files created through external methods could also satisfy it.
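To make the criterion concrete, here is a minimal sketch of the min/max check in Python. It is not Drill's implementation: the function name, the dictionary layout of the per-file statistics, and the sample file names are all assumptions made for illustration, and it ignores null counts, which a real check would also need to consider.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical layout: column name -> (min, max) taken from one file's metadata.
FileStats = Dict[str, Tuple[Any, Any]]


def candidate_partition_columns(files: Dict[str, FileStats]) -> List[str]:
    """Return the columns whose min equals max in every file."""
    if not files:
        return []
    # Only columns with statistics in every file can qualify.
    common = set.intersection(*(set(stats) for stats in files.values()))
    result = []
    for col in sorted(common):
        if all(
            stats[col][0] is not None and stats[col][0] == stats[col][1]
            for stats in files.values()
        ):
            result.append(col)
    return result


# Made-up statistics for two files (illustrative only).
files = {
    "part_0.parquet": {"year": (2014, 2014), "amount": (1.5, 99.0)},
    "part_1.parquet": {"year": (2015, 2015), "amount": (3.0, 42.0)},
}
print(candidate_partition_columns(files))  # prints ['year']
```

In this sketch, "year" qualifies because each file holds exactly one value for it, while "amount" does not because its min and max differ within a file.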
It is difficult for users to know exactly which columns in a table are candidate partitioning columns. We should provide this information in a user-friendly way, for instance:
- A special 'show partition columns for <table>' command
- In the Explain plan, show the partition columns for the table in the Scan node
More options should be discussed.