Posted to issues@drill.apache.org by "Aman Sinha (JIRA)" <ji...@apache.org> on 2015/10/18 22:05:05 UTC

[jira] [Created] (DRILL-3948) Partitioning columns of a Parquet table should be made visible to end user

Aman Sinha created DRILL-3948:
---------------------------------

             Summary: Partitioning columns of a Parquet table should be made visible to end user
                 Key: DRILL-3948
                 URL: https://issues.apache.org/jira/browse/DRILL-3948
             Project: Apache Drill
          Issue Type: Improvement
          Components: Metadata, Query Planning & Optimization
    Affects Versions: 1.2.0
            Reporter: Aman Sinha


For Parquet files, Drill can perform partition pruning for filter conditions on a column that satisfies the following criterion:
  Each Parquet file contains a single value for that column. Drill examines the Parquet metadata for the column's min and max values; if they are the same, the column is considered a partitioning column.
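
As a rough illustration of the min/max check described above (not Drill's actual implementation), the sketch below assumes the per-file column statistics have already been read from the Parquet footers into plain dictionaries. The function name `candidate_partition_columns` and the input layout are hypothetical, chosen only for this example.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical input shape: one dict per Parquet file,
# mapping column name -> (min value, max value) taken from the file metadata.
FileStats = Dict[str, Tuple[Any, Any]]


def candidate_partition_columns(files: List[FileStats]) -> List[str]:
    """Return columns whose min equals max in every file.

    Assumption: a column with missing statistics (absent from a file's
    dict, or with None as min or max) is never treated as a candidate.
    """
    if not files:
        return []

    # Only columns that have statistics in every file can qualify.
    common = set(files[0])
    for stats in files[1:]:
        common &= set(stats)

    def single_valued(stats: FileStats, col: str) -> bool:
        lo, hi = stats[col]
        return lo is not None and hi is not None and lo == hi

    return sorted(
        col for col in common
        if all(single_valued(stats, col) for stats in files)
    )


# Example: 'year' holds one value per file, 'amount' does not.
example_files = [
    {"year": (2014, 2014), "amount": (1, 90)},
    {"year": (2015, 2015), "amount": (5, 5)},
]
print(candidate_partition_columns(example_files))
```

In this example only `year` qualifies, because `amount` spans a range of values in the first file even though it happens to be single-valued in the second.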

  When CTAS auto-partition is used, this criterion is enforced, but files created through external methods could also satisfy it.

It is difficult for users to know exactly which columns are candidate partitioning columns in a table.  We should provide this information in a user-friendly way, for instance:
  - a special 'show partition columns for <table>' command
  - in the Explain plan, show the partition columns for the table in the Scan node
More options should be discussed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)