Posted to issues@drill.apache.org by "benj (Jira)" <ji...@apache.org> on 2019/10/03 12:20:00 UTC

[jira] [Created] (DRILL-7395) Partial Partition By to CTAS Parquet files

benj created DRILL-7395:
---------------------------

             Summary: Partial Partition By to CTAS Parquet files
                 Key: DRILL-7395
                 URL: https://issues.apache.org/jira/browse/DRILL-7395
             Project: Apache Drill
          Issue Type: Improvement
          Components: Storage - Parquet
    Affects Versions: 1.16.0
            Reporter: benj


For a data set in which a few values dominate while most others occur only rarely, it would be useful to be able to create Parquet files with a partial _PARTITION BY_.

It would then be possible to group all the rare values together without being affected by the overly common ones.
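
To make the idea concrete, here is a minimal sketch in Python of one possible reading of the request. It is not Drill code and not an existing Drill feature. It assumes that "partial" means each frequent value keeps its own partition while all rare values share a single one. The function name partial_partition_keys, the min_count threshold, the "__OTHER__" label, and the sample data are all hypothetical.

```python
from collections import Counter


def partial_partition_keys(values, min_count, other="__OTHER__"):
    """Return one partition key per input value.

    Values occurring at least `min_count` times keep their own key;
    all rarer values share the single `other` key (an assumed convention).
    """
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]


# Hypothetical sample column: "fr" and "us" prevail, the rest are rare.
countries = ["fr"] * 5 + ["us"] * 4 + ["is", "mt", "lu"]

keys = partial_partition_keys(countries, min_count=3)

# Number of rows that would land in each partition.
partitions = Counter(keys)
```

With this sample, the two frequent values each get their own partition and the three rare values end up together in one shared partition, instead of producing three tiny ones.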

This is not exactly the same thing, but some databases offer partial indexes (https://www.postgresql.org/docs/current/indexes-partial.html).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)