Posted to issues@drill.apache.org by "Adam Gilmore (JIRA)" <ji...@apache.org> on 2015/01/16 10:26:34 UTC

[jira] [Updated] (DRILL-2022) Parquet engine falls back to "new" Parquet reader unnecessarily

     [ https://issues.apache.org/jira/browse/DRILL-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Gilmore updated DRILL-2022:
--------------------------------
    Attachment: DRILL-2022.1.patch.txt

I've attached a possible patch to resolve this.  It checks whether the query is a star query.  If so, it falls back to the old method of detecting whether the file is complex; otherwise, it checks each projected column to see whether it is a primitive type.
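
The decision described above can be sketched roughly as follows. This is only an illustration of the logic, not the code in the attached patch: the class, enum, and method names are hypothetical, the schema is modelled as a plain map rather than Drill's or Parquet's real schema types, and a star query is assumed to appear as a literal "*" in the projection list.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from Drill or Parquet.
class ReaderSelectionSketch {

    /** Simplified stand-in for whether a column is a primitive or a complex type. */
    enum ColumnKind { PRIMITIVE, COMPLEX }

    /**
     * Returns true when the faster reader can be used.
     *
     * @param schema    column name to kind, for every column in the file
     * @param projected column names selected by the query; "*" marks a star query
     */
    static boolean canUseFastReader(Map<String, ColumnKind> schema, List<String> projected) {
        if (projected.contains("*")) {
            // Star query: every column is read, so apply the old file-level check.
            return !schema.containsValue(ColumnKind.COMPLEX);
        }
        // Otherwise only the projected columns matter.
        for (String name : projected) {
            // Assumption: a column missing from the file is treated as non-complex.
            if (schema.get(name) == ColumnKind.COMPLEX) {
                return false;
            }
        }
        return true;
    }
}
```

With a file containing a primitive column `id` and a complex column `tags`, projecting only `id` would keep the faster reader, while projecting `tags` or using `*` would fall back to the "new" reader.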

> Parquet engine falls back to "new" Parquet reader unnecessarily
> ---------------------------------------------------------------
>
>                 Key: DRILL-2022
>                 URL: https://issues.apache.org/jira/browse/DRILL-2022
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Parquet
>    Affects Versions: 0.8.0
>            Reporter: Adam Gilmore
>            Assignee: Parth Chandra
>         Attachments: DRILL-2022.1.patch.txt
>
>
> The Parquet engine falls back to the "new" Parquet reader whenever a Parquet file that is "complex" (i.e. not purely primitive types) is found.
> The engine should still use the faster reader when all the projected columns are primitive types and only fall back to the other reader when columns containing complex types are selected.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)