Posted to issues@drill.apache.org by "Jacques Nadeau (JIRA)" <ji...@apache.org> on 2014/12/29 00:31:20 UTC

[jira] [Updated] (DRILL-1858) Parquet reader should only explicitly fill in data for a column requested but not in the file if there are no valid columns found

     [ https://issues.apache.org/jira/browse/DRILL-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau updated DRILL-1858:
----------------------------------
    Fix Version/s: Future

> Parquet reader should only explicitly fill in data for a column requested but not in the file if there are no valid columns found
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-1858
>                 URL: https://issues.apache.org/jira/browse/DRILL-1858
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Jason Altekruse
>             Fix For: Future
>
>
> If a query requests columns that do not appear in a particular Parquet file (users may have a directory full of files that share some columns but not others), the reader does not need to create vectors for those columns in most cases. They can be materialized later, as vectors filled with nulls, when they are referenced in other parts of the query, such as a filter or join condition. The reader currently always fills vectors for these missing columns, which only creates extra payload to ship around until the vectors are actually referenced.
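
A minimal sketch of the proposed behavior, not Drill's actual reader code. All names here (LazyNullColumnsSketch, readBatch, column, nullVector) are hypothetical, and plain Java lists stand in for value vectors. The reader returns only the requested columns it finds in the file. Following the issue title, it fills a null column eagerly only when none of the requested columns exist, on the assumption that the batch still needs something to carry the row count. A downstream operator that references a missing column gets a null-filled vector created on demand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LazyNullColumnsSketch {

    // Hypothetical reader step: keep only the requested columns that exist in
    // the file. If none exist, fill the first requested column with nulls so
    // the batch is not empty (an assumption drawn from the issue title).
    static Map<String, List<Integer>> readBatch(Map<String, List<Integer>> fileColumns,
                                                List<String> requested,
                                                int rowCount) {
        Map<String, List<Integer>> batch = new LinkedHashMap<>();
        for (String name : requested) {
            List<Integer> data = fileColumns.get(name);
            if (data != null) {
                batch.put(name, data);
            }
        }
        if (batch.isEmpty() && !requested.isEmpty()) {
            batch.put(requested.get(0), nullVector(rowCount));
        }
        return batch;
    }

    // Hypothetical downstream access (for example from a filter or join):
    // a missing column is materialized as nulls only when it is referenced.
    static List<Integer> column(Map<String, List<Integer>> batch, String name, int rowCount) {
        return batch.computeIfAbsent(name, k -> nullVector(rowCount));
    }

    // Builds a mutable list of rowCount nulls, standing in for a null-filled vector.
    static List<Integer> nullVector(int rowCount) {
        return new ArrayList<>(Collections.<Integer>nCopies(rowCount, null));
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> file = new LinkedHashMap<>();
        file.put("a", Arrays.asList(1, 2, 3));
        file.put("b", Arrays.asList(4, 5, 6));

        // Column "c" is requested but absent from this file.
        Map<String, List<Integer>> batch = readBatch(file, Arrays.asList("a", "c"), 3);
        System.out.println("columns shipped by reader: " + batch.keySet());

        // Only now, when "c" is referenced, is the null vector created.
        System.out.println("c when referenced: " + column(batch, "c", 3));
    }
}
```

The trade-off in this sketch is that every consumer of a batch has to tolerate absent columns and know the row count, which is why the lookup goes through a single helper rather than reading the map directly.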



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)