Posted to issues@hive.apache.org by "Xuefu Zhang (JIRA)" <ji...@apache.org> on 2016/05/27 05:00:17 UTC

[jira] [Commented] (HIVE-13873) Column pruning for nested fields

    [ https://issues.apache.org/jira/browse/HIVE-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303487#comment-15303487 ] 

Xuefu Zhang commented on HIVE-13873:
------------------------------------

FYI, [~Ferd]

> Column pruning for nested fields
> --------------------------------
>
>                 Key: HIVE-13873
>                 URL: https://issues.apache.org/jira/browse/HIVE-13873
>             Project: Hive
>          Issue Type: New Feature
>          Components: Logical Optimizer
>            Reporter: Xuefu Zhang
>
> Some columnar file formats, such as Parquet, also store the fields of a struct type column by column, using the encoding described in the Google Dremel paper. It is very common in big data for data to be stored in structs while queries need only a subset of the fields in those structs. However, Hive presently still needs to read the whole struct regardless of whether all of its fields are selected. Therefore, pruning unwanted sub-fields in struct or nested fields at file-reading time would be a big performance boost for such scenarios.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)