
Posted to dev@hive.apache.org by "Edward Capriolo (JIRA)" <ji...@apache.org> on 2013/12/06 20:49:38 UTC

[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive

    [ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841601#comment-13841601 ] 

Edward Capriolo commented on HIVE-5783:
---------------------------------------

Why does support need to be built directly into the semantic analyzer? I think input formats and SerDes should be decoupled from the Hive code as much as possible. Hard-coding like this makes it hard to evolve support. I *think* you should only be adding the libs as a dependency to the pom files and building some tests.

> Native Parquet Support in Hive
> ------------------------------
>
>                 Key: HIVE-5783
>                 URL: https://issues.apache.org/jira/browse/HIVE-5783
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Justin Coffey
>            Assignee: Justin Coffey
>            Priority: Minor
>             Fix For: 0.11.0
>
>         Attachments: hive-0.11-parquet.patch
>
>
> Problem Statement:
> Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. We therefore built the Parquet Hive integration and would now like to contribute it to Hive.
> About Parquet:
> Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration.
> Changes Details:
> Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency.



--
This message was sent by Atlassian JIRA
(v6.1#6144)