You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Sergio Pena <se...@cloudera.com> on 2017/07/11 20:13:33 UTC

Re: Review Request 59020: Support Parquet through HCatalog

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59020/#review180234
-----------------------------------------------------------


Fix it, then Ship it!




The patch looks good Adam. Please just add the comments and remove the unused imports.


hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatStorerMulti.java
Lines 46-49 (original), 46-49 (patched)
<https://reviews.apache.org/r/59020/#comment255324>

    There are a few unused imports. Remove them.



hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestParquetHCatStorer.java
Lines 21-28 (original), 21-28 (patched)
<https://reviews.apache.org/r/59020/#comment255323>

    There are a few unused imports. Remove them.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
Line 76 (original), 76 (patched)
<https://reviews.apache.org/r/59020/#comment255325>

    Could you add a javadoc comment about what parameters are expected on the JobConf object? 
    
    I see that this constructor won't work unless the PARQUET_HIVE_SCHEMA is set by the caller (HCat is doing in on the SpecialCases), and we have to leave that comment about the assumptions.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetTableUtils.java
Lines 24 (patched)
<https://reviews.apache.org/r/59020/#comment255309>

    I like to avoid short words on method names. What bout just using isParquetProperty() instead? The class name already implies this is meant for parquet tables.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java
Lines 87 (patched)
<https://reviews.apache.org/r/59020/#comment255312>

    Same thing with the short words on method names. 
    
    However, makes more sense to use getParquetProperties(), what do you think? Table properties are set through the Table, but jobConf can bring other parquet properties that were set in hive-site.xml, right?


- Sergio Pena


On May 5, 2017, 8:11 a.m., Adam Szita wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59020/
> -----------------------------------------------------------
> 
> (Updated May 5, 2017, 8:11 a.m.)
> 
> 
> Review request for hive, Aihua Xu and Sergio Pena.
> 
> 
> Bugs: HIVE-8838
>     https://issues.apache.org/jira/browse/HIVE-8838
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Adding support for HCatalog to write tables stored in Parquet format
> 
> 
> Diffs
> -----
> 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileRecordWriterContainer.java b2abc5fbb3670893415354552239d67d072459ed 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/SpecialCases.java 60af5c0bf397273fb820f0ee31e578745dbc200f 
>   hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoaderComplexSchema.java 4c686fec596d39d41d458bc3ea2753877bd9df98 
>   hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoaderEncryption.java ad11eab1b7e67541b56e90e4a85ba37b41a4db92 
>   hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatStorerMulti.java 918332ddfda58306707d326f8668b2c223110a29 
>   hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestParquetHCatLoader.java 6cd382145b55d6b85fc3366faeaba2aaef65ab04 
>   hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestParquetHCatStorer.java 6dfdc04954dd0b110b1a7194e69468b5dc2f842e 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java a7bb5eedbb99f3cea4601b9fce9a0ad3461567d0 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetTableUtils.java b339cc4347eea143dca2f6d98f9aa7777afdc427 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java 71a78cf040667bf14b6c720373e4acd102da19f4 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java c021dafa480e65d7c0c19a5a85988464112468cb 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java ec85b5df0f95cbd45b87259346ae9c1e5aa604a4 
> 
> 
> Diff: https://reviews.apache.org/r/59020/diff/1/
> 
> 
> Testing
> -------
> 
> Tested on cluster, and re-enabled previously disabled tests in HCatalog (for Parquet) that were failing (this adds ~40 tests to be run)
> 
> 
> Thanks,
> 
> Adam Szita
> 
>