You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Dong Chen <do...@intel.com> on 2015/02/06 08:51:14 UTC

Review Request 30717: HIVE-8119: Implement Date in ParquetSerde

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30717/
-----------------------------------------------------------

Review request for hive.


Repository: hive-git


Description
-------

HIVE-8119: Implement Date in ParquetSerde

This patch map the Date in Hive to INT32 in Parquet, based on the Parquet Logical Type Definitions in https://github.com/apache/incubator-parquet-format/blob/master/LogicalTypes.md


Diffs
-----

  data/files/parquet_types.txt 31a10c9 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 377e362 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java e5bd70c 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java bb066af 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9199127 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 1d83bf3 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestHiveSchemaConverter.java f232c57 
  ql/src/test/queries/clientnegative/parquet_date.q 89d3602 
  ql/src/test/queries/clientpositive/parquet_types.q 806db24 
  ql/src/test/results/clientnegative/parquet_date.q.out d1c38d6 
  ql/src/test/results/clientpositive/parquet_types.q.out dc5ceb0 

Diff: https://reviews.apache.org/r/30717/diff/


Testing
-------

UT passed. 2 tests are added


Thanks,

Dong Chen


Re: Review Request 30717: HIVE-8119: Implement Date in ParquetSerde

Posted by Sergio Pena <se...@cloudera.com>.

> On Feb. 6, 2015, 4:01 p.m., Sergio Pena wrote:
> > Ship It!

Looks good Dong. :)


- Sergio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30717/#review71442
-----------------------------------------------------------


On Feb. 6, 2015, 7:51 a.m., Dong Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30717/
> -----------------------------------------------------------
> 
> (Updated Feb. 6, 2015, 7:51 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8119: Implement Date in ParquetSerde
> 
> This patch map the Date in Hive to INT32 in Parquet, based on the Parquet Logical Type Definitions in https://github.com/apache/incubator-parquet-format/blob/master/LogicalTypes.md
> 
> 
> Diffs
> -----
> 
>   data/files/parquet_types.txt 31a10c9 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 377e362 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java e5bd70c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java bb066af 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9199127 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 1d83bf3 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestHiveSchemaConverter.java f232c57 
>   ql/src/test/queries/clientnegative/parquet_date.q 89d3602 
>   ql/src/test/queries/clientpositive/parquet_types.q 806db24 
>   ql/src/test/results/clientnegative/parquet_date.q.out d1c38d6 
>   ql/src/test/results/clientpositive/parquet_types.q.out dc5ceb0 
> 
> Diff: https://reviews.apache.org/r/30717/diff/
> 
> 
> Testing
> -------
> 
> UT passed. 2 tests are added
> 
> 
> Thanks,
> 
> Dong Chen
> 
>


Re: Review Request 30717: HIVE-8119: Implement Date in ParquetSerde

Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30717/#review71442
-----------------------------------------------------------

Ship it!


Ship It!

- Sergio Pena


On Feb. 6, 2015, 7:51 a.m., Dong Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30717/
> -----------------------------------------------------------
> 
> (Updated Feb. 6, 2015, 7:51 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8119: Implement Date in ParquetSerde
> 
> This patch map the Date in Hive to INT32 in Parquet, based on the Parquet Logical Type Definitions in https://github.com/apache/incubator-parquet-format/blob/master/LogicalTypes.md
> 
> 
> Diffs
> -----
> 
>   data/files/parquet_types.txt 31a10c9 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 377e362 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java e5bd70c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java bb066af 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9199127 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 1d83bf3 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestHiveSchemaConverter.java f232c57 
>   ql/src/test/queries/clientnegative/parquet_date.q 89d3602 
>   ql/src/test/queries/clientpositive/parquet_types.q 806db24 
>   ql/src/test/results/clientnegative/parquet_date.q.out d1c38d6 
>   ql/src/test/results/clientpositive/parquet_types.q.out dc5ceb0 
> 
> Diff: https://reviews.apache.org/r/30717/diff/
> 
> 
> Testing
> -------
> 
> UT passed. 2 tests are added
> 
> 
> Thanks,
> 
> Dong Chen
> 
>


Re: Review Request 30717: HIVE-8119: Implement Date in ParquetSerde

Posted by Brock Noland <br...@cloudera.com>.

> On Feb. 10, 2015, 7:50 p.m., Ryan Blue wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java, line 99
> > <https://reviews.apache.org/r/30717/diff/1/?file=852070#file852070line99>
> >
> >     For primitive types, this should be using the Types API (like the line above) because we're going to remove the constructors from the public API in favor of the bulider. This is to avoid invalid types, like an INT64 with a DATE annotation.
> >     
> >     This should be:
> >     ```java
> >     Types.primitive(repetition, INT32).as(DATE).named(name);
> >     ```
> 
> Sergio Pena wrote:
>     Agree. Should we follow this new API in another JIRA so that we cover all primitive types?.

Yes let's do this in a follow on.


- Brock


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30717/#review71844
-----------------------------------------------------------


On Feb. 6, 2015, 7:51 a.m., Dong Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30717/
> -----------------------------------------------------------
> 
> (Updated Feb. 6, 2015, 7:51 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8119: Implement Date in ParquetSerde
> 
> This patch map the Date in Hive to INT32 in Parquet, based on the Parquet Logical Type Definitions in https://github.com/apache/incubator-parquet-format/blob/master/LogicalTypes.md
> 
> 
> Diffs
> -----
> 
>   data/files/parquet_types.txt 31a10c9 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 377e362 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java e5bd70c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java bb066af 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9199127 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 1d83bf3 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestHiveSchemaConverter.java f232c57 
>   ql/src/test/queries/clientnegative/parquet_date.q 89d3602 
>   ql/src/test/queries/clientpositive/parquet_types.q 806db24 
>   ql/src/test/results/clientnegative/parquet_date.q.out d1c38d6 
>   ql/src/test/results/clientpositive/parquet_types.q.out dc5ceb0 
> 
> Diff: https://reviews.apache.org/r/30717/diff/
> 
> 
> Testing
> -------
> 
> UT passed. 2 tests are added
> 
> 
> Thanks,
> 
> Dong Chen
> 
>


Re: Review Request 30717: HIVE-8119: Implement Date in ParquetSerde

Posted by Sergio Pena <se...@cloudera.com>.

> On Feb. 10, 2015, 7:50 p.m., Ryan Blue wrote:
> > One minor thing, but this looks good otherwise. Sergio knows more about the compatibility between this and his recent performance work and the object inspector code, so I'll leave that to him for review.

This is good for now. We're working in the performance work in another branch for the moment. This new datatype is going to be on trunk.


> On Feb. 10, 2015, 7:50 p.m., Ryan Blue wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java, line 99
> > <https://reviews.apache.org/r/30717/diff/1/?file=852070#file852070line99>
> >
> >     For primitive types, this should be using the Types API (like the line above) because we're going to remove the constructors from the public API in favor of the bulider. This is to avoid invalid types, like an INT64 with a DATE annotation.
> >     
> >     This should be:
> >     ```java
> >     Types.primitive(repetition, INT32).as(DATE).named(name);
> >     ```

Agree. Should we follow this new API in another JIRA so that we cover all primitive types?.


- Sergio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30717/#review71844
-----------------------------------------------------------


On Feb. 6, 2015, 7:51 a.m., Dong Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30717/
> -----------------------------------------------------------
> 
> (Updated Feb. 6, 2015, 7:51 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8119: Implement Date in ParquetSerde
> 
> This patch map the Date in Hive to INT32 in Parquet, based on the Parquet Logical Type Definitions in https://github.com/apache/incubator-parquet-format/blob/master/LogicalTypes.md
> 
> 
> Diffs
> -----
> 
>   data/files/parquet_types.txt 31a10c9 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 377e362 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java e5bd70c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java bb066af 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9199127 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 1d83bf3 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestHiveSchemaConverter.java f232c57 
>   ql/src/test/queries/clientnegative/parquet_date.q 89d3602 
>   ql/src/test/queries/clientpositive/parquet_types.q 806db24 
>   ql/src/test/results/clientnegative/parquet_date.q.out d1c38d6 
>   ql/src/test/results/clientpositive/parquet_types.q.out dc5ceb0 
> 
> Diff: https://reviews.apache.org/r/30717/diff/
> 
> 
> Testing
> -------
> 
> UT passed. 2 tests are added
> 
> 
> Thanks,
> 
> Dong Chen
> 
>


Re: Review Request 30717: HIVE-8119: Implement Date in ParquetSerde

Posted by Dong Chen <do...@intel.com>.

> On Feb. 10, 2015, 7:50 p.m., Ryan Blue wrote:
> > One minor thing, but this looks good otherwise. Sergio knows more about the compatibility between this and his recent performance work and the object inspector code, so I'll leave that to him for review.
> 
> Sergio Pena wrote:
>     This is good for now. We're working in the performance work in another branch for the moment. This new datatype is going to be on trunk.

Hi, Ryan, Sergio,

Thanks for your review! I agree it is ok for now, and will keep an eye on the compatibility issue then.


- Dong


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30717/#review71844
-----------------------------------------------------------


On Feb. 6, 2015, 7:51 a.m., Dong Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30717/
> -----------------------------------------------------------
> 
> (Updated Feb. 6, 2015, 7:51 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8119: Implement Date in ParquetSerde
> 
> This patch map the Date in Hive to INT32 in Parquet, based on the Parquet Logical Type Definitions in https://github.com/apache/incubator-parquet-format/blob/master/LogicalTypes.md
> 
> 
> Diffs
> -----
> 
>   data/files/parquet_types.txt 31a10c9 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 377e362 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java e5bd70c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java bb066af 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9199127 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 1d83bf3 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestHiveSchemaConverter.java f232c57 
>   ql/src/test/queries/clientnegative/parquet_date.q 89d3602 
>   ql/src/test/queries/clientpositive/parquet_types.q 806db24 
>   ql/src/test/results/clientnegative/parquet_date.q.out d1c38d6 
>   ql/src/test/results/clientpositive/parquet_types.q.out dc5ceb0 
> 
> Diff: https://reviews.apache.org/r/30717/diff/
> 
> 
> Testing
> -------
> 
> UT passed. 2 tests are added
> 
> 
> Thanks,
> 
> Dong Chen
> 
>


Re: Review Request 30717: HIVE-8119: Implement Date in ParquetSerde

Posted by Ryan Blue <bl...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30717/#review71844
-----------------------------------------------------------


One minor thing, but this looks good otherwise. Sergio knows more about the compatibility between this and his recent performance work and the object inspector code, so I'll leave that to him for review.


ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java
<https://reviews.apache.org/r/30717/#comment117710>

    For primitive types, this should be using the Types API (like the line above) because we're going to remove the constructors from the public API in favor of the bulider. This is to avoid invalid types, like an INT64 with a DATE annotation.
    
    This should be:
    ```java
    Types.primitive(repetition, INT32).as(DATE).named(name);
    ```


- Ryan Blue


On Feb. 5, 2015, 11:51 p.m., Dong Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30717/
> -----------------------------------------------------------
> 
> (Updated Feb. 5, 2015, 11:51 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8119: Implement Date in ParquetSerde
> 
> This patch map the Date in Hive to INT32 in Parquet, based on the Parquet Logical Type Definitions in https://github.com/apache/incubator-parquet-format/blob/master/LogicalTypes.md
> 
> 
> Diffs
> -----
> 
>   data/files/parquet_types.txt 31a10c9 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 377e362 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java e5bd70c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java bb066af 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9199127 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 1d83bf3 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestHiveSchemaConverter.java f232c57 
>   ql/src/test/queries/clientnegative/parquet_date.q 89d3602 
>   ql/src/test/queries/clientpositive/parquet_types.q 806db24 
>   ql/src/test/results/clientnegative/parquet_date.q.out d1c38d6 
>   ql/src/test/results/clientpositive/parquet_types.q.out dc5ceb0 
> 
> Diff: https://reviews.apache.org/r/30717/diff/
> 
> 
> Testing
> -------
> 
> UT passed. 2 tests are added
> 
> 
> Thanks,
> 
> Dong Chen
> 
>


Re: Review Request 30717: HIVE-8119: Implement Date in ParquetSerde

Posted by Daoyuan Wang <da...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30717/#review75139
-----------------------------------------------------------

Ship it!


Ship It!

- Daoyuan Wang


On 二月 6, 2015, 7:51 a.m., Dong Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30717/
> -----------------------------------------------------------
> 
> (Updated 二月 6, 2015, 7:51 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8119: Implement Date in ParquetSerde
> 
> This patch map the Date in Hive to INT32 in Parquet, based on the Parquet Logical Type Definitions in https://github.com/apache/incubator-parquet-format/blob/master/LogicalTypes.md
> 
> 
> Diffs
> -----
> 
>   data/files/parquet_types.txt 31a10c9 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 377e362 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java e5bd70c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java bb066af 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9199127 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 1d83bf3 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestHiveSchemaConverter.java f232c57 
>   ql/src/test/queries/clientnegative/parquet_date.q 89d3602 
>   ql/src/test/queries/clientpositive/parquet_types.q 806db24 
>   ql/src/test/results/clientnegative/parquet_date.q.out d1c38d6 
>   ql/src/test/results/clientpositive/parquet_types.q.out dc5ceb0 
> 
> Diff: https://reviews.apache.org/r/30717/diff/
> 
> 
> Testing
> -------
> 
> UT passed. 2 tests are added
> 
> 
> Thanks,
> 
> Dong Chen
> 
>


Re: Review Request 30717: HIVE-8119: Implement Date in ParquetSerde

Posted by Ryan Blue <bl...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30717/#review73933
-----------------------------------------------------------

Ship it!


Ship It!

- Ryan Blue


On Feb. 5, 2015, 11:51 p.m., Dong Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30717/
> -----------------------------------------------------------
> 
> (Updated Feb. 5, 2015, 11:51 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8119: Implement Date in ParquetSerde
> 
> This patch map the Date in Hive to INT32 in Parquet, based on the Parquet Logical Type Definitions in https://github.com/apache/incubator-parquet-format/blob/master/LogicalTypes.md
> 
> 
> Diffs
> -----
> 
>   data/files/parquet_types.txt 31a10c9 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 377e362 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java e5bd70c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java bb066af 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9199127 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 1d83bf3 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestHiveSchemaConverter.java f232c57 
>   ql/src/test/queries/clientnegative/parquet_date.q 89d3602 
>   ql/src/test/queries/clientpositive/parquet_types.q 806db24 
>   ql/src/test/results/clientnegative/parquet_date.q.out d1c38d6 
>   ql/src/test/results/clientpositive/parquet_types.q.out dc5ceb0 
> 
> Diff: https://reviews.apache.org/r/30717/diff/
> 
> 
> Testing
> -------
> 
> UT passed. 2 tests are added
> 
> 
> Thanks,
> 
> Dong Chen
> 
>