Posted to dev@hive.apache.org by Barna Zsombor Klara <zs...@cloudera.com> on 2017/03/17 14:57:46 UTC

Review Request 57728: HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57728/
-----------------------------------------------------------

Review request for hive and Sergio Pena.


Repository: hive-git


Description
-------

HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 26f1e75c7d659a634cd4eef3a0cb8e886b22722f 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 8e33b7d437894b33b35f32913a3bc02f2a849ce3 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java 5dc808800290f3274afbdff12134ac34387a746b 
  ql/src/test/queries/clientpositive/parquet_int96_timestamp.q 5de2c3f1244b8340b97eb0547fe66e52d80fb065 


Diff: https://reviews.apache.org/r/57728/diff/1/


Testing
-------

Tested loading timestamps from a Parquet file written by Spark.


Thanks,

Barna Zsombor Klara


Re: Review Request 57728: HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767

Posted by Sergio Pena <se...@cloudera.com>.

> On March 22, 2017, 6:27 p.m., Sergio Pena wrote:
> > common/src/java/org/apache/hive/common/util/DateUtils.java
> > Lines 84 (patched)
> > <https://reviews.apache.org/r/57728/diff/2/?file=1670971#file1670971line84>
> >
> >     Is there another class where to put this method? I don't think DateUtils is the place where we should keep this.
> 
> Barna Zsombor Klara wrote:
>     I couldn't find a much better fit. I looked at HiveUtils and ParquetTableUtils but DateUtils seemed better. I can create a TimeZoneUtils class, but I don't know if we will ever have a second function in it. Do you have a utility class in mind that would be better?

Well, NanoTimeUtils is not a good name either, but it has the public getCalendar() method, which returns a calendar based on UTC or the local time zone. Eventually, we should rename this class to TimeUtils or something similar, but I don't know whether external apps are using it.
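
To illustrate the kind of helper being discussed, here is a minimal sketch of a method that returns a calendar in either UTC or the local time zone. The class name CalendarSketch and the boolean parameter are assumptions made for this example; this is not the code from NanoTimeUtils or from the patch under review.

```java
import java.util.Calendar;
import java.util.TimeZone;

// Hypothetical illustration only; not the Hive implementation.
final class CalendarSketch {

    // Returns a calendar in UTC when useUtc is true,
    // otherwise a calendar in the JVM's default (local) time zone.
    static Calendar getCalendar(boolean useUtc) {
        TimeZone zone = useUtc ? TimeZone.getTimeZone("UTC") : TimeZone.getDefault();
        return Calendar.getInstance(zone);
    }

    public static void main(String[] args) {
        System.out.println(getCalendar(true).getTimeZone().getID());
        System.out.println(getCalendar(false).getTimeZone().getID());
    }
}
```

A helper like this keeps the choice of zone in one place, which is one reason a time zone utility could plausibly sit next to it.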


- Sergio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57728/#review169758
-----------------------------------------------------------


On March 21, 2017, 5:28 p.m., Barna Zsombor Klara wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57728/
> -----------------------------------------------------------
> 
> (Updated March 21, 2017, 5:28 p.m.)
> 
> 
> Review request for hive and Sergio Pena.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hive/common/util/DateUtils.java a1068ecce94e9ff1ae78008a0d8c6d67ca4f2690 
>   common/src/test/org/apache/hive/common/util/TestDateUtils.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 26f1e75c7d659a634cd4eef3a0cb8e886b22722f 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 8e33b7d437894b33b35f32913a3bc02f2a849ce3 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java 5dc808800290f3274afbdff12134ac34387a746b 
>   ql/src/test/queries/clientpositive/parquet_int96_timestamp.q 5de2c3f1244b8340b97eb0547fe66e52d80fb065 
> 
> 
> Diff: https://reviews.apache.org/r/57728/diff/2/
> 
> 
> Testing
> -------
> 
> Tested loading timestamps from a Parquet file written by Spark.
> 
> 
> Thanks,
> 
> Barna Zsombor Klara
> 
>


Re: Review Request 57728: HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767

Posted by Barna Zsombor Klara <zs...@cloudera.com>.

> On March 22, 2017, 6:27 p.m., Sergio Pena wrote:
> > common/src/java/org/apache/hive/common/util/DateUtils.java
> > Lines 84 (patched)
> > <https://reviews.apache.org/r/57728/diff/2/?file=1670971#file1670971line84>
> >
> >     Is there another class where to put this method? I don't think DateUtils is the place where we should keep this.

I couldn't find a much better fit. I looked at HiveUtils and ParquetTableUtils but DateUtils seemed better. I can create a TimeZoneUtils class, but I don't know if we will ever have a second function in it. Do you have a utility class in mind that would be better?


- Barna Zsombor


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57728/#review169758
-----------------------------------------------------------




Re: Review Request 57728: HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767

Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57728/#review169758
-----------------------------------------------------------




common/src/java/org/apache/hive/common/util/DateUtils.java
Lines 84 (patched)
<https://reviews.apache.org/r/57728/#comment242327>

    Is there another class where to put this method? I don't think DateUtils is the place where we should keep this.


- Sergio Pena




Re: Review Request 57728: HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767

Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57728/#review170018
-----------------------------------------------------------


Ship it!




Just remove the empty line from DateUtils. 
+1

- Sergio Pena


On March 24, 2017, 9:56 a.m., Barna Zsombor Klara wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57728/
> -----------------------------------------------------------
> 
> (Updated March 24, 2017, 9:56 a.m.)
> 
> 
> Review request for hive and Sergio Pena.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hive/common/util/DateUtils.java a1068ecce94e9ff1ae78008a0d8c6d67ca4f2690 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 26f1e75c7d659a634cd4eef3a0cb8e886b22722f 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 8e33b7d437894b33b35f32913a3bc02f2a849ce3 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java 5dc808800290f3274afbdff12134ac34387a746b 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/timestamp/TestNanoTimeUtils.java 37cf0e2d74589cfa97fa24c9d2d8d00ea62390ee 
>   ql/src/test/queries/clientpositive/parquet_int96_timestamp.q 5de2c3f1244b8340b97eb0547fe66e52d80fb065 
> 
> 
> Diff: https://reviews.apache.org/r/57728/diff/3/
> 
> 
> Testing
> -------
> 
> Tested loading timestamps from a Parquet file written by Spark.
> 
> 
> Thanks,
> 
> Barna Zsombor Klara
> 
>


Re: Review Request 57728: HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767

Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57728/#review170017
-----------------------------------------------------------




common/src/java/org/apache/hive/common/util/DateUtils.java
Lines 77 (patched)
<https://reviews.apache.org/r/57728/#comment242748>

    Empty line.


- Sergio Pena




Re: Review Request 57728: HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767

Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57728/#review170236
-----------------------------------------------------------


Ship it!




Ship It!

- Sergio Pena


On March 27, 2017, 8 a.m., Barna Zsombor Klara wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57728/
> -----------------------------------------------------------
> 
> (Updated March 27, 2017, 8 a.m.)
> 
> 
> Review request for hive and Sergio Pena.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 26f1e75c7d659a634cd4eef3a0cb8e886b22722f 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 8e33b7d437894b33b35f32913a3bc02f2a849ce3 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java 5dc808800290f3274afbdff12134ac34387a746b 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/timestamp/TestNanoTimeUtils.java 37cf0e2d74589cfa97fa24c9d2d8d00ea62390ee 
>   ql/src/test/queries/clientpositive/parquet_int96_timestamp.q 5de2c3f1244b8340b97eb0547fe66e52d80fb065 
> 
> 
> Diff: https://reviews.apache.org/r/57728/diff/4/
> 
> 
> Testing
> -------
> 
> Tested loading timestamps from a Parquet file written by Spark.
> 
> 
> Thanks,
> 
> Barna Zsombor Klara
> 
>


Re: Review Request 57728: HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767

Posted by Barna Zsombor Klara <zs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57728/
-----------------------------------------------------------

(Updated March 27, 2017, 8 a.m.)


Review request for hive and Sergio Pena.


Changes
-------

Removed empty line from DateUtils.


Repository: hive-git


Description
-------

HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 26f1e75c7d659a634cd4eef3a0cb8e886b22722f 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 8e33b7d437894b33b35f32913a3bc02f2a849ce3 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java 5dc808800290f3274afbdff12134ac34387a746b 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/timestamp/TestNanoTimeUtils.java 37cf0e2d74589cfa97fa24c9d2d8d00ea62390ee 
  ql/src/test/queries/clientpositive/parquet_int96_timestamp.q 5de2c3f1244b8340b97eb0547fe66e52d80fb065 


Diff: https://reviews.apache.org/r/57728/diff/4/

Changes: https://reviews.apache.org/r/57728/diff/3-4/


Testing
-------

Tested loading timestamps from a Parquet file written by Spark.


Thanks,

Barna Zsombor Klara


Re: Review Request 57728: HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767

Posted by Barna Zsombor Klara <zs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57728/
-----------------------------------------------------------

(Updated March 24, 2017, 9:56 a.m.)


Review request for hive and Sergio Pena.


Changes
-------

Moved the time zone checking utility method into NanoTimeUtils.
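
As background on why a time zone check can be useful: java.util.TimeZone.getTimeZone(String) silently falls back to GMT when it does not recognize an ID, so a misspelled zone can go unnoticed. The sketch below shows one possible way to detect that, assuming the check is meant to validate a zone ID. The class and method names are hypothetical, and this is not the method from the patch.

```java
import java.util.Arrays;
import java.util.TimeZone;

// Hypothetical illustration only; not the method added in this review.
final class TimeZoneCheckSketch {

    // Returns true only if the ID is one of the zone IDs the JVM knows about.
    // Note: custom offset IDs such as "GMT+02:00" are accepted by
    // TimeZone.getTimeZone but are not listed by getAvailableIDs,
    // so this simple check rejects them.
    static boolean isKnownTimeZone(String id) {
        if (id == null) {
            return false;
        }
        return Arrays.asList(TimeZone.getAvailableIDs()).contains(id);
    }

    public static void main(String[] args) {
        System.out.println(isKnownTimeZone("Europe/Budapest"));
        System.out.println(isKnownTimeZone("No/Such_Zone"));
    }
}
```

Checking against the list of available IDs is only one option; a caller that needs to accept custom offsets would need a different rule.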


Repository: hive-git


Description
-------

HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767


Diffs (updated)
-----

  common/src/java/org/apache/hive/common/util/DateUtils.java a1068ecce94e9ff1ae78008a0d8c6d67ca4f2690 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 26f1e75c7d659a634cd4eef3a0cb8e886b22722f 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 8e33b7d437894b33b35f32913a3bc02f2a849ce3 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java 5dc808800290f3274afbdff12134ac34387a746b 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/timestamp/TestNanoTimeUtils.java 37cf0e2d74589cfa97fa24c9d2d8d00ea62390ee 
  ql/src/test/queries/clientpositive/parquet_int96_timestamp.q 5de2c3f1244b8340b97eb0547fe66e52d80fb065 


Diff: https://reviews.apache.org/r/57728/diff/3/

Changes: https://reviews.apache.org/r/57728/diff/2-3/


Testing
-------

Tested loading timestamps from a Parquet file written by Spark.


Thanks,

Barna Zsombor Klara


Re: Review Request 57728: HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767

Posted by Barna Zsombor Klara <zs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57728/
-----------------------------------------------------------

(Updated March 21, 2017, 5:28 p.m.)


Review request for hive and Sergio Pena.


Changes
-------

Refactored timezone check into a separate method in DateUtils.


Repository: hive-git


Description
-------

HIVE-16231: Parquet timestamp may be stored differently since HIVE-12767


Diffs (updated)
-----

  common/src/java/org/apache/hive/common/util/DateUtils.java a1068ecce94e9ff1ae78008a0d8c6d67ca4f2690 
  common/src/test/org/apache/hive/common/util/TestDateUtils.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 26f1e75c7d659a634cd4eef3a0cb8e886b22722f 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 8e33b7d437894b33b35f32913a3bc02f2a849ce3 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java 5dc808800290f3274afbdff12134ac34387a746b 
  ql/src/test/queries/clientpositive/parquet_int96_timestamp.q 5de2c3f1244b8340b97eb0547fe66e52d80fb065 


Diff: https://reviews.apache.org/r/57728/diff/2/

Changes: https://reviews.apache.org/r/57728/diff/1-2/


Testing
-------

Tested loading timestamps from a Parquet file written by Spark.


Thanks,

Barna Zsombor Klara