You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Marta Kuczora via Review Board <no...@reviews.apache.org> on 2019/07/03 10:51:20 UTC

Re: Review Request 70920: HIVE-21868: Vectorize CAST...FORMAT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70920/#review216334
-----------------------------------------------------------



Thanks a lot Karen for the patch!
I have some questions, but otherwise the change looks good to me.


common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
Line 223 (original), 224 (patched)
<https://reviews.apache.org/r/70920/#comment303500>

    Why did you change the type of this variable to ArrayList from List?



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
Lines 59 (patched)
<https://reviews.apache.org/r/70920/#comment303501>

    Do the CastDateToString, CastDateToChar and CastDateToVarchar udfs use this method, or is this just a typo and the CastDateToStringWithFormat, ... udfs use this?



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java
Line 200 (original), 202 (patched)
<https://reviews.apache.org/r/70920/#comment303502>

    Is the formattedOutput variable never going to be null after this change? If there is a scenario where it can be null, it will cause problems when trying to cast it.



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java
Line 217 (original), 220 (patched)
<https://reviews.apache.org/r/70920/#comment303503>

    The same question about being null (previous comment) applies to the t and d variable as well.


- Marta Kuczora


On June 26, 2019, 8:44 a.m., Karen Coppage wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70920/
> -----------------------------------------------------------
> 
> (Updated June 26, 2019, 8:44 a.m.)
> 
> 
> Review request for hive and Marta Kuczora.
> 
> 
> Bugs: HIVE-21868
>     https://issues.apache.org/jira/browse/HIVE-21868
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Vectorize UDFs for CAST (<TIMESTAMP/DATE> AS STRING/CHAR/VARCHAR FORMAT <STRING>) and CAST (<STRING/CHAR/VARCHAR> AS TIMESTAMP/DATE FORMAT <STRING>).
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java 4e024a357b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java fa9d1e9783 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToCharWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java dfa9f8a00d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToStringWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToVarCharWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDate.java a6dff12e1a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDateWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestamp.java b48b0136eb 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestampWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToCharWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToString.java adc3a9d7b9 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToStringWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToVarCharWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java 16742eee9b 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorMathFunctions.java 663237739e 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java 58fd7b030e 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCastsWithFormat.java PRE-CREATION 
>   ql/src/test/queries/clientnegative/udf_cast_format_bad_pattern.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q 269edf6da6 
>   ql/src/test/results/clientnegative/udf_cast_format_bad_pattern.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out 4a502b9700 
> 
> 
> Diff: https://reviews.apache.org/r/70920/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>


Re: Review Request 70920: HIVE-21868: Vectorize CAST...FORMAT

Posted by Karen Coppage via Review Board <no...@reviews.apache.org>.

> On July 3, 2019, 10:51 a.m., Marta Kuczora wrote:
> > Thanks a lot Karen for the patch!
> > I have some questions, but otherwise the change looks good to me.

Thanks very much for the review!


> On July 3, 2019, 10:51 a.m., Marta Kuczora wrote:
> > common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
> > Line 223 (original), 224 (patched)
> > <https://reviews.apache.org/r/70920/diff/3/?file=2152277#file2152277line224>
> >
> >     Why did you change the type of this variable to ArrayList from List?

I thought it was necessary in order to make the class serializable but now I see it's not. I'll fix this.


> On July 3, 2019, 10:51 a.m., Marta Kuczora wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
> > Lines 59 (patched)
> > <https://reviews.apache.org/r/70920/diff/3/?file=2152280#file2152280line59>
> >
> >     Do the CastDateToString, CastDateToChar and CastDateToVarchar udfs use this method, or is this just a typo and the CastDateToStringWithFormat, ... udfs use this?

They do, all 3 classes eventually inherit from CastDateToString.


> On July 3, 2019, 10:51 a.m., Marta Kuczora wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java
> > Line 200 (original), 202 (patched)
> > <https://reviews.apache.org/r/70920/diff/3/?file=2152291#file2152291line202>
> >
> >     Is the formattedOutput variable never going to be null after this change? If there is a scenario where it can be null, it will cause problems when trying to cast it.

The output of format(Timestamp|Date) is a <StringBuilder>.toString() which I believe will never return null.


> On July 3, 2019, 10:51 a.m., Marta Kuczora wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java
> > Line 217 (original), 220 (patched)
> > <https://reviews.apache.org/r/70920/diff/3/?file=2152291#file2152291line220>
> >
> >     The same question about being null (previous comment) applies to the t and d variable as well.

This is not the case now but may be later on (if timestamp range of 0000-9999 is ever strictly enforced). I will include a null check here and above just in case.


- Karen


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70920/#review216334
-----------------------------------------------------------


On June 26, 2019, 8:44 a.m., Karen Coppage wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70920/
> -----------------------------------------------------------
> 
> (Updated June 26, 2019, 8:44 a.m.)
> 
> 
> Review request for hive and Marta Kuczora.
> 
> 
> Bugs: HIVE-21868
>     https://issues.apache.org/jira/browse/HIVE-21868
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Vectorize UDFs for CAST (<TIMESTAMP/DATE> AS STRING/CHAR/VARCHAR FORMAT <STRING>) and CAST (<STRING/CHAR/VARCHAR> AS TIMESTAMP/DATE FORMAT <STRING>).
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java 4e024a357b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java fa9d1e9783 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToCharWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java dfa9f8a00d 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToStringWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToVarCharWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDate.java a6dff12e1a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDateWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestamp.java b48b0136eb 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestampWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToCharWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToString.java adc3a9d7b9 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToStringWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToVarCharWithFormat.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java 16742eee9b 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorMathFunctions.java 663237739e 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java 58fd7b030e 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCastsWithFormat.java PRE-CREATION 
>   ql/src/test/queries/clientnegative/udf_cast_format_bad_pattern.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q 269edf6da6 
>   ql/src/test/results/clientnegative/udf_cast_format_bad_pattern.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out 4a502b9700 
> 
> 
> Diff: https://reviews.apache.org/r/70920/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>