You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/06/19 07:24:23 UTC

[GitHub] [hive] guptanikhil007 opened a new pull request #2409: [WIP] HIVE-25268: Restore the functionality of date_format UDF

guptanikhil007 opened a new pull request #2409:
URL: https://github.com/apache/hive/pull/2409


   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://cwiki.apache.org/confluence/display/Hive/HowToContribute
     2. Ensure that you have created an issue on the Hive project JIRA: https://issues.apache.org/jira/projects/HIVE/summary
     3. Ensure you have added or run the appropriate tests for your PR: 
     4. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP]HIVE-XXXXX:  Your PR title ...'.
     5. Be sure to keep the PR description updated to reflect all changes.
     6. Please write your PR title to summarize what this PR proposes.
     7. If possible, provide a concise example to reproduce the issue for a faster review.
   
   -->
   
   ### What changes were proposed in this pull request?
   Use new java time APIs to get the result instead of older APIs to restore functionality
   
   
   ### Why are the changes needed?
   date_format UDF gives 
   1. incorrect time values for dates prior to 1901-01-01 00:00:00 if the timezone is not UTC
   2. incorrect date and time values for dates prior to October 15, 1582 (Default Gregorian Calendar change)
   These changes fix both of the above issues
   
   
   ### Does this PR introduce _any_ user-facing change?
   The functionality is restored for the date_format UDF
   
   
   ### How was this patch tested?
   1. Unit tests
   2. Local tests with MiniHS2 functionality
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655101614



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -85,21 +87,31 @@ public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumen
       String fmtStr = getConstantStringValue(arguments, 1);
       if (fmtStr != null) {
         try {
-          formatter = new SimpleDateFormat(fmtStr);
-          formatter.setCalendar(DateTimeMath.getTimeZonedProlepticGregorianCalendar());
+          if (timeZone == null) {
+            timeZone = SessionState.get() == null ? new HiveConf().getLocalTimeZone() : SessionState.get().getConf()
+                .getLocalTimeZone();
+          }
+          formatter = DateTimeFormatter.ofPattern(fmtStr).withZone(timeZone);
         } catch (IllegalArgumentException e) {
           // ignore
         }
       }
     } else {
-      throw new UDFArgumentTypeException(1, getFuncName() + " only takes constant as "
-          + getArgOrder(1) + " argument");
+      throw new UDFArgumentTypeException(1, getFuncName() + " only takes constant as " + getArgOrder(1) + " argument");
     }
 
     ObjectInspector outputOI = PrimitiveObjectInspectorFactory.writableStringObjectInspector;
     return outputOI;
   }
 
+  @Override

Review comment:
       Removed and verified




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655886389



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));

Review comment:
       done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r658476821



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       @sankarh @zabetak 
   All the timestamps are considered to be in local time zone in both Hive 1.2 and Hive master (patched).
   Both Hive 1.2 and 3.1 have similar results for the date_format function.
   E.g.: 
   Hive 1.2.1 (with timezone set to Asia/Bangkok)
   ```
   0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select * from test_tbl_orc;
   +------------------------------+--------------------------+--+
   | test_tbl_orc.clusterversion  |    test_tbl_orc.col2     |
   +------------------------------+--------------------------+--+
   | Hive 4.0                     | 1800-01-14 01:01:10.123  |
   +------------------------------+--------------------------+--+
   1 row selected (0.372 seconds)
   0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> desc test_tbl_orc;
   +-----------------+------------+----------+--+
   |    col_name     | data_type  | comment  |
   +-----------------+------------+----------+--+
   | clusterversion  | string     |          |
   | col2            | timestamp  |          |
   +-----------------+------------+----------+--+
   2 rows selected (0.65 seconds)
   0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select date_format(col2, "yyyy-MM-dd HH:mm:ss.SSS z") from test_tbl_orc;
   +------------------------------+--+
   |             _c0              |
   +------------------------------+--+
   | 1800-01-14 01:01:10.123 ICT  |
   +------------------------------+--+
   ```
   Hive master (using MiniHS2 and same orc file from Hive 1.2)
   ```
   0: jdbc:hive2://localhost:10000/> select * from test_tbl_orc;
   +------------------------------+--------------------------+
   | test_tbl_orc.clusterversion  |    test_tbl_orc.col2     |
   +------------------------------+--------------------------+
   | Hive 4.0                     | 1800-01-14 01:01:10.123  |
   +------------------------------+--------------------------+
   1 row selected (0.23 seconds)
   0: jdbc:hive2://localhost:10000/> desc test_tbl_orc;
   +-----------------+------------+----------+
   |    col_name     | data_type  | comment  |
   +-----------------+------------+----------+
   | clusterversion  | string     |          |
   | col2            | timestamp  |          |
   +-----------------+------------+----------+
   2 rows selected (0.24 seconds)
   0: jdbc:hive2://localhost:10000/> set hive.local.time.zone=Asia/Bangkok;
   No rows affected (0.102 seconds)
   0: jdbc:hive2://localhost:10000/> select date_format(col2, "yyyy-MM-dd HH:mm:ss.SSS z") from test_tbl_orc;
   +------------------------------+
   |             _c0              |
   +------------------------------+
   | 1800-01-14 01:01:10.123 ICT  |
   +------------------------------+
   1 row selected (0.261 seconds)
   0: jdbc:hive2://localhost:10000/> set hive.local.time.zone=LOCAL;
   No rows affected (0.036 seconds)
   0: jdbc:hive2://localhost:10000/> select date_format(col2, "yyyy-MM-dd HH:mm:ss.SSS z") from test_tbl_orc;
   +------------------------------+
   |             _c0              |
   +------------------------------+
   | 1800-01-14 01:01:10.123 PST  |
   +------------------------------+
   1 row selected (0.492 seconds)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sankarh commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
sankarh commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r658156023



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -51,18 +50,17 @@
  */
 @Description(name = "date_format", value = "_FUNC_(date/timestamp/string, fmt) - converts a date/timestamp/string "
     + "to a value of string in the format specified by the date format fmt.",
-    extended = "Supported formats are SimpleDateFormat formats - "
-        + "https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html. "
+    extended = "Supported formats are DateTimeFormatter formats - "
+        + "https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html. "

Review comment:
       @zabetak I think, if these 2 are the only change, then we can go ahead with DateTimeFormatter. What is your opinion?
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sankarh commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
sankarh commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r658162341



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       @guptanikhil007 
   I think, @zabetak point is what will be the interpretation of timezone for a timestamp value stored in a table. Will it be treated as UTC or local timezone? If local timezone, then will the output changes when we change the local timezone config?
   From the code, it seems `yes`. But, I guess, the behavior is same with `SimpleDateFormat `as well. Pls confirm.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
zabetak commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r658755973



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -51,18 +50,17 @@
  */
 @Description(name = "date_format", value = "_FUNC_(date/timestamp/string, fmt) - converts a date/timestamp/string "
     + "to a value of string in the format specified by the date format fmt.",
-    extended = "Supported formats are SimpleDateFormat formats - "
-        + "https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html. "
+    extended = "Supported formats are DateTimeFormatter formats - "
+        + "https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html. "

Review comment:
       Personally, I am in favor of using the new API. However, I am sure that a user/client will discover that they get different results and may come back to us complaining about it. Still I think it is a good idea to use the new `DateTimeFormatter` so I would suggest to clearly indicate this small change of behavior in the JIRA and move forward.
   
   Other than that since we are enhancing the API of `date_format` we should probably consider adding a few more tests in `TestGenericUDFDateFormat` with the supported patterns (could be done in a follow up I guess).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r654887096



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));

Review comment:
       No need to convert timeZone to UTC and UTC to timeZone. Timestamp class internally use localDateTime which is timezone less and only depict the timestamp.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r657244949



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -74,30 +72,28 @@ public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumen
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     checkArgGroups(arguments, 0, tsInputTypes, STRING_GROUP, DATE_GROUP);
-    checkArgGroups(arguments, 0, dtInputTypes, STRING_GROUP, DATE_GROUP);
-
     checkArgGroups(arguments, 1, tsInputTypes, STRING_GROUP);
 
     obtainTimestampConverter(arguments, 0, tsInputTypes, tsConverters);
-    obtainDateConverter(arguments, 0, dtInputTypes, dtConverters);
 
     if (arguments[1] instanceof ConstantObjectInspector) {
       String fmtStr = getConstantStringValue(arguments, 1);
       if (fmtStr != null) {
         try {
-          formatter = new SimpleDateFormat(fmtStr);
-          formatter.setCalendar(DateTimeMath.getTimeZonedProlepticGregorianCalendar());
+          if (timeZone == null) {
+            timeZone = SessionState.get() == null ? new HiveConf().getLocalTimeZone() : SessionState.get().getConf()
+                .getLocalTimeZone();
+          }
+          formatter = DateTimeFormatter.ofPattern(fmtStr).withZone(timeZone);
         } catch (IllegalArgumentException e) {

Review comment:
       done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656846147



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       +1 to @sankarh comments. Existing test cases are fine. Could please add some more time zone based test cases. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: [WIP] HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r654873293



##########
File path: ql/src/test/results/clientpositive/llap/udf_date_format.q.out
##########
@@ -149,13 +149,13 @@ POSTHOOK: Input: _dummy_database@_dummy_table
 10	30	45	10 AM	08	123	123	NULL
 PREHOOK: query: select
 date_format('2015-04-08', ''),
-date_format('2015-04-08', 'Q')

Review comment:
       Same change is done for Unit test as well




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656848281



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       +1 for @zabetak comment. For that we need to add the timeZone  to localDateTime before calling the formatter. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656854401



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));

Review comment:
       Comment not required




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 edited a comment on pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 edited a comment on pull request #2409:
URL: https://github.com/apache/hive/pull/2409#issuecomment-864497164


   @zabetak @jcamachor @sankarh Please review the change in the APIs


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
zabetak commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r658765479



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       Thanks for testing this @guptanikhil007 .




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on pull request #2409:
URL: https://github.com/apache/hive/pull/2409#issuecomment-866960296


   @ashish-kumar-sharma Thanks for the new implementation logic.
   I have tested and it gives the same results


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r657609683



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -106,25 +102,22 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
       return null;
     }
 
-    ZoneId id = (SessionState.get() == null) ? new HiveConf().getLocalTimeZone() : SessionState.get().getConf()
-        .getLocalTimeZone();
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
-      Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
-      if (d == null) {
-        return null;
-      }
-      ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
+      return null;
     }
 
+    // We are doing an extra timestamp conversion from timeZone to UTC and then UTC to timeZone to maintain the
+    // same zone format semantics as that of SimpleDateFormat for some specific locale. E.g. PST/PDT, CST
+    // If this conversion is not there PDT -> PT and CST -> CT

Review comment:
       removed ts2




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sankarh commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
sankarh commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655565647



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       Do we have tests for other formats (to ensure DateTimeFormat doesn't break anything)? Also need update of wiki doc.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));

Review comment:
       Add comments on why this conversion is needed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sankarh commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
sankarh commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655565647



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       Do we have tests for other formats (to ensure DateTimeFormat doesn't break anything)? Also need update of wiki doc.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));

Review comment:
       Add comments on why this conversion is needed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sankarh commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
sankarh commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r658156023



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -51,18 +50,17 @@
  */
 @Description(name = "date_format", value = "_FUNC_(date/timestamp/string, fmt) - converts a date/timestamp/string "
     + "to a value of string in the format specified by the date format fmt.",
-    extended = "Supported formats are SimpleDateFormat formats - "
-        + "https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html. "
+    extended = "Supported formats are DateTimeFormatter formats - "
+        + "https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html. "

Review comment:
       @zabetak I think, if these 2 are the only difference, then we can go ahead with DateTimeFormatter. What is your opinion?
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r657244506



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -74,30 +72,28 @@ public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumen
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     checkArgGroups(arguments, 0, tsInputTypes, STRING_GROUP, DATE_GROUP);
-    checkArgGroups(arguments, 0, dtInputTypes, STRING_GROUP, DATE_GROUP);
-
     checkArgGroups(arguments, 1, tsInputTypes, STRING_GROUP);
 
     obtainTimestampConverter(arguments, 0, tsInputTypes, tsConverters);
-    obtainDateConverter(arguments, 0, dtInputTypes, dtConverters);
 
     if (arguments[1] instanceof ConstantObjectInspector) {
       String fmtStr = getConstantStringValue(arguments, 1);
       if (fmtStr != null) {
         try {
-          formatter = new SimpleDateFormat(fmtStr);
-          formatter.setCalendar(DateTimeMath.getTimeZonedProlepticGregorianCalendar());
+          if (timeZone == null) {

Review comment:
       done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sankarh commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
sankarh commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r658025667



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -106,25 +102,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
       return null;
     }
 
-    ZoneId id = (SessionState.get() == null) ? new HiveConf().getLocalTimeZone() : SessionState.get().getConf()
-        .getLocalTimeZone();
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
-      Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
-      if (d == null) {
-        return null;
-      }
-      ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
+      return null;

Review comment:
       Why DateConverter is removed? Isn't it needed to convert input like "2021-06-24" or does it work with TimestampCoverter?

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -17,31 +17,30 @@
  */
 package org.apache.hadoop.hive.ql.udf.generic;
 
-import static org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.PrimitiveGrouping.DATE_GROUP;
-import static org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.PrimitiveGrouping.STRING_GROUP;
-
-import java.text.SimpleDateFormat;
-import java.time.Instant;
-import java.time.LocalDateTime;
-import java.time.ZoneId;
-import java.time.ZoneOffset;
-
-import org.apache.hadoop.hive.common.type.Date;
 import org.apache.hadoop.hive.common.type.Timestamp;
+import org.apache.hadoop.hive.common.type.TimestampTZUtil;

Review comment:
       Unused class. Can be removed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656846147



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       +1 to @sankarh comments. Existing test cases are fine. Could please add some more time zone based test cases.  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656843576



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));

Review comment:
       @guptanikhil007 @sankarh  We don't have to convert timeZone to UTC and UTC to timeZone. Since entire computation is done on localDateTime which is timezone less. So we just need to add the Time-zone to localDateTime.
   
   We can do that as follows - 
   
   formatter.format(ZonedDateTime.ofInstant(Instant.ofEpochSecond(ts.toEpochSecond(), ts.getNanos()), ZoneOffset.UTC)
           .withZoneSameLocal(timeZone))
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r658476821



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       All the timestamps are considered to be in local time zone in both Hive 1.2 and Hive master (patched).
   Both Hive 1.2 and 3.1 have similar results for the date_format function.
   E.g.: 
   Hive 1.2.1 (with timezone set to Asia/Bangkok)
   ```
   0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select * from test_tbl_orc;
   +------------------------------+--------------------------+--+
   | test_tbl_orc.clusterversion  |    test_tbl_orc.col2     |
   +------------------------------+--------------------------+--+
   | Hive 4.0                     | 1800-01-14 01:01:10.123  |
   +------------------------------+--------------------------+--+
   1 row selected (0.372 seconds)
   0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> desc test_tbl_orc;
   +-----------------+------------+----------+--+
   |    col_name     | data_type  | comment  |
   +-----------------+------------+----------+--+
   | clusterversion  | string     |          |
   | col2            | timestamp  |          |
   +-----------------+------------+----------+--+
   2 rows selected (0.65 seconds)
   0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select date_format(col2, "yyyy-MM-dd HH:mm:ss.SSS z") from test_tbl_orc;
   +------------------------------+--+
   |             _c0              |
   +------------------------------+--+
   | 1800-01-14 01:01:10.123 ICT  |
   +------------------------------+--+
   ```
   Hive master (using MiniHS2 and same orc file from Hive 1.2)
   ```
   0: jdbc:hive2://localhost:10000/> select * from test_tbl_orc;
   +------------------------------+--------------------------+
   | test_tbl_orc.clusterversion  |    test_tbl_orc.col2     |
   +------------------------------+--------------------------+
   | Hive 4.0                     | 1800-01-14 01:01:10.123  |
   +------------------------------+--------------------------+
   1 row selected (0.23 seconds)
   0: jdbc:hive2://localhost:10000/> desc test_tbl_orc;
   +-----------------+------------+----------+
   |    col_name     | data_type  | comment  |
   +-----------------+------------+----------+
   | clusterversion  | string     |          |
   | col2            | timestamp  |          |
   +-----------------+------------+----------+
   2 rows selected (0.24 seconds)
   0: jdbc:hive2://localhost:10000/> set hive.local.time.zone=Asia/Bangkok;
   No rows affected (0.102 seconds)
   0: jdbc:hive2://localhost:10000/> select date_format(col2, "yyyy-MM-dd HH:mm:ss.SSS z") from test_tbl_orc;
   +------------------------------+
   |             _c0              |
   +------------------------------+
   | 1800-01-14 01:01:10.123 ICT  |
   +------------------------------+
   1 row selected (0.261 seconds)
   0: jdbc:hive2://localhost:10000/> set hive.local.time.zone=LOCAL;
   No rows affected (0.036 seconds)
   0: jdbc:hive2://localhost:10000/> select date_format(col2, "yyyy-MM-dd HH:mm:ss.SSS z") from test_tbl_orc;
   +------------------------------+
   |             _c0              |
   +------------------------------+
   | 1800-01-14 01:01:10.123 PST  |
   +------------------------------+
   1 row selected (0.492 seconds)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sankarh merged pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
sankarh merged pull request #2409:
URL: https://github.com/apache/hive/pull/2409


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655101432



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,8 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900 in another time zone
+set hive.local.time.zone=Asia/Bangkok;

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: [WIP] HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r654873185



##########
File path: ql/src/test/results/clientpositive/llap/udf_date_format.q.out
##########
@@ -149,13 +149,13 @@ POSTHOOK: Input: _dummy_database@_dummy_table
 10	30	45	10 AM	08	123	123	NULL
 PREHOOK: query: select
 date_format('2015-04-08', ''),
-date_format('2015-04-08', 'Q')

Review comment:
       `Q` is a valid format identifier in DateTimeFormatter => Quarter of the year




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655100775



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));

Review comment:
       Conversion is required. I have tested without conversion and we get wrong results




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r654888583



##########
File path: ql/src/test/results/clientpositive/llap/udf_date_format.q.out
##########
@@ -8,7 +8,7 @@ PREHOOK: type: DESCFUNCTION
 POSTHOOK: query: DESC FUNCTION EXTENDED date_format
 POSTHOOK: type: DESCFUNCTION
 date_format(date/timestamp/string, fmt) - converts a date/timestamp/string to a value of string in the format specified by the date format fmt.
-Supported formats are SimpleDateFormat formats - https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html. Second argument fmt should be constant.
+Supported formats are DateTimeFormatter formats - https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html. Second argument fmt should be constant.

Review comment:
       Do these changes in "ql/src/test/queries/clientpositive/udf_date_format.q " as well




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655101320



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655134815



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));
+    Instant instant = Instant.ofEpochSecond(ts2.toEpochSecond(), ts2.getNanos());
+    ZonedDateTime zonedDateTime = ZonedDateTime.ofInstant(instant, ZoneOffset.UTC);
+    String res = formatter.format(zonedDateTime);

Review comment:
       The timezone gets converted for some specific locale's in case we don't do this conversion:
   PDT -> PT
   CST -> CT
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656466237



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -51,18 +50,17 @@
  */
 @Description(name = "date_format", value = "_FUNC_(date/timestamp/string, fmt) - converts a date/timestamp/string "
     + "to a value of string in the format specified by the date format fmt.",
-    extended = "Supported formats are SimpleDateFormat formats - "
-        + "https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html. "
+    extended = "Supported formats are DateTimeFormatter formats - "
+        + "https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html. "

Review comment:
       I have verified that all the previous test cases work except following
   1. `S` gives fraction-of-second instead of milliseconds. (for milliseconds you need `SSS` instead of `S`)
   2. `u` gives year instead of Day number of week (1 = Monday, ..., 7 = Sunday). (Use `e` in newer format) 
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sankarh commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
sankarh commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656738701



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -74,30 +72,28 @@ public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumen
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     checkArgGroups(arguments, 0, tsInputTypes, STRING_GROUP, DATE_GROUP);
-    checkArgGroups(arguments, 0, dtInputTypes, STRING_GROUP, DATE_GROUP);
-
     checkArgGroups(arguments, 1, tsInputTypes, STRING_GROUP);
 
     obtainTimestampConverter(arguments, 0, tsInputTypes, tsConverters);
-    obtainDateConverter(arguments, 0, dtInputTypes, dtConverters);
 
     if (arguments[1] instanceof ConstantObjectInspector) {
       String fmtStr = getConstantStringValue(arguments, 1);
       if (fmtStr != null) {
         try {
-          formatter = new SimpleDateFormat(fmtStr);
-          formatter.setCalendar(DateTimeMath.getTimeZonedProlepticGregorianCalendar());
+          if (timeZone == null) {

Review comment:
       Remove unused method DateTimeMath.getTimeZonedProlepticGregorianCalendar().




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r657244719



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -85,21 +87,31 @@ public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumen
       String fmtStr = getConstantStringValue(arguments, 1);
       if (fmtStr != null) {
         try {
-          formatter = new SimpleDateFormat(fmtStr);
-          formatter.setCalendar(DateTimeMath.getTimeZonedProlepticGregorianCalendar());
+          if (timeZone == null) {
+            timeZone = SessionState.get() == null ? new HiveConf().getLocalTimeZone() : SessionState.get().getConf()

Review comment:
       done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on pull request #2409:
URL: https://github.com/apache/hive/pull/2409#issuecomment-868785976


   Rebased on latest master


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: [WIP] HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r654873108



##########
File path: ql/src/test/results/clientpositive/llap/udf_date_format.q.out
##########
@@ -92,7 +92,7 @@ date_format(cast(null as string), 'dd')
 POSTHOOK: type: QUERY
 POSTHOOK: Input: _dummy_database@_dummy_table
 #### A masked pattern was here ####
-10	30	45	09 PM	08	123	08	08	NULL
+10	30	45	09 PM	08	1	08	08	NULL

Review comment:
       There is a change here as `SSS` means milliseconds in DateTimeFormatter  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] zabetak commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
zabetak commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656307022



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -51,18 +50,17 @@
  */
 @Description(name = "date_format", value = "_FUNC_(date/timestamp/string, fmt) - converts a date/timestamp/string "
     + "to a value of string in the format specified by the date format fmt.",
-    extended = "Supported formats are SimpleDateFormat formats - "
-        + "https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html. "
+    extended = "Supported formats are DateTimeFormatter formats - "
+        + "https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html. "

Review comment:
       This could be a breaking change. Did you verify that all the patterns supported in `SimpleDateFormat` are provided also by `DateTimeFormatter`?

##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       Do we have tests where the input to `date_format` is a timestamp and not a string literal. In that case what should be the result of `date_format`? 
   
   Should it change when the local timezone changes or it should always be the same? I am asking cause if I remember well the data type that is used when we store timestamps in tables is `TIMESTAMP WITHOUT TIMEZONE` so in principle equivalent to `LocalDateTime`. From a Java perspective (using `DateTimeFormatter`) requesting formatting with timezone from a timestamp without timezone would be an error. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655886832



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       Once this patch is merged I will update the Hive wiki as well




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655101499



##########
File path: ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFDateFormat.java
##########
@@ -176,7 +176,6 @@ public void testWrongFmt() throws HiveException {
 

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r654886574



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -85,21 +87,31 @@ public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumen
       String fmtStr = getConstantStringValue(arguments, 1);
       if (fmtStr != null) {
         try {
-          formatter = new SimpleDateFormat(fmtStr);
-          formatter.setCalendar(DateTimeMath.getTimeZonedProlepticGregorianCalendar());
+          if (timeZone == null) {
+            timeZone = SessionState.get() == null ? new HiveConf().getLocalTimeZone() : SessionState.get().getConf()

Review comment:
       Combine redundant code at line 121.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));

Review comment:
       No need to convert first convert timeZone to UTC and UTC to timeZone. Timestamp class internally use localDateTime which is timezone less and only depict the timestamp.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);

Review comment:
       You can also remove this line because it is already taken care as part of timestamp.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));
+    Instant instant = Instant.ofEpochSecond(ts2.toEpochSecond(), ts2.getNanos());
+    ZonedDateTime zonedDateTime = ZonedDateTime.ofInstant(instant, ZoneOffset.UTC);
+    String res = formatter.format(zonedDateTime);

Review comment:
       Instead use ts.format(formatter)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r654876567



##########
File path: ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFDateFormat.java
##########
@@ -176,7 +176,6 @@ public void testWrongFmt() throws HiveException {
 

Review comment:
       Add test for date older then 1900 with timezone.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655100944



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));
+    Instant instant = Instant.ofEpochSecond(ts2.toEpochSecond(), ts2.getNanos());
+    ZonedDateTime zonedDateTime = ZonedDateTime.ofInstant(instant, ZoneOffset.UTC);
+    String res = formatter.format(zonedDateTime);

Review comment:
       Wrong results without conversion




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on pull request #2409:
URL: https://github.com/apache/hive/pull/2409#issuecomment-866635974


   @guptanikhil007 i have added few comments on all ready resolved conversation. Please have a look.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r658040342



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -106,25 +102,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
       return null;
     }
 
-    ZoneId id = (SessionState.get() == null) ? new HiveConf().getLocalTimeZone() : SessionState.get().getConf()
-        .getLocalTimeZone();
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
-      Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
-      if (d == null) {
-        return null;
-      }
-      ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
+      return null;

Review comment:
       It works with Timestamp converter

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -17,31 +17,30 @@
  */
 package org.apache.hadoop.hive.ql.udf.generic;
 
-import static org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.PrimitiveGrouping.DATE_GROUP;
-import static org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.PrimitiveGrouping.STRING_GROUP;
-
-import java.text.SimpleDateFormat;
-import java.time.Instant;
-import java.time.LocalDateTime;
-import java.time.ZoneId;
-import java.time.ZoneOffset;
-
-import org.apache.hadoop.hive.common.type.Date;
 import org.apache.hadoop.hive.common.type.Timestamp;
+import org.apache.hadoop.hive.common.type.TimestampTZUtil;

Review comment:
       ok
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656856299



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -74,30 +72,28 @@ public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumen
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     checkArgGroups(arguments, 0, tsInputTypes, STRING_GROUP, DATE_GROUP);
-    checkArgGroups(arguments, 0, dtInputTypes, STRING_GROUP, DATE_GROUP);
-
     checkArgGroups(arguments, 1, tsInputTypes, STRING_GROUP);
 
     obtainTimestampConverter(arguments, 0, tsInputTypes, tsConverters);
-    obtainDateConverter(arguments, 0, dtInputTypes, dtConverters);
 
     if (arguments[1] instanceof ConstantObjectInspector) {
       String fmtStr = getConstantStringValue(arguments, 1);
       if (fmtStr != null) {
         try {
-          formatter = new SimpleDateFormat(fmtStr);
-          formatter.setCalendar(DateTimeMath.getTimeZonedProlepticGregorianCalendar());
+          if (timeZone == null) {
+            timeZone = SessionState.get() == null ? new HiveConf().getLocalTimeZone() : SessionState.get().getConf()
+                .getLocalTimeZone();
+          }
+          formatter = DateTimeFormatter.ofPattern(fmtStr).withZone(timeZone);
         } catch (IllegalArgumentException e) {

Review comment:
       .withzone(timeZone) not required as we already converting the incoming argument to time zone provided in "hive.local.time.zone". Reconverting the timestamp value from same zone doesn't make any sense. 

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));

Review comment:
       Not required




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on pull request #2409:
URL: https://github.com/apache/hive/pull/2409#issuecomment-864497164


   @zabetak @jcamachor @sankarh Requires review


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656845083



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);

Review comment:
       Also remove unused methods - 
   
   Date.toEpochMilli(ZoneId id)
   timestamp.toEpochMilli(ZoneId id)
   timestamp.ofEpochMilli(long epochMilli, ZoneId id)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655134815



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));
+    Instant instant = Instant.ofEpochSecond(ts2.toEpochSecond(), ts2.getNanos());
+    ZonedDateTime zonedDateTime = ZonedDateTime.ofInstant(instant, ZoneOffset.UTC);
+    String res = formatter.format(zonedDateTime);

Review comment:
       The timezone gets converted for some specific locale's in case we don't do this conversion:
   PDT -> PT
   CST -> CT
   

##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       All the existing tests with SimpleDateFormat Formatter is passing except the milliseconds change which I have mentioned in my comment.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));

Review comment:
       done

##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       Once this patch is merged I will update the Hive wiki as well




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656852322



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -111,17 +123,18 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
       Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
       if (d == null) {
         return null;
       }
       ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
     }
-
-
-    date.setTime(ts.toEpochMilli(id));
-    String res = formatter.format(date);
+    Timestamp ts2 = TimestampTZUtil.convertTimestampToZone(ts, timeZone, ZoneId.of("UTC"));
+    Instant instant = Instant.ofEpochSecond(ts2.toEpochSecond(), ts2.getNanos());
+    ZonedDateTime zonedDateTime = ZonedDateTime.ofInstant(instant, ZoneOffset.UTC);
+    String res = formatter.format(zonedDateTime);

Review comment:
       You can create a new function like formatWithTimeZone(formatter, ZoneId) and add timeZone value to localDateTime.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: [WIP] HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r654873108



##########
File path: ql/src/test/results/clientpositive/llap/udf_date_format.q.out
##########
@@ -92,7 +92,7 @@ date_format(cast(null as string), 'dd')
 POSTHOOK: type: QUERY
 POSTHOOK: Input: _dummy_database@_dummy_table
 #### A masked pattern was here ####
-10	30	45	09 PM	08	123	08	08	NULL
+10	30	45	09 PM	08	1	08	08	NULL

Review comment:
       There is a change here as `SSS` means milli seconds in DateTimeFormatter  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] sankarh commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
sankarh commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656739080



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -106,25 +102,22 @@ public Object evaluate(DeferredObject[] arguments) throws HiveException {
       return null;
     }
 
-    ZoneId id = (SessionState.get() == null) ? new HiveConf().getLocalTimeZone() : SessionState.get().getConf()
-        .getLocalTimeZone();
     // the function should support both short date and full timestamp format
     // time part of the timestamp should not be skipped
     Timestamp ts = getTimestampValue(arguments, 0, tsConverters);
+
     if (ts == null) {
-      Date d = getDateValue(arguments, 0, dtInputTypes, dtConverters);
-      if (d == null) {
-        return null;
-      }
-      ts = Timestamp.ofEpochMilli(d.toEpochMilli(id), id);
+      return null;
     }
 
+    // We are doing an extra timestamp conversion from timeZone to UTC and then UTC to timeZone to maintain the
+    // same zone format semantics as that of SimpleDateFormat for some specific locale. E.g. PST/PDT, CST
+    // If this conversion is not there PDT -> PT and CST -> CT

Review comment:
       Reuse "ts" itself instead of new ts2 object.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656466237



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -51,18 +50,17 @@
  */
 @Description(name = "date_format", value = "_FUNC_(date/timestamp/string, fmt) - converts a date/timestamp/string "
     + "to a value of string in the format specified by the date format fmt.",
-    extended = "Supported formats are SimpleDateFormat formats - "
-        + "https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html. "
+    extended = "Supported formats are DateTimeFormatter formats - "
+        + "https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html. "

Review comment:
       I have verified that all the previous test cases work except following
   1. `S` gives fraction-of-second instead of milliseconds. (for milliseconds you need `SSS` instead of `S`)
   2. `u` gives year instead of Day number of week (1 = Monday, ..., 7 = Sunday). (Use e in newer format) 
   

##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       We have tests where we use cast function to get Timestamp as input data type.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656466285



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       We have tests where we use cast function to get Timestamp  and Date as input data type.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r658073147



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r654876476



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDateFormat.java
##########
@@ -85,21 +87,31 @@ public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumen
       String fmtStr = getConstantStringValue(arguments, 1);
       if (fmtStr != null) {
         try {
-          formatter = new SimpleDateFormat(fmtStr);
-          formatter.setCalendar(DateTimeMath.getTimeZonedProlepticGregorianCalendar());
+          if (timeZone == null) {
+            timeZone = SessionState.get() == null ? new HiveConf().getLocalTimeZone() : SessionState.get().getConf()
+                .getLocalTimeZone();
+          }
+          formatter = DateTimeFormatter.ofPattern(fmtStr).withZone(timeZone);
         } catch (IllegalArgumentException e) {
           // ignore
         }
       }
     } else {
-      throw new UDFArgumentTypeException(1, getFuncName() + " only takes constant as "
-          + getArgOrder(1) + " argument");
+      throw new UDFArgumentTypeException(1, getFuncName() + " only takes constant as " + getArgOrder(1) + " argument");
     }
 
     ObjectInspector outputOI = PrimitiveObjectInspectorFactory.writableStringObjectInspector;
     return outputOI;
   }
 
+  @Override

Review comment:
       This method is not required. Because idea is to first check the session for config "hive.local.time.zone" and then fall back to cluster default.

##########
File path: ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFDateFormat.java
##########
@@ -176,7 +176,6 @@ public void testWrongFmt() throws HiveException {
 

Review comment:
       Add test for date older then 1900 with timestamp.

##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,8 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900 in another time zone
+set hive.local.time.zone=Asia/Bangkok;

Review comment:
       Add multiple query with different time zone.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] ashish-kumar-sharma commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
ashish-kumar-sharma commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r656848281



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       +1 for @zabetak comment. For that we need to add the timeZone  to localDateTime before calling the formatter.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] guptanikhil007 commented on a change in pull request #2409: HIVE-25268: Restore the functionality of date_format UDF

Posted by GitBox <gi...@apache.org>.
guptanikhil007 commented on a change in pull request #2409:
URL: https://github.com/apache/hive/pull/2409#discussion_r655870639



##########
File path: ql/src/test/queries/clientpositive/udf_date_format.q
##########
@@ -78,3 +78,16 @@ select date_format("2015-04-08 10:30:45","yyyy-MM-dd HH:mm:ss.SSS z");
 --julian date
 set hive.local.time.zone=UTC;
 select date_format("1001-01-05","dd---MM--yyyy");
+
+--dates prior to 1900
+set hive.local.time.zone=Asia/Bangkok;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Europe/Berlin;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+select date_format('1800-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');
+
+set hive.local.time.zone=Africa/Johannesburg;
+select date_format('1400-01-14 01:01:10.123', 'yyyy-MM-dd HH:mm:ss.SSS z');

Review comment:
       All the existing tests with SimpleDateFormat Formatter is passing except the milliseconds change which I have mentioned in my comment.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org