You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/02/26 22:46:18 UTC

[GitHub] [iceberg] edgarRd commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

edgarRd commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r583966003



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -172,15 +171,24 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
     return hiveDecimalWritable.getHiveDecimal().bigDecimalValue().setScale(hiveDecimalWritable.scale());
   }
 
+  // Hive uses `java.sql.Date.valueOf(lit.toString());` to convert a literal to Date
+  // Which uses `java.util.Date()` internally to create the object and that uses the TimeZone.getDefaultRef()
+  // To get back the expected date we have to use the LocalDate which gets rid of the TimeZone misery as it uses
+  // the year/month/day to generate the object
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());
   }
 
+  // Hive uses `java.sql.Timestamp.valueOf(lit.toString());` to convert a literal to Timestamp
+  // Which again uses `java.util.Date()` internally to create the object which uses the TimeZone.getDefaultRef()
+  // To get back the expected timestamp we have to use the LocalDateTime which gets rid of the TimeZone misery
+  // as it uses the year/month/day/hour/min/sec/nanos to generate the object
   private static int daysFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.daysFromInstant(timestamp.toInstant());
+    return DateTimeUtil.daysFromDate(timestamp.toLocalDateTime().toLocalDate());
   }
 
+  // We have to use the LocalDateTime to get the micros. See the comment above.
   private static long microsFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.microsFromInstant(timestamp.toInstant());
+    return DateTimeUtil.microsFromTimestamp(timestamp.toLocalDateTime());

Review comment:
       @pvary This change breaks test:
   ```
   ./gradlew clean :iceberg-mr:test --tests org.apache.iceberg.mr.hive.TestHiveIcebergFilterFactory
   ...
   > Task :iceberg-mr:test FAILED
   
   org.apache.iceberg.mr.hive.TestHiveIcebergFilterFactory > testTimestampType FAILED
       java.lang.AssertionError: expected:<1349154977123456> but was:<1349129777123456>
           at org.junit.Assert.fail(Assert.java:88)
           at org.junit.Assert.failNotEquals(Assert.java:834)
           at org.junit.Assert.assertEquals(Assert.java:118)
           at org.junit.Assert.assertEquals(Assert.java:144)
           at org.apache.iceberg.mr.hive.TestHiveIcebergFilterFactory.assertPredicatesMatch(TestHiveIcebergFilterFactory.java:268)
           at org.apache.iceberg.mr.hive.TestHiveIcebergFilterFactory.testTimestampType(TestHiveIcebergFilterFactory.java:248)
   
   16 tests completed, 1 failed
   ```
   When run in non-UTC environments. I assume the test may need to change to adjust to the changes being made in https://github.com/apache/iceberg/pull/2278 to handle predicate pushdown for Timestamp.withZone().
   
   I'm surprised this is not caught by the CI checks, but maybe the CI runs in UTC - is there a way that we can run the tests in a few additional Timezones to validate?
   
   Thanks!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org