Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/02/24 02:21:22 UTC

[GitHub] [iceberg] shardulm94 commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

shardulm94 commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r581566575



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -563,6 +566,61 @@ public void testStructOfStructsInTable() throws IOException {
     }
   }
 
+  @Test
+  public void testDateQuery() throws IOException {
+    Schema dateSchema = new Schema(optional(1, "d_date", Types.DateType.get()));
+
+    List<Record> records = TestHelper.RecordsBuilder.newInstance(dateSchema)
+        .add(LocalDate.of(2020, 1, 21))
+        .add(LocalDate.of(2020, 1, 24))
+        .build();
+
+    testTables.createTable(shell, "date_test", dateSchema, fileFormat, records);
+
+    List<Object[]> result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date in ('2020-01-21', '2020-01-22')");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date > '2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-24", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-20'");
+    Assert.assertEquals(0, result.size());
+  }
+
+  @Test
+  public void testTimestampQuery() throws IOException {

Review comment:
       For me, this test case fails whether or not I keep the changes in `HiveIcebergFilterFactory`. It seems the test depends on the timezone the tests run in: I was only able to make it pass in the UTC timezone. Can we run this test in multiple timezones? We have previously uncovered issues once we stopped assuming that the code runs in UTC.
   https://github.com/apache/iceberg/blob/fab4a5f2db140fdb132205e78934a145e646758b/orc/src/test/java/org/apache/iceberg/orc/TestExpressionToSearchArgument.java#L117
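
   A minimal sketch of one way to run the same check under several JVM default timezones. `TimeZoneRunner` and `withDefaultTimeZone` are hypothetical names that are not part of the PR or of Iceberg, and the body passed in is only a stand-in for the real Hive query assertions:

```java
import java.sql.Date;
import java.util.Arrays;
import java.util.List;
import java.util.TimeZone;
import java.util.function.Supplier;

public class TimeZoneRunner {
  // Hypothetical helper: evaluate `body` with the JVM default zone set to `zoneId`,
  // then restore the previous default even if `body` throws.
  static <T> T withDefaultTimeZone(String zoneId, Supplier<T> body) {
    TimeZone previous = TimeZone.getDefault();
    TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
    try {
      return body.get();
    } finally {
      TimeZone.setDefault(previous);
    }
  }

  public static void main(String[] args) {
    // Zones west of, at, and east of UTC (an arbitrary choice for illustration).
    List<String> zones = Arrays.asList("America/Los_Angeles", "UTC", "Asia/Kolkata");
    for (String zone : zones) {
      // Stand-in for the real test body: parse and render a date in the current default zone.
      String rendered = withDefaultTimeZone(zone, () -> Date.valueOf("2020-01-21").toString());
      System.out.println(zone + " -> " + rendered);
    }
  }
}
```

   This only helps if the code under test reads the default zone when it runs; anything that cached the zone before the switch would not see the change.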

##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -173,11 +172,11 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
   }
 
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());

Review comment:
       I am having a hard time understanding what behavioral impact this change has. Can you elaborate a bit? Timestamp issues always tend to be tricky for me to get.
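
   A rough sketch of where the two versions might diverge, assuming `Date` here is `java.sql.Date`. The two helper methods are stand-ins written with plain `java.time`, not Iceberg's `DateTimeUtil`, and the choice of `Asia/Kolkata` is arbitrary:

```java
import java.sql.Date;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;
import java.util.TimeZone;

public class DateDaysDemo {
  private static final LocalDate EPOCH_DAY = LocalDate.ofEpochDay(0);

  // Stand-in for the removed line: treat the Date's epoch millis as a UTC instant.
  static int daysViaInstant(Date date) {
    LocalDate utcDate = Instant.ofEpochMilli(date.getTime()).atOffset(ZoneOffset.UTC).toLocalDate();
    return (int) ChronoUnit.DAYS.between(EPOCH_DAY, utcDate);
  }

  // Stand-in for the added line: use the calendar date as seen in the JVM default zone.
  static int daysViaLocalDate(Date date) {
    return (int) ChronoUnit.DAYS.between(EPOCH_DAY, date.toLocalDate());
  }

  public static void main(String[] args) {
    TimeZone original = TimeZone.getDefault();
    try {
      // A zone east of UTC: local midnight falls on the previous day in UTC.
      TimeZone.setDefault(TimeZone.getTimeZone("Asia/Kolkata"));
      Date date = Date.valueOf("2020-01-21"); // midnight in the default zone
      // Here the instant-based value is one day less than the local-date value.
      System.out.println("via instant:    " + daysViaInstant(date));
      System.out.println("via local date: " + daysViaLocalDate(date));
    } finally {
      TimeZone.setDefault(original);
    }
  }
}
```

   In UTC, and in zones west of UTC, both methods return the same day for a `Date` built this way, which could be why the difference is easy to miss.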

##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -563,6 +566,61 @@ public void testStructOfStructsInTable() throws IOException {
     }
   }
 
+  @Test
+  public void testDateQuery() throws IOException {
+    Schema dateSchema = new Schema(optional(1, "d_date", Types.DateType.get()));
+
+    List<Record> records = TestHelper.RecordsBuilder.newInstance(dateSchema)
+        .add(LocalDate.of(2020, 1, 21))
+        .add(LocalDate.of(2020, 1, 24))
+        .build();
+
+    testTables.createTable(shell, "date_test", dateSchema, fileFormat, records);
+
+    List<Object[]> result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date in ('2020-01-21', '2020-01-22')");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date > '2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-24", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-20'");
+    Assert.assertEquals(0, result.size());
+  }
+
+  @Test
+  public void testTimestampQuery() throws IOException {
+    Schema timestampSchema = new Schema(optional(1, "d_ts", Types.TimestampType.withoutZone()));

Review comment:
       Should we also include tests for `Types.TimestampType.withZone()` along the same lines?



