You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/02/19 17:19:10 UTC

[GitHub] [iceberg] pvary opened a new pull request #2254: Hive: Fix predicate pushdown for Date

pvary opened a new pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254


   When Hive query `SELECT * from date_test WHERE d_date='1998-02-19'` contains a date literal as a predicated Iceberg filter fails with the following exception:
   ```
   Caused by: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.IllegalStateException: Not an instance of java.lang.Integer: 2130-01-20
           at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:499)
           at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:307)
           at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:878)
           at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:559)
           at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:551)
           at org.apache.iceberg.mr.hive.TestHiveShell.executeStatement(TestHiveShell.java:143)
           ... 62 more
   Caused by: java.io.IOException: java.lang.IllegalStateException: Not an instance of java.lang.Integer: 2130-01-20
           at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
           at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
           at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
           at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2208)
           at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:494)
           ... 67 more
   Caused by: java.lang.IllegalStateException: Not an instance of java.lang.Integer: 2130-01-20
           at org.apache.iceberg.data.GenericRecord.get(GenericRecord.java:123)
           at org.apache.iceberg.Accessors$PositionAccessor.get(Accessors.java:71)
           at org.apache.iceberg.Accessors$PositionAccessor.get(Accessors.java:58)
           at org.apache.iceberg.expressions.BoundReference.eval(BoundReference.java:39)
           at org.apache.iceberg.expressions.Evaluator$EvalVisitor.eq(Evaluator.java:132)
           at org.apache.iceberg.expressions.Evaluator$EvalVisitor.eq(Evaluator.java:52)
           at org.apache.iceberg.expressions.ExpressionVisitors$BoundVisitor.predicate(ExpressionVisitors.java:249)
           at org.apache.iceberg.expressions.ExpressionVisitors.visitEvaluator(ExpressionVisitors.java:346)
           at org.apache.iceberg.expressions.Evaluator$EvalVisitor.eval(Evaluator.java:57)
           at org.apache.iceberg.expressions.Evaluator$EvalVisitor.access$100(Evaluator.java:52)
           at org.apache.iceberg.expressions.Evaluator.eval(Evaluator.java:49)
           at org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.lambda$applyResidualFiltering$0(IcebergInputFormat.java:288)
           at org.apache.iceberg.io.CloseableIterable$3.shouldKeep(CloseableIterable.java:82)
           at org.apache.iceberg.io.FilterIterator.advance(FilterIterator.java:67)
           at org.apache.iceberg.io.FilterIterator.hasNext(FilterIterator.java:50)
           at org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.nextKeyValue(IcebergInputFormat.java:202)
           at org.apache.iceberg.mr.mapred.MapredIcebergInputFormat$MapredIcebergRecordReader.next(MapredIcebergInputFormat.java:104)
           at org.apache.iceberg.mr.mapred.MapredIcebergInputFormat$MapredIcebergRecordReader.next(MapredIcebergInputFormat.java:81)
           at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:488)
           ... 71 more
   ```
   
   There are 2 issue:
   1. Literal conversion is problematic as Hive uses Date with local timezone, so we have to fix the conversion
   2. When comparing the Record with the literal `GenericRecord.get()` expect the value to be as an Integer, but it gets LocalDate instead.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#issuecomment-786703516


   @shardulm94, @marton-bod: Added the fix for timestamp.withZone in #2278 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r584042222



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -172,15 +171,24 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
     return hiveDecimalWritable.getHiveDecimal().bigDecimalValue().setScale(hiveDecimalWritable.scale());
   }
 
+  // Hive uses `java.sql.Date.valueOf(lit.toString());` to convert a literal to Date
+  // Which uses `java.util.Date()` internally to create the object and that uses the TimeZone.getDefaultRef()
+  // To get back the expected date we have to use the LocalDate which gets rid of the TimeZone misery as it uses
+  // the year/month/day to generate the object
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());
   }
 
+  // Hive uses `java.sql.Timestamp.valueOf(lit.toString());` to convert a literal to Timestamp
+  // Which again uses `java.util.Date()` internally to create the object which uses the TimeZone.getDefaultRef()
+  // To get back the expected timestamp we have to use the LocalDateTime which gets rid of the TimeZone misery
+  // as it uses the year/month/day/hour/min/sec/nanos to generate the object
   private static int daysFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.daysFromInstant(timestamp.toInstant());
+    return DateTimeUtil.daysFromDate(timestamp.toLocalDateTime().toLocalDate());
   }
 
+  // We have to use the LocalDateTime to get the micros. See the comment above.
   private static long microsFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.microsFromInstant(timestamp.toInstant());
+    return DateTimeUtil.microsFromTimestamp(timestamp.toLocalDateTime());

Review comment:
       Even more surprised to see that, since I run my tests in CET and they are running without problem. What timezone are you using? Are you using stock Hive?
   
   Thanks,
   Peter 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] marton-bod commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
marton-bod commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582891498



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.text.DateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.common.DynFields;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+  DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
+      .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  DynFields.StaticField<ThreadLocal<TimeZone>> localTimeZone = DynFields.builder()
+      .hiddenImpl(DateWritable.class, "LOCAL_TIMEZONE")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    return ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    // Magic to clean cached date format for Hive where the default timezone is used in the cached object
+    ThreadLocal<DateFormat> dynamicDateFormat = dateFormat.get();
+    if (dynamicDateFormat != null) {
+      dynamicDateFormat.remove();
+    }
+
+    // Magic to clean cached local timezone for Hive where the default timezone is stored in the cached object
+    ThreadLocal<TimeZone> dynamicTimeZone = localTimeZone.get();

Review comment:
       nit: maybe we can simplify these a bit
   `Optional.ofNullable(localTimeZone.get()).ifPresent(ThreadLocal::remove)`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r584082880



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -172,15 +171,24 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
     return hiveDecimalWritable.getHiveDecimal().bigDecimalValue().setScale(hiveDecimalWritable.scale());
   }
 
+  // Hive uses `java.sql.Date.valueOf(lit.toString());` to convert a literal to Date
+  // Which uses `java.util.Date()` internally to create the object and that uses the TimeZone.getDefaultRef()
+  // To get back the expected date we have to use the LocalDate which gets rid of the TimeZone misery as it uses
+  // the year/month/day to generate the object
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());
   }
 
+  // Hive uses `java.sql.Timestamp.valueOf(lit.toString());` to convert a literal to Timestamp
+  // Which again uses `java.util.Date()` internally to create the object which uses the TimeZone.getDefaultRef()
+  // To get back the expected timestamp we have to use the LocalDateTime which gets rid of the TimeZone misery
+  // as it uses the year/month/day/hour/min/sec/nanos to generate the object
   private static int daysFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.daysFromInstant(timestamp.toInstant());
+    return DateTimeUtil.daysFromDate(timestamp.toLocalDateTime().toLocalDate());
   }
 
+  // We have to use the LocalDateTime to get the micros. See the comment above.
   private static long microsFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.microsFromInstant(timestamp.toInstant());
+    return DateTimeUtil.microsFromTimestamp(timestamp.toLocalDateTime());

Review comment:
        Maybe intellij made fun of me, or I messed up with my settings. Will definitely check this out soon.
   
   Thanks!
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582877673



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used
+      Field formatterField = TimestampWritable.class.getDeclaredField("threadLocalDateFormat");
+      formatterField.setAccessible(true);
+      ThreadLocal<DateFormat> formatter = (ThreadLocal<DateFormat>) formatterField.get(null);
+      formatter.set(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));
+
+      Field localTimeZoneField = DateWritable.class.getDeclaredField("LOCAL_TIMEZONE");
+      localTimeZoneField.setAccessible(true);
+      ThreadLocal<TimeZone> localTimeZone = (ThreadLocal<TimeZone>) localTimeZoneField.get(null);
+      localTimeZone.set(TimeZone.getDefault());
+    } catch (NoSuchFieldException nse) {
+      // Hive2 has thread local cache we have to clean, so the new default timezone is used when formatting
+      // Hive3 we do not have this for Date and Timestamp
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();
+    this.testTables = HiveIcebergStorageHandlerTestUtils.testTables(shell, TestTables.TestTableType.HIVE_CATALOG, temp);
+
+    // Uses spark as an engine so we can detect if we unintentionally try to use any execution engines
+    HiveIcebergStorageHandlerTestUtils.init(shell, testTables, temp, "spark");
+  }
+
+  @After
+  public void after() {
+    shell.stop();
+    TimeZone.setDefault(originalTz);

Review comment:
       Then I think we agree that we should not try to revert the TZ.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582881345



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used
+      Field formatterField = TimestampWritable.class.getDeclaredField("threadLocalDateFormat");
+      formatterField.setAccessible(true);
+      ThreadLocal<DateFormat> formatter = (ThreadLocal<DateFormat>) formatterField.get(null);
+      formatter.set(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));
+
+      Field localTimeZoneField = DateWritable.class.getDeclaredField("LOCAL_TIMEZONE");
+      localTimeZoneField.setAccessible(true);
+      ThreadLocal<TimeZone> localTimeZone = (ThreadLocal<TimeZone>) localTimeZoneField.get(null);
+      localTimeZone.set(TimeZone.getDefault());
+    } catch (NoSuchFieldException nse) {
+      // Hive2 has thread local cache we have to clean, so the new default timezone is used when formatting
+      // Hive3 we do not have this for Date and Timestamp
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();
+    this.testTables = HiveIcebergStorageHandlerTestUtils.testTables(shell, TestTables.TestTableType.HIVE_CATALOG, temp);
+
+    // Uses spark as an engine so we can detect if we unintentionally try to use any execution engines
+    HiveIcebergStorageHandlerTestUtils.init(shell, testTables, temp, "spark");
+  }
+
+  @After
+  public void after() {
+    shell.stop();
+    TimeZone.setDefault(originalTz);

Review comment:
       Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582352610



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -173,11 +172,11 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
   }
 
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());

Review comment:
       Added extra comments to the methods, and in them I tried to describe how Hive parses literals.
   Hope the comments are help to clarify the changes. If not, feel free to ask. It would be good to have good enough comments in the code, so any newcomer could understand the hows and the whys.
   
   Thanks,
   Peter




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582875060



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used
+      Field formatterField = TimestampWritable.class.getDeclaredField("threadLocalDateFormat");
+      formatterField.setAccessible(true);
+      ThreadLocal<DateFormat> formatter = (ThreadLocal<DateFormat>) formatterField.get(null);
+      formatter.set(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));
+
+      Field localTimeZoneField = DateWritable.class.getDeclaredField("LOCAL_TIMEZONE");
+      localTimeZoneField.setAccessible(true);
+      ThreadLocal<TimeZone> localTimeZone = (ThreadLocal<TimeZone>) localTimeZoneField.get(null);
+      localTimeZone.set(TimeZone.getDefault());
+    } catch (NoSuchFieldException nse) {
+      // Hive2 has thread local cache we have to clean, so the new default timezone is used when formatting
+      // Hive3 we do not have this for Date and Timestamp
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();
+    this.testTables = HiveIcebergStorageHandlerTestUtils.testTables(shell, TestTables.TestTableType.HIVE_CATALOG, temp);
+
+    // Uses spark as an engine so we can detect if we unintentionally try to use any execution engines
+    HiveIcebergStorageHandlerTestUtils.init(shell, testTables, temp, "spark");
+  }
+
+  @After
+  public void after() {
+    shell.stop();
+    TimeZone.setDefault(originalTz);

Review comment:
       I think we should either reset both the ThreadLocals and default timezone or neither. If we just do one of them, further Hive tests dealing with timezones may see a different timezone inside Hive code because of an older ThreadLocal and the Iceberg code may use the current Java timezone causing false positives. Looking at the test results though, doesn't seem like its actually affecting anything.
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] edgarRd commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
edgarRd commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r583966003



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -172,15 +171,24 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
     return hiveDecimalWritable.getHiveDecimal().bigDecimalValue().setScale(hiveDecimalWritable.scale());
   }
 
+  // Hive uses `java.sql.Date.valueOf(lit.toString());` to convert a literal to Date
+  // Which uses `java.util.Date()` internally to create the object and that uses the TimeZone.getDefaultRef()
+  // To get back the expected date we have to use the LocalDate which gets rid of the TimeZone misery as it uses
+  // the year/month/day to generate the object
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());
   }
 
+  // Hive uses `java.sql.Timestamp.valueOf(lit.toString());` to convert a literal to Timestamp
+  // Which again uses `java.util.Date()` internally to create the object which uses the TimeZone.getDefaultRef()
+  // To get back the expected timestamp we have to use the LocalDateTime which gets rid of the TimeZone misery
+  // as it uses the year/month/day/hour/min/sec/nanos to generate the object
   private static int daysFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.daysFromInstant(timestamp.toInstant());
+    return DateTimeUtil.daysFromDate(timestamp.toLocalDateTime().toLocalDate());
   }
 
+  // We have to use the LocalDateTime to get the micros. See the comment above.
   private static long microsFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.microsFromInstant(timestamp.toInstant());
+    return DateTimeUtil.microsFromTimestamp(timestamp.toLocalDateTime());

Review comment:
       @pvary This change breaks test:
   ```
   ./gradlew clean :iceberg-mr:test --tests org.apache.iceberg.mr.hive.TestHiveIcebergFilterFactory
   ...
   > Task :iceberg-mr:test FAILED
   
   org.apache.iceberg.mr.hive.TestHiveIcebergFilterFactory > testTimestampType FAILED
       java.lang.AssertionError: expected:<1349154977123456> but was:<1349129777123456>
           at org.junit.Assert.fail(Assert.java:88)
           at org.junit.Assert.failNotEquals(Assert.java:834)
           at org.junit.Assert.assertEquals(Assert.java:118)
           at org.junit.Assert.assertEquals(Assert.java:144)
           at org.apache.iceberg.mr.hive.TestHiveIcebergFilterFactory.assertPredicatesMatch(TestHiveIcebergFilterFactory.java:268)
           at org.apache.iceberg.mr.hive.TestHiveIcebergFilterFactory.testTimestampType(TestHiveIcebergFilterFactory.java:248)
   
   16 tests completed, 1 failed
   ```
   When run in non-UTC environments. I assume the test may need to change to adjust to the changes being made in https://github.com/apache/iceberg/pull/2278 to handle predicate pushdown for Timestamp.withZone().
   
   I'm surprised this is not caught by the CI checks, but maybe the CI runs in UTC - is there a way that we can run the tests in a few additional Timezones to validate?
   
   Thanks!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582366163



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used
+      Field formatterField = TimestampWritable.class.getDeclaredField("threadLocalDateFormat");
+      formatterField.setAccessible(true);
+      ThreadLocal<DateFormat> formatter = (ThreadLocal<DateFormat>) formatterField.get(null);
+      formatter.set(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

Review comment:
       I think we can just use `ThreadLocal.remove()` here
   > Removes the current thread's value for this thread-local variable. If this thread-local variable is subsequently read by the current thread, its value will be reinitialized by invoking its initialValue method,
   
   Same for `LOCAL_TIMEZONE` ThreadLocal too

##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used
+      Field formatterField = TimestampWritable.class.getDeclaredField("threadLocalDateFormat");
+      formatterField.setAccessible(true);
+      ThreadLocal<DateFormat> formatter = (ThreadLocal<DateFormat>) formatterField.get(null);

Review comment:
       Most Iceberg code uses `org.apache.iceberg.common.Dyn*` classes to handle reflection. Can we use that instead?
   
   ```
   DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
           .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
           .defaultAlwaysNull() // it will return a StaticField which returns null instead of NoSuchFieldException on .get
           .buildStatic();
   ```
   
   And `dateFormat.get()` to get the ThreadLocal

##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(

Review comment:
       Nit: Remove local variable? We can directly return the list.

##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used
+      Field formatterField = TimestampWritable.class.getDeclaredField("threadLocalDateFormat");
+      formatterField.setAccessible(true);
+      ThreadLocal<DateFormat> formatter = (ThreadLocal<DateFormat>) formatterField.get(null);
+      formatter.set(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));
+
+      Field localTimeZoneField = DateWritable.class.getDeclaredField("LOCAL_TIMEZONE");
+      localTimeZoneField.setAccessible(true);
+      ThreadLocal<TimeZone> localTimeZone = (ThreadLocal<TimeZone>) localTimeZoneField.get(null);
+      localTimeZone.set(TimeZone.getDefault());
+    } catch (NoSuchFieldException nse) {
+      // Hive2 has thread local cache we have to clean, so the new default timezone is used when formatting
+      // Hive3 we do not have this for Date and Timestamp
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();
+    this.testTables = HiveIcebergStorageHandlerTestUtils.testTables(shell, TestTables.TestTableType.HIVE_CATALOG, temp);
+
+    // Uses spark as an engine so we can detect if we unintentionally try to use any execution engines
+    HiveIcebergStorageHandlerTestUtils.init(shell, testTables, temp, "spark");
+  }
+
+  @After
+  public void after() {
+    shell.stop();
+    TimeZone.setDefault(originalTz);

Review comment:
       Should we also be resetting the ThreadLocals here so that other tests which will run in the same JVM will not be affected?

##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used

Review comment:
       Nice find! I think this will also help us here https://github.com/apache/iceberg/blob/f356be79ddd178c8a8651723cddf40356651329d/data/src/test/java/org/apache/iceberg/data/TestLocalScan.java#L466




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582449036



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -173,11 +172,11 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
   }
 
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());

Review comment:
       Thanks for the comments! I understand now!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] edgarRd commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
edgarRd commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r584050196



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -172,15 +171,24 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
     return hiveDecimalWritable.getHiveDecimal().bigDecimalValue().setScale(hiveDecimalWritable.scale());
   }
 
+  // Hive uses `java.sql.Date.valueOf(lit.toString());` to convert a literal to Date
+  // Which uses `java.util.Date()` internally to create the object and that uses the TimeZone.getDefaultRef()
+  // To get back the expected date we have to use the LocalDate which gets rid of the TimeZone misery as it uses
+  // the year/month/day to generate the object
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());
   }
 
+  // Hive uses `java.sql.Timestamp.valueOf(lit.toString());` to convert a literal to Timestamp
+  // Which again uses `java.util.Date()` internally to create the object which uses the TimeZone.getDefaultRef()
+  // To get back the expected timestamp we have to use the LocalDateTime which gets rid of the TimeZone misery
+  // as it uses the year/month/day/hour/min/sec/nanos to generate the object
   private static int daysFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.daysFromInstant(timestamp.toInstant());
+    return DateTimeUtil.daysFromDate(timestamp.toLocalDateTime().toLocalDate());
   }
 
+  // We have to use the LocalDateTime to get the micros. See the comment above.
   private static long microsFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.microsFromInstant(timestamp.toInstant());
+    return DateTimeUtil.microsFromTimestamp(timestamp.toLocalDateTime());

Review comment:
       I'm in PST, if you replace https://github.com/apache/iceberg/blob/master/mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergFilterFactory.java#L240-L242 for:
   ```
   TimeZone defaultTz = TimeZone.getDefault();
       try {
         TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"));
         UnboundPredicate actual = (UnboundPredicate) HiveIcebergFilterFactory.generateFilterExpression(arg);
         assertPredicatesMatch(expected, actual);
       } finally {
         TimeZone.setDefault(defaultTz);
       }
   ```
   
   to set the TimeZone, you should be able to repro - conversely if I use `"UTC"` instead of `"America/Los_Angeles"` the test pass.
   
   I'm running the unit test out of the master branch, with:
   
   ```
   ./gradlew clean :iceberg-mr:test --tests org.apache.iceberg.mr.hive.TestHiveIcebergFilterFactory
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] marton-bod commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
marton-bod commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r580077066



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -563,6 +565,29 @@ public void testStructOfStructsInTable() throws IOException {
     }
   }
 
+  @Test
+  public void testDateQuery() throws IOException {
+    Schema dateSchema = new Schema(optional(1, "d_date", Types.DateType.get()));
+
+    List<Record> records = TestHelper.RecordsBuilder.newInstance(dateSchema)
+        .add(LocalDate.of(2020, 1, 21))
+        .add(LocalDate.of(2020, 1, 24))
+        .build();
+
+    testTables.createTable(shell, "date_test", dateSchema, fileFormat, records);
+
+    List<Object[]> result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date in ('2020-01-21', '2020-01-22')");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-20'");
+    Assert.assertEquals(0, result.size());

Review comment:
       Since we're adding this new test, would it make sense to add a case for less than/greater than too? 
   e.g. `WHERE d_date < '2020-01-23'"`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] edgarRd commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
edgarRd commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r584794312



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -172,15 +171,24 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
     return hiveDecimalWritable.getHiveDecimal().bigDecimalValue().setScale(hiveDecimalWritable.scale());
   }
 
+  // Hive uses `java.sql.Date.valueOf(lit.toString());` to convert a literal to Date
+  // Which uses `java.util.Date()` internally to create the object and that uses the TimeZone.getDefaultRef()
+  // To get back the expected date we have to use the LocalDate which gets rid of the TimeZone misery as it uses
+  // the year/month/day to generate the object
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());
   }
 
+  // Hive uses `java.sql.Timestamp.valueOf(lit.toString());` to convert a literal to Timestamp
+  // Which again uses `java.util.Date()` internally to create the object which uses the TimeZone.getDefaultRef()
+  // To get back the expected timestamp we have to use the LocalDateTime which gets rid of the TimeZone misery
+  // as it uses the year/month/day/hour/min/sec/nanos to generate the object
   private static int daysFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.daysFromInstant(timestamp.toInstant());
+    return DateTimeUtil.daysFromDate(timestamp.toLocalDateTime().toLocalDate());
   }
 
+  // We have to use the LocalDateTime to get the micros. See the comment above.
   private static long microsFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.microsFromInstant(timestamp.toInstant());
+    return DateTimeUtil.microsFromTimestamp(timestamp.toLocalDateTime());

Review comment:
       Thanks for the fix @pvary - test works now.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] marton-bod commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
marton-bod commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582891498



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.text.DateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.common.DynFields;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+  DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
+      .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  DynFields.StaticField<ThreadLocal<TimeZone>> localTimeZone = DynFields.builder()
+      .hiddenImpl(DateWritable.class, "LOCAL_TIMEZONE")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    return ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    // Magic to clean cached date format for Hive where the default timezone is used in the cached object
+    ThreadLocal<DateFormat> dynamicDateFormat = dateFormat.get();
+    if (dynamicDateFormat != null) {
+      dynamicDateFormat.remove();
+    }
+
+    // Magic to clean cached local timezone for Hive where the default timezone is stored in the cached object
+    ThreadLocal<TimeZone> dynamicTimeZone = localTimeZone.get();

Review comment:
       nit: maybe we can simplify these a bit
   `Optional.ofNullable(localTimeZone).ifPresent(ThreadLocal::remove)`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary merged pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary merged pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582882939



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -563,6 +566,61 @@ public void testStructOfStructsInTable() throws IOException {
     }
   }
 
+  @Test
+  public void testDateQuery() throws IOException {
+    Schema dateSchema = new Schema(optional(1, "d_date", Types.DateType.get()));
+
+    List<Record> records = TestHelper.RecordsBuilder.newInstance(dateSchema)
+        .add(LocalDate.of(2020, 1, 21))
+        .add(LocalDate.of(2020, 1, 24))
+        .build();
+
+    testTables.createTable(shell, "date_test", dateSchema, fileFormat, records);
+
+    List<Object[]> result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date in ('2020-01-21', '2020-01-22')");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date > '2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-24", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-20'");
+    Assert.assertEquals(0, result.size());
+  }
+
+  @Test
+  public void testTimestampQuery() throws IOException {
+    Schema timestampSchema = new Schema(optional(1, "d_ts", Types.TimestampType.withoutZone()));

Review comment:
       Ack!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] edgarRd commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
edgarRd commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r584050196



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -172,15 +171,24 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
     return hiveDecimalWritable.getHiveDecimal().bigDecimalValue().setScale(hiveDecimalWritable.scale());
   }
 
+  // Hive uses `java.sql.Date.valueOf(lit.toString());` to convert a literal to Date
+  // Which uses `java.util.Date()` internally to create the object and that uses the TimeZone.getDefaultRef()
+  // To get back the expected date we have to use the LocalDate which gets rid of the TimeZone misery as it uses
+  // the year/month/day to generate the object
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());
   }
 
+  // Hive uses `java.sql.Timestamp.valueOf(lit.toString());` to convert a literal to Timestamp
+  // Which again uses `java.util.Date()` internally to create the object which uses the TimeZone.getDefaultRef()
+  // To get back the expected timestamp we have to use the LocalDateTime which gets rid of the TimeZone misery
+  // as it uses the year/month/day/hour/min/sec/nanos to generate the object
   private static int daysFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.daysFromInstant(timestamp.toInstant());
+    return DateTimeUtil.daysFromDate(timestamp.toLocalDateTime().toLocalDate());
   }
 
+  // We have to use the LocalDateTime to get the micros. See the comment above.
   private static long microsFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.microsFromInstant(timestamp.toInstant());
+    return DateTimeUtil.microsFromTimestamp(timestamp.toLocalDateTime());

Review comment:
       I'm in PST, if you replace https://github.com/apache/iceberg/blob/master/mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergFilterFactory.java#L240-L242 for:
   ```
       TimeZone defaultTz = TimeZone.getDefault();
       try {
         TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"));
         UnboundPredicate actual = (UnboundPredicate) HiveIcebergFilterFactory.generateFilterExpression(arg);
         assertPredicatesMatch(expected, actual);
       } finally {
         TimeZone.setDefault(defaultTz);
       }
   ```
   
   to set the TimeZone, you should be able to repro - conversely if I use `"UTC"` instead of `"America/Los_Angeles"` the test pass.
   
   I'm running the unit test out of the master branch, with:
   
   ```
   ./gradlew clean :iceberg-mr:test --tests org.apache.iceberg.mr.hive.TestHiveIcebergFilterFactory
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582351371



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -563,6 +566,61 @@ public void testStructOfStructsInTable() throws IOException {
     }
   }
 
+  @Test
+  public void testDateQuery() throws IOException {
+    Schema dateSchema = new Schema(optional(1, "d_date", Types.DateType.get()));
+
+    List<Record> records = TestHelper.RecordsBuilder.newInstance(dateSchema)
+        .add(LocalDate.of(2020, 1, 21))
+        .add(LocalDate.of(2020, 1, 24))
+        .build();
+
+    testTables.createTable(shell, "date_test", dateSchema, fileFormat, records);
+
+    List<Object[]> result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date in ('2020-01-21', '2020-01-22')");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date > '2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-24", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-20'");
+    Assert.assertEquals(0, result.size());
+  }
+
+  @Test
+  public void testTimestampQuery() throws IOException {
+    Schema timestampSchema = new Schema(optional(1, "d_ts", Types.TimestampType.withoutZone()));

Review comment:
       Spent my day there, but no solution yet. I think it would merit a different PR.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582888144



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.text.DateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.common.DynFields;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+  DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
+      .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  DynFields.StaticField<ThreadLocal<TimeZone>> localTimeZone = DynFields.builder()
+      .hiddenImpl(DateWritable.class, "LOCAL_TIMEZONE")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    return ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    // Magic to clean cached date format for Hive where the default timezone is used in the cached object
+    ThreadLocal<DateFormat> dynamicDateFormat = dateFormat.get();
+    if (dynamicDateFormat != null) {
+      dynamicDateFormat.remove();
+    }
+
+    // Magic to clean cached local timezone for Hive where the default timezone is stored in the cached object
+    ThreadLocal<TimeZone> dynamicTimeZone = localTimeZone.get();
+    if (dynamicTimeZone != null) {
+      dynamicTimeZone.remove();
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();

Review comment:
       Can we use parameterized test in this case? We need to set the default timezone to the one defined by the parameter.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#issuecomment-782246914


   The manifest and residual evaluator expect the following java types for each data type https://github.com/apache/iceberg/blob/efc461862fc93637fe4c0127d69bfb777c87766e/api/src/main/java/org/apache/iceberg/types/Type.java#L29
   
   I think we will need to do something very similar to wrapping the record into an `InternalRecordWrapper` like here https://github.com/apache/iceberg/blob/efc461862fc93637fe4c0127d69bfb777c87766e/data/src/main/java/org/apache/iceberg/data/GenericReader.java#L86
   That should handle conversion between types used by `iceberg-data` v/s the ones used by the evaluators.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582748379



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used

Review comment:
       Yeah, it would be useful there as well. I would be happy to review it if you need help!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582411613



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used
+      Field formatterField = TimestampWritable.class.getDeclaredField("threadLocalDateFormat");
+      formatterField.setAccessible(true);
+      ThreadLocal<DateFormat> formatter = (ThreadLocal<DateFormat>) formatterField.get(null);
+      formatter.set(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));
+
+      Field localTimeZoneField = DateWritable.class.getDeclaredField("LOCAL_TIMEZONE");
+      localTimeZoneField.setAccessible(true);
+      ThreadLocal<TimeZone> localTimeZone = (ThreadLocal<TimeZone>) localTimeZoneField.get(null);
+      localTimeZone.set(TimeZone.getDefault());
+    } catch (NoSuchFieldException nse) {
+      // Hive2 has thread local cache we have to clean, so the new default timezone is used when formatting
+      // Hive3 we do not have this for Date and Timestamp
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();
+    this.testTables = HiveIcebergStorageHandlerTestUtils.testTables(shell, TestTables.TestTableType.HIVE_CATALOG, temp);
+
+    // Uses spark as an engine so we can detect if we unintentionally try to use any execution engines
+    HiveIcebergStorageHandlerTestUtils.init(shell, testTables, temp, "spark");
+  }
+
+  @After
+  public void after() {
+    shell.stop();
+    TimeZone.setDefault(originalTz);

Review comment:
       Should we also be resetting the ThreadLocals here so that other tests which will run in the same thread after this will not be affected?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r581566575



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -563,6 +566,61 @@ public void testStructOfStructsInTable() throws IOException {
     }
   }
 
+  @Test
+  public void testDateQuery() throws IOException {
+    Schema dateSchema = new Schema(optional(1, "d_date", Types.DateType.get()));
+
+    List<Record> records = TestHelper.RecordsBuilder.newInstance(dateSchema)
+        .add(LocalDate.of(2020, 1, 21))
+        .add(LocalDate.of(2020, 1, 24))
+        .build();
+
+    testTables.createTable(shell, "date_test", dateSchema, fileFormat, records);
+
+    List<Object[]> result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date in ('2020-01-21', '2020-01-22')");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date > '2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-24", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-20'");
+    Assert.assertEquals(0, result.size());
+  }
+
+  @Test
+  public void testTimestampQuery() throws IOException {

Review comment:
       For me this test case fails regardless of whether I keep the changes in `HiveIcebergFilterFactory` or not. Seems like this test is dependent on the timezone the tests run in. I was only able to make this test case work in UTC timezone, it failed in `America/Los_Angeles` and `Asia/Kolkata`. Can we run this test for multiple timezones? Previously we have uncovered issues after we started breaking the assumption that the code runs in UTC.
   https://github.com/apache/iceberg/blob/fab4a5f2db140fdb132205e78934a145e646758b/orc/src/test/java/org/apache/iceberg/orc/TestExpressionToSearchArgument.java#L117




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582746206



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used
+      Field formatterField = TimestampWritable.class.getDeclaredField("threadLocalDateFormat");
+      formatterField.setAccessible(true);
+      ThreadLocal<DateFormat> formatter = (ThreadLocal<DateFormat>) formatterField.get(null);

Review comment:
       Thanks for the suggestion!
   Tidied up the code quite a bit.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] edgarRd commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
edgarRd commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r584050838



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -172,15 +171,24 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
     return hiveDecimalWritable.getHiveDecimal().bigDecimalValue().setScale(hiveDecimalWritable.scale());
   }
 
+  // Hive uses `java.sql.Date.valueOf(lit.toString());` to convert a literal to Date
+  // Which uses `java.util.Date()` internally to create the object and that uses the TimeZone.getDefaultRef()
+  // To get back the expected date we have to use the LocalDate which gets rid of the TimeZone misery as it uses
+  // the year/month/day to generate the object
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());
   }
 
+  // Hive uses `java.sql.Timestamp.valueOf(lit.toString());` to convert a literal to Timestamp
+  // Which again uses `java.util.Date()` internally to create the object which uses the TimeZone.getDefaultRef()
+  // To get back the expected timestamp we have to use the LocalDateTime which gets rid of the TimeZone misery
+  // as it uses the year/month/day/hour/min/sec/nanos to generate the object
   private static int daysFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.daysFromInstant(timestamp.toInstant());
+    return DateTimeUtil.daysFromDate(timestamp.toLocalDateTime().toLocalDate());
   }
 
+  // We have to use the LocalDateTime to get the micros. See the comment above.
   private static long microsFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.microsFromInstant(timestamp.toInstant());
+    return DateTimeUtil.microsFromTimestamp(timestamp.toLocalDateTime());

Review comment:
       Interestingly, if I use `"CET"` it also fails:
   ```
   org.apache.iceberg.mr.hive.TestHiveIcebergFilterFactory > testTimestampType FAILED
       java.lang.AssertionError: expected:<1349154977123456> but was:<1349162177123456>
           at org.junit.Assert.fail(Assert.java:88)
           at org.junit.Assert.failNotEquals(Assert.java:834)
           at org.junit.Assert.assertEquals(Assert.java:118)
           at org.junit.Assert.assertEquals(Assert.java:144)
           at org.apache.iceberg.mr.hive.TestHiveIcebergFilterFactory.assertPredicatesMatch(TestHiveIcebergFilterFactory.java:266)
           at org.apache.iceberg.mr.hive.TestHiveIcebergFilterFactory.testTimestampType(TestHiveIcebergFilterFactory.java:246)
   
   16 tests completed, 1 failed
   ```
   
   For CET, you can see the difference between the expected value and the actual value is exactly 2 hrs in microseconds (7.2 10^9) - the actual value is ahead of the expected one (which is in UTC)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r581566575



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -563,6 +566,61 @@ public void testStructOfStructsInTable() throws IOException {
     }
   }
 
+  @Test
+  public void testDateQuery() throws IOException {
+    Schema dateSchema = new Schema(optional(1, "d_date", Types.DateType.get()));
+
+    List<Record> records = TestHelper.RecordsBuilder.newInstance(dateSchema)
+        .add(LocalDate.of(2020, 1, 21))
+        .add(LocalDate.of(2020, 1, 24))
+        .build();
+
+    testTables.createTable(shell, "date_test", dateSchema, fileFormat, records);
+
+    List<Object[]> result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date in ('2020-01-21', '2020-01-22')");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date > '2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-24", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-20'");
+    Assert.assertEquals(0, result.size());
+  }
+
+  @Test
+  public void testTimestampQuery() throws IOException {

Review comment:
       For me this test case fails regardless of whether I keep the changes in `HiveIcebergFilterFactory` or not. Seems like this test is dependent on the timezone the tests run in. I was only able to make this test case work in UTC timezone. Can we run this test for multiple timezones? Previously we have uncovered issues after we started breaking the assumption that the code runs in UTC.
   https://github.com/apache/iceberg/blob/fab4a5f2db140fdb132205e78934a145e646758b/orc/src/test/java/org/apache/iceberg/orc/TestExpressionToSearchArgument.java#L117

##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -173,11 +172,11 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
   }
 
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());

Review comment:
       I am having a hard time understanding what behavioral impact this change has. Can you elaborate a bit? Timestamp issues always tend to be tricky for me to get.

##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -563,6 +566,61 @@ public void testStructOfStructsInTable() throws IOException {
     }
   }
 
+  @Test
+  public void testDateQuery() throws IOException {
+    Schema dateSchema = new Schema(optional(1, "d_date", Types.DateType.get()));
+
+    List<Record> records = TestHelper.RecordsBuilder.newInstance(dateSchema)
+        .add(LocalDate.of(2020, 1, 21))
+        .add(LocalDate.of(2020, 1, 24))
+        .build();
+
+    testTables.createTable(shell, "date_test", dateSchema, fileFormat, records);
+
+    List<Object[]> result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date in ('2020-01-21', '2020-01-22')");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date > '2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-24", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-20'");
+    Assert.assertEquals(0, result.size());
+  }
+
+  @Test
+  public void testTimestampQuery() throws IOException {
+    Schema timestampSchema = new Schema(optional(1, "d_ts", Types.TimestampType.withoutZone()));

Review comment:
       Should we also include tests for `Timestamp.withZone()` along the same lines?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#issuecomment-783534771


   @shardulm94: Could you please review as a committer, so I can merge?
   Thanks,
   Peter


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582412976



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -563,6 +566,61 @@ public void testStructOfStructsInTable() throws IOException {
     }
   }
 
+  @Test
+  public void testDateQuery() throws IOException {
+    Schema dateSchema = new Schema(optional(1, "d_date", Types.DateType.get()));
+
+    List<Record> records = TestHelper.RecordsBuilder.newInstance(dateSchema)
+        .add(LocalDate.of(2020, 1, 21))
+        .add(LocalDate.of(2020, 1, 24))
+        .build();
+
+    testTables.createTable(shell, "date_test", dateSchema, fileFormat, records);
+
+    List<Object[]> result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date in ('2020-01-21', '2020-01-22')");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date > '2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-24", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-20'");
+    Assert.assertEquals(0, result.size());
+  }
+
+  @Test
+  public void testTimestampQuery() throws IOException {
+    Schema timestampSchema = new Schema(optional(1, "d_ts", Types.TimestampType.withoutZone()));

Review comment:
       Makes sense. Does `Timestamp.withZone()` also have a similar issue? Or is it just a matter of adding more tests?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r584358006



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -172,15 +171,24 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
     return hiveDecimalWritable.getHiveDecimal().bigDecimalValue().setScale(hiveDecimalWritable.scale());
   }
 
+  // Hive uses `java.sql.Date.valueOf(lit.toString());` to convert a literal to Date
+  // Which uses `java.util.Date()` internally to create the object and that uses the TimeZone.getDefaultRef()
+  // To get back the expected date we have to use the LocalDate which gets rid of the TimeZone misery as it uses
+  // the year/month/day to generate the object
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());
   }
 
+  // Hive uses `java.sql.Timestamp.valueOf(lit.toString());` to convert a literal to Timestamp
+  // Which again uses `java.util.Date()` internally to create the object which uses the TimeZone.getDefaultRef()
+  // To get back the expected timestamp we have to use the LocalDateTime which gets rid of the TimeZone misery
+  // as it uses the year/month/day/hour/min/sec/nanos to generate the object
   private static int daysFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.daysFromInstant(timestamp.toInstant());
+    return DateTimeUtil.daysFromDate(timestamp.toLocalDateTime().toLocalDate());
   }
 
+  // We have to use the LocalDateTime to get the micros. See the comment above.
   private static long microsFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.microsFromInstant(timestamp.toInstant());
+    return DateTimeUtil.microsFromTimestamp(timestamp.toLocalDateTime());

Review comment:
       @edgarRd: Pushed the fix here #2283. Could you please confirm, that this issue is fixed?
   
   Thanks,
   Peter




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582878614



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -563,6 +566,61 @@ public void testStructOfStructsInTable() throws IOException {
     }
   }
 
+  @Test
+  public void testDateQuery() throws IOException {
+    Schema dateSchema = new Schema(optional(1, "d_date", Types.DateType.get()));
+
+    List<Record> records = TestHelper.RecordsBuilder.newInstance(dateSchema)
+        .add(LocalDate.of(2020, 1, 21))
+        .add(LocalDate.of(2020, 1, 24))
+        .build();
+
+    testTables.createTable(shell, "date_test", dateSchema, fileFormat, records);
+
+    List<Object[]> result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date in ('2020-01-21', '2020-01-22')");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date > '2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-24", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-20'");
+    Assert.assertEquals(0, result.size());
+  }
+
+  @Test
+  public void testTimestampQuery() throws IOException {
+    Schema timestampSchema = new Schema(optional(1, "d_ts", Types.TimestampType.withoutZone()));

Review comment:
       Timestamp.withZone has issues, and non-trivial.
   I suggest to do that in the next PR.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#issuecomment-782289335


   > I think we will need to do something very similar to wrapping the record into an `InternalRecordWrapper` like here
   > https://github.com/apache/iceberg/blob/efc461862fc93637fe4c0127d69bfb777c87766e/data/src/main/java/org/apache/iceberg/data/GenericReader.java#L86
   
   That works! Thanks for the pointer. Updated the PR


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582411613



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used
+      Field formatterField = TimestampWritable.class.getDeclaredField("threadLocalDateFormat");
+      formatterField.setAccessible(true);
+      ThreadLocal<DateFormat> formatter = (ThreadLocal<DateFormat>) formatterField.get(null);
+      formatter.set(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));
+
+      Field localTimeZoneField = DateWritable.class.getDeclaredField("LOCAL_TIMEZONE");
+      localTimeZoneField.setAccessible(true);
+      ThreadLocal<TimeZone> localTimeZone = (ThreadLocal<TimeZone>) localTimeZoneField.get(null);
+      localTimeZone.set(TimeZone.getDefault());
+    } catch (NoSuchFieldException nse) {
+      // Hive2 has thread local cache we have to clean, so the new default timezone is used when formatting
+      // Hive3 we do not have this for Date and Timestamp
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();
+    this.testTables = HiveIcebergStorageHandlerTestUtils.testTables(shell, TestTables.TestTableType.HIVE_CATALOG, temp);
+
+    // Uses spark as an engine so we can detect if we unintentionally try to use any execution engines
+    HiveIcebergStorageHandlerTestUtils.init(shell, testTables, temp, "spark");
+  }
+
+  @After
+  public void after() {
+    shell.stop();
+    TimeZone.setDefault(originalTz);

Review comment:
       Should we also be resetting the ThreadLocals here so that other tests which will run in the same JVM after this will not be affected?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r583009037



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.text.DateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.common.DynFields;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+  DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
+      .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  DynFields.StaticField<ThreadLocal<TimeZone>> localTimeZone = DynFields.builder()
+      .hiddenImpl(DateWritable.class, "LOCAL_TIMEZONE")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    return ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    // Magic to clean cached date format for Hive where the default timezone is used in the cached object
+    ThreadLocal<DateFormat> dynamicDateFormat = dateFormat.get();
+    if (dynamicDateFormat != null) {
+      dynamicDateFormat.remove();
+    }
+
+    // Magic to clean cached local timezone for Hive where the default timezone is stored in the cached object
+    ThreadLocal<TimeZone> dynamicTimeZone = localTimeZone.get();
+    if (dynamicTimeZone != null) {
+      dynamicTimeZone.remove();
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();

Review comment:
       Ok.. so after the "magic" it is working now

##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.text.DateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.common.DynFields;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+  DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
+      .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  DynFields.StaticField<ThreadLocal<TimeZone>> localTimeZone = DynFields.builder()
+      .hiddenImpl(DateWritable.class, "LOCAL_TIMEZONE")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    return ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    // Magic to clean cached date format for Hive where the default timezone is used in the cached object
+    ThreadLocal<DateFormat> dynamicDateFormat = dateFormat.get();
+    if (dynamicDateFormat != null) {
+      dynamicDateFormat.remove();
+    }
+
+    // Magic to clean cached local timezone for Hive where the default timezone is stored in the cached object
+    ThreadLocal<TimeZone> dynamicTimeZone = localTimeZone.get();
+    if (dynamicTimeZone != null) {
+      dynamicTimeZone.remove();
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();
+    this.testTables = HiveIcebergStorageHandlerTestUtils.testTables(shell, TestTables.TestTableType.HIVE_CATALOG, temp);
+
+    // Uses spark as an engine so we can detect if we unintentionally try to use any execution engines
+    HiveIcebergStorageHandlerTestUtils.init(shell, testTables, temp, "spark");
+  }
+
+  @After
+  public void after() {
+    shell.stop();

Review comment:
       Done

##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.text.DateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.common.DynFields;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+  DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
+      .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  DynFields.StaticField<ThreadLocal<TimeZone>> localTimeZone = DynFields.builder()
+      .hiddenImpl(DateWritable.class, "LOCAL_TIMEZONE")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    return ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    // Magic to clean cached date format for Hive where the default timezone is used in the cached object
+    ThreadLocal<DateFormat> dynamicDateFormat = dateFormat.get();
+    if (dynamicDateFormat != null) {
+      dynamicDateFormat.remove();
+    }
+
+    // Magic to clean cached local timezone for Hive where the default timezone is stored in the cached object
+    ThreadLocal<TimeZone> dynamicTimeZone = localTimeZone.get();

Review comment:
       Done, much nicer




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582745799



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used
+      Field formatterField = TimestampWritable.class.getDeclaredField("threadLocalDateFormat");
+      formatterField.setAccessible(true);
+      ThreadLocal<DateFormat> formatter = (ThreadLocal<DateFormat>) formatterField.get(null);
+      formatter.set(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

Review comment:
       Good thinking!
   Modified as you suggested and it is much nicer.
   Thanks,
   Peter

##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(

Review comment:
       Fixed




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] marton-bod commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
marton-bod commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582899002



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.text.DateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.common.DynFields;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+  DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
+      .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  DynFields.StaticField<ThreadLocal<TimeZone>> localTimeZone = DynFields.builder()
+      .hiddenImpl(DateWritable.class, "LOCAL_TIMEZONE")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    return ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    // Magic to clean cached date format for Hive where the default timezone is used in the cached object
+    ThreadLocal<DateFormat> dynamicDateFormat = dateFormat.get();
+    if (dynamicDateFormat != null) {
+      dynamicDateFormat.remove();
+    }
+
+    // Magic to clean cached local timezone for Hive where the default timezone is stored in the cached object
+    ThreadLocal<TimeZone> dynamicTimeZone = localTimeZone.get();
+    if (dynamicTimeZone != null) {
+      dynamicTimeZone.remove();
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();

Review comment:
       If not, we can just add a comment and leave it as it is (create a new shell for each test method)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] marton-bod commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
marton-bod commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582886637



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.text.DateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.common.DynFields;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+  DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
+      .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  DynFields.StaticField<ThreadLocal<TimeZone>> localTimeZone = DynFields.builder()
+      .hiddenImpl(DateWritable.class, "LOCAL_TIMEZONE")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    return ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    // Magic to clean cached date format for Hive where the default timezone is used in the cached object
+    ThreadLocal<DateFormat> dynamicDateFormat = dateFormat.get();
+    if (dynamicDateFormat != null) {
+      dynamicDateFormat.remove();
+    }
+
+    // Magic to clean cached local timezone for Hive where the default timezone is stored in the cached object
+    ThreadLocal<TimeZone> dynamicTimeZone = localTimeZone.get();
+    if (dynamicTimeZone != null) {
+      dynamicTimeZone.remove();
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();

Review comment:
       Shouldn't we create a shell per test suite (w/ `BeforeClass`) like in the other test classes?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582350864



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -563,6 +566,61 @@ public void testStructOfStructsInTable() throws IOException {
     }
   }
 
+  @Test
+  public void testDateQuery() throws IOException {
+    Schema dateSchema = new Schema(optional(1, "d_date", Types.DateType.get()));
+
+    List<Record> records = TestHelper.RecordsBuilder.newInstance(dateSchema)
+        .add(LocalDate.of(2020, 1, 21))
+        .add(LocalDate.of(2020, 1, 24))
+        .build();
+
+    testTables.createTable(shell, "date_test", dateSchema, fileFormat, records);
+
+    List<Object[]> result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date in ('2020-01-21', '2020-01-22')");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date > '2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-24", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-20'");
+    Assert.assertEquals(0, result.size());
+  }
+
+  @Test
+  public void testTimestampQuery() throws IOException {

Review comment:
       The tests were running for me in CET TZ, but you have a point here. Added a specific test case to test out the things with multiple TZs. Had to do some magic to remove old default timezones from cached objects, but finally it works.
   
   Could you please check again?
   
   Thanks,
   Peter




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582411168



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used
+      Field formatterField = TimestampWritable.class.getDeclaredField("threadLocalDateFormat");
+      formatterField.setAccessible(true);
+      ThreadLocal<DateFormat> formatter = (ThreadLocal<DateFormat>) formatterField.get(null);

Review comment:
       Most Iceberg code uses `org.apache.iceberg.common.Dyn*` classes to handle reflection. Can we use that instead?
   
   ```
   DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
           .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
           .defaultAlwaysNull() // it will build a StaticField which returns null instead of NoSuchFieldException on .get
           .buildStatic();
   ```
   
   And `dateFormat.get()` to get the ThreadLocal




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r580310248



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -563,6 +565,29 @@ public void testStructOfStructsInTable() throws IOException {
     }
   }
 
+  @Test
+  public void testDateQuery() throws IOException {
+    Schema dateSchema = new Schema(optional(1, "d_date", Types.DateType.get()));
+
+    List<Record> records = TestHelper.RecordsBuilder.newInstance(dateSchema)
+        .add(LocalDate.of(2020, 1, 21))
+        .add(LocalDate.of(2020, 1, 24))
+        .build();
+
+    testTables.createTable(shell, "date_test", dateSchema, fileFormat, records);
+
+    List<Object[]> result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-21'");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date in ('2020-01-21', '2020-01-22')");
+    Assert.assertEquals(1, result.size());
+    Assert.assertEquals("2020-01-21", result.get(0)[0]);
+
+    result = shell.executeStatement("SELECT * from date_test WHERE d_date='2020-01-20'");
+    Assert.assertEquals(0, result.size());

Review comment:
       Added test case for it

##########
File path: mr/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java
##########
@@ -284,8 +285,9 @@ public void close() throws IOException {
       boolean applyResidual = !context.getConfiguration().getBoolean(InputFormatConfig.SKIP_RESIDUAL_FILTERING, false);
 
       if (applyResidual && residual != null && residual != Expressions.alwaysTrue()) {
+        InternalRecordWrapper wrapper = new InternalRecordWrapper(readSchema.asStruct());

Review comment:
       Added comment




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] marton-bod commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
marton-bod commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582887243



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.text.DateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.common.DynFields;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+  DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
+      .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  DynFields.StaticField<ThreadLocal<TimeZone>> localTimeZone = DynFields.builder()
+      .hiddenImpl(DateWritable.class, "LOCAL_TIMEZONE")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    return ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    // Magic to clean cached date format for Hive where the default timezone is used in the cached object
+    ThreadLocal<DateFormat> dynamicDateFormat = dateFormat.get();
+    if (dynamicDateFormat != null) {
+      dynamicDateFormat.remove();
+    }
+
+    // Magic to clean cached local timezone for Hive where the default timezone is stored in the cached object
+    ThreadLocal<TimeZone> dynamicTimeZone = localTimeZone.get();
+    if (dynamicTimeZone != null) {
+      dynamicTimeZone.remove();
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();
+    this.testTables = HiveIcebergStorageHandlerTestUtils.testTables(shell, TestTables.TestTableType.HIVE_CATALOG, temp);
+
+    // Uses spark as an engine so we can detect if we unintentionally try to use any execution engines
+    HiveIcebergStorageHandlerTestUtils.init(shell, testTables, temp, "spark");
+  }
+
+  @After
+  public void after() {
+    shell.stop();

Review comment:
       Shouldn't we stop the shell in `AfterClass`? and just close the session and reset the metastore in `After`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582411168



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used
+      Field formatterField = TimestampWritable.class.getDeclaredField("threadLocalDateFormat");
+      formatterField.setAccessible(true);
+      ThreadLocal<DateFormat> formatter = (ThreadLocal<DateFormat>) formatterField.get(null);

Review comment:
       Most Iceberg code uses `org.apache.iceberg.common.Dyn*` classes to handle reflection. Can we use that instead?
   
   ```
   DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
           .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
           .buildStatic();
   ```
   
   And `dateFormat.get()` to get the ThreadLocal
   
   We could also use `.defaultAlwaysNull()` on the builder to make it return null instead of `NoSuchFieldException` on `.get()`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582747976



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,165 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+import java.text.DateFormat;
+import java.text.SimpleDateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    Collection<Object[]> testParams = ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+
+    return testParams;
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  private TimeZone originalTz = TimeZone.getDefault();
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    try {
+      // Magic to update cached stuff for Hive where the default timezone is used
+      Field formatterField = TimestampWritable.class.getDeclaredField("threadLocalDateFormat");
+      formatterField.setAccessible(true);
+      ThreadLocal<DateFormat> formatter = (ThreadLocal<DateFormat>) formatterField.get(null);
+      formatter.set(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));
+
+      Field localTimeZoneField = DateWritable.class.getDeclaredField("LOCAL_TIMEZONE");
+      localTimeZoneField.setAccessible(true);
+      ThreadLocal<TimeZone> localTimeZone = (ThreadLocal<TimeZone>) localTimeZoneField.get(null);
+      localTimeZone.set(TimeZone.getDefault());
+    } catch (NoSuchFieldException nse) {
+      // Hive2 has thread local cache we have to clean, so the new default timezone is used when formatting
+      // Hive3 we do not have this for Date and Timestamp
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();
+    this.testTables = HiveIcebergStorageHandlerTestUtils.testTables(shell, TestTables.TestTableType.HIVE_CATALOG, temp);
+
+    // Uses spark as an engine so we can detect if we unintentionally try to use any execution engines
+    HiveIcebergStorageHandlerTestUtils.init(shell, testTables, temp, "spark");
+  }
+
+  @After
+  public void after() {
+    shell.stop();
+    TimeZone.setDefault(originalTz);

Review comment:
       Let's talk about this.
   I do not think that I am able to fix all of the places that is needed to have a clean test run in the same JVM after this.
   The best I could do is fixing the things I know of, and I would avoid this to highlight the issues faster.
   Maybe I should even remove the `TimeZone.setDefault(originalTz)` too? I left it there because some logging/reporting might use that
   
   What do you think?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#issuecomment-786600794


   Thanks @shardulm94 and @marton-bod for the review!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r584084172



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -172,15 +171,24 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
     return hiveDecimalWritable.getHiveDecimal().bigDecimalValue().setScale(hiveDecimalWritable.scale());
   }
 
+  // Hive uses `java.sql.Date.valueOf(lit.toString());` to convert a literal to Date
+  // Which uses `java.util.Date()` internally to create the object and that uses the TimeZone.getDefaultRef()
+  // To get back the expected date we have to use the LocalDate which gets rid of the TimeZone misery as it uses
+  // the year/month/day to generate the object
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());
   }
 
+  // Hive uses `java.sql.Timestamp.valueOf(lit.toString());` to convert a literal to Timestamp
+  // Which again uses `java.util.Date()` internally to create the object which uses the TimeZone.getDefaultRef()
+  // To get back the expected timestamp we have to use the LocalDateTime which gets rid of the TimeZone misery
+  // as it uses the year/month/day/hour/min/sec/nanos to generate the object
   private static int daysFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.daysFromInstant(timestamp.toInstant());
+    return DateTimeUtil.daysFromDate(timestamp.toLocalDateTime().toLocalDate());
   }
 
+  // We have to use the LocalDateTime to get the micros. See the comment above.
   private static long microsFromTimestamp(Timestamp timestamp) {
-    return DateTimeUtil.microsFromInstant(timestamp.toInstant());
+    return DateTimeUtil.microsFromTimestamp(timestamp.toLocalDateTime());

Review comment:
       If you have some time, could you please check that with #2278 the tests run correctly on you side?
   In the meantime I try to find out what is the process of reverting changes.
   
   Thanks,
   Peter




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] marton-bod commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
marton-bod commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r580080228



##########
File path: mr/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java
##########
@@ -284,8 +285,9 @@ public void close() throws IOException {
       boolean applyResidual = !context.getConfiguration().getBoolean(InputFormatConfig.SKIP_RESIDUAL_FILTERING, false);
 
       if (applyResidual && residual != null && residual != Expressions.alwaysTrue()) {
+        InternalRecordWrapper wrapper = new InternalRecordWrapper(readSchema.asStruct());

Review comment:
       Can we add a comment here why this wrapper is needed? I think the class name doesn't tell you much and it does not have javadoc either to explain its purpose. What do you think?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] marton-bod commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
marton-bod commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582896948



##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.text.DateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.common.DynFields;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+  DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
+      .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  DynFields.StaticField<ThreadLocal<TimeZone>> localTimeZone = DynFields.builder()
+      .hiddenImpl(DateWritable.class, "LOCAL_TIMEZONE")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  @Parameters(name = "timezone={0}")
+  public static Collection<Object[]> parameters() {
+    return ImmutableList.of(
+        new String[] {"America/New_York"},
+        new String[] {"Asia/Kolkata"},
+        new String[] {"UTC/Greenwich"}
+    );
+  }
+
+  private TestHiveShell shell;
+
+  private TestTables testTables;
+
+  @Parameter(0)
+  public String timezoneString;
+
+  @Rule
+  public TemporaryFolder temp = new TemporaryFolder();
+
+  @Before
+  public void before() throws IOException, IllegalAccessException {
+    TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));
+
+    // Magic to clean cached date format for Hive where the default timezone is used in the cached object
+    ThreadLocal<DateFormat> dynamicDateFormat = dateFormat.get();
+    if (dynamicDateFormat != null) {
+      dynamicDateFormat.remove();
+    }
+
+    // Magic to clean cached local timezone for Hive where the default timezone is stored in the cached object
+    ThreadLocal<TimeZone> dynamicTimeZone = localTimeZone.get();
+    if (dynamicTimeZone != null) {
+      dynamicTimeZone.remove();
+    }
+
+    this.shell = HiveIcebergStorageHandlerTestUtils.shell();

Review comment:
       Good question. Not sure if calling `TimeZone.setDefault(TimeZone.getTimeZone(timezoneString));` will take effect if the shell/HS2 has been already started, but could be worth a try




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] shardulm94 commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
shardulm94 commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582877370



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -172,15 +171,24 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
     return hiveDecimalWritable.getHiveDecimal().bigDecimalValue().setScale(hiveDecimalWritable.scale());
   }
 
+  // Hive uses `java.sql.Date.valueOf(lit.toString());` to convert a literal to Date
+  // Which uses `java.util.Date()` internally to create the object and that uses the TimeZone.getDefaultRef()
+  // To get back the expected date we have to use the LocalDate which gets rid of the TimeZone misery as it uses
+  // the year/month/day to generate the object
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());
   }
 
+  // Hive uses `java.sql.Timestamp.valueOf(lit.toString());` to convert a literal to Timestamp
+  // Which again uses `java.util.Date()` internally to create the object which uses the TimeZone.getDefaultRef()
+  // To get back the expected timestamp we have to use the LocalDateTime which gets rid of the TimeZone misery
+  // as it uses the year/month/day/min/sec/nanos to generate the object

Review comment:
       Typo: Missed the `hour` in the comment 

##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.text.DateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.common.DynFields;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+  DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
+      .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  DynFields.StaticField<ThreadLocal<TimeZone>> localTimeZone = DynFields.builder()
+      .hiddenImpl(DateWritable.class, "LOCAL_TIMEZONE")
+      .defaultAlwaysNull()
+      .buildStatic();

Review comment:
       Minor nit: These can be `private static final`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #2254: Hive: Fix predicate pushdown for Date

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2254:
URL: https://github.com/apache/iceberg/pull/2254#discussion_r582884150



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergFilterFactory.java
##########
@@ -172,15 +171,24 @@ private static BigDecimal hiveDecimalToBigDecimal(HiveDecimalWritable hiveDecima
     return hiveDecimalWritable.getHiveDecimal().bigDecimalValue().setScale(hiveDecimalWritable.scale());
   }
 
+  // Hive uses `java.sql.Date.valueOf(lit.toString());` to convert a literal to Date
+  // Which uses `java.util.Date()` internally to create the object and that uses the TimeZone.getDefaultRef()
+  // To get back the expected date we have to use the LocalDate which gets rid of the TimeZone misery as it uses
+  // the year/month/day to generate the object
   private static int daysFromDate(Date date) {
-    return DateTimeUtil.daysFromInstant(Instant.ofEpochMilli(date.getTime()));
+    return DateTimeUtil.daysFromDate(date.toLocalDate());
   }
 
+  // Hive uses `java.sql.Timestamp.valueOf(lit.toString());` to convert a literal to Timestamp
+  // Which again uses `java.util.Date()` internally to create the object which uses the TimeZone.getDefaultRef()
+  // To get back the expected timestamp we have to use the LocalDateTime which gets rid of the TimeZone misery
+  // as it uses the year/month/day/min/sec/nanos to generate the object

Review comment:
       Fixed

##########
File path: mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerTimezone.java
##########
@@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr.hive;
+
+import java.io.IOException;
+import java.text.DateFormat;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Collection;
+import java.util.List;
+import java.util.TimeZone;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.common.DynFields;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.mr.TestHelper;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.types.Types;
+import org.junit.After;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.optional;
+import static org.junit.runners.Parameterized.Parameter;
+import static org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestHiveIcebergStorageHandlerTimezone {
+  DynFields.StaticField<ThreadLocal<DateFormat>> dateFormat = DynFields.builder()
+      .hiddenImpl(TimestampWritable.class, "threadLocalDateFormat")
+      .defaultAlwaysNull()
+      .buildStatic();
+
+  DynFields.StaticField<ThreadLocal<TimeZone>> localTimeZone = DynFields.builder()
+      .hiddenImpl(DateWritable.class, "LOCAL_TIMEZONE")
+      .defaultAlwaysNull()
+      .buildStatic();

Review comment:
       Good catch!
   Fixed




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org