You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/09/17 06:23:03 UTC

[GitHub] [iceberg] xloya opened a new pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

xloya opened a new pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135


   
   **1. What this MR does:**
   When the equality delete set in DeleteFilter saves a piece of data read from an `equal delete` parquet file in memory, it will call the `get()` method of the `GenericRecord` class for data comparison.If the table schema definition of the current position of `GenericRecord` is `DATE/TIMESTAMP/TIME` type, it will be converted to `LocalDate/LocalDateTime/LocalTime` type in memory. When the `get()` method of `GenericRecord` is compared, the java class `Integer/Long/Long` defined in the `Types` enum of the `DATE/TIMESTAMP/TIME` type is passed in. Because the type data of` LocalDate/LocalDateTime/LocalTime` is not an instance of `Integer/Long/Long` type, an IllegalStateException with Not an intance of xxx is thrown.
   
   The pr first converts the data to the `InternalRecordWrapper` before saving the data in the equality delete set, and calls its `get()` method during comparison. If it is a type that needs to be converted such as `DATE/TIMESTAMP/TIME`, use the `transforms` parameter to convert, otherwise directly Output raw data type.
   **2. Which issue this PR fixes:**
   Fix issue #3119.
   
   @rdblue @chenjunjiedada @openinx Could you help to take a look at this PR? Thanks in advance!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r712316560



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -330,6 +407,15 @@ private StructLikeSet rowSetWithoutIds(int... idsToRemove) {
     return set;
   }
 
+  protected StructLikeSet rowSetWithoutIds2(int... idsToRemove) {

Review comment:
       Instead of duplicating logic, please update the existing method so that the table is passed in. If you want to minimize changes, just add an override with the old method signature.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r712314926



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -51,18 +54,32 @@
       Types.NestedField.required(2, "data", Types.StringType.get())
   );
 
+  public static final Schema DATE_SCHEMA = new Schema(
+          Types.NestedField.required(1, "dt", Types.DateType.get()),
+          Types.NestedField.required(2, "data", Types.StringType.get()),
+          Types.NestedField.required(3, "id", Types.IntegerType.get())
+  );
+
   // Partition spec used to create tables
   public static final PartitionSpec SPEC = PartitionSpec.builderFor(SCHEMA)
       .bucket("data", 16)
       .build();
 
+  public static final PartitionSpec DATE_SPEC = PartitionSpec.builderFor(DATE_SCHEMA)
+          .day("dt")
+          .build();
+
   @Rule
   public TemporaryFolder temp = new TemporaryFolder();
 
-  private String tableName = null;
+  protected String tableName = null;
+  protected String tableName2 = null;

Review comment:
       Please use more descriptive names that `tableName2`. Maybe `dateTableName`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r712321387



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);
+
+    table2.newRowDelta()
+            .addDeletes(eqDeletes)
+            .commit();
+
+    StructLikeSet expected = rowSetWithoutIds2(1, 2, 3);
+    StructLikeSet expectedSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(expectedSet, expected.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));
+
+    StructLikeSet actual = rowSet(tableName2, table2, "*");
+    StructLikeSet actualSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(actualSet, actual.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));

Review comment:
       Why are you using `InternalRecordWrapper` here and for the expected set? I think that `StructLikeSet` should work without it, right? If not, then we need to update the set that is produced by `rowSet` to do the wrapping since future tests may not know to wrap.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714195431



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -330,6 +420,15 @@ private StructLikeSet rowSetWithoutIds(int... idsToRemove) {
     return set;
   }
 
+  private StructLikeSet rowSetWithoutIds(Set<Integer> idSet, Table iTable, List<Record> recordList) {
+    StructLikeSet set = StructLikeSet.create(iTable.schema().asStruct());
+    recordList.stream()
+        .filter(row -> !idSet.contains(row.getField("id")))
+        .map(record -> new InternalRecordWrapper(iTable.schema().asStruct()).wrap(record))
+        .forEach(set::add);
+    return set;
+  }

Review comment:
       I think the other `rowSetWithoutIds` method should be rewritten to call this one since it is more generic. That will also avoid the wrapper problem in the future.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714436234



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);
+
+    table2.newRowDelta()
+            .addDeletes(eqDeletes)
+            .commit();
+
+    StructLikeSet expected = rowSetWithoutIds2(1, 2, 3);
+    StructLikeSet expectedSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(expectedSet, expected.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));
+
+    StructLikeSet actual = rowSet(tableName2, table2, "*");
+    StructLikeSet actualSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(actualSet, actual.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));

Review comment:
       @rdblue OK. The data returned by the original `rowSetWithoutIds()`method of `DeleteReadTests` and the original `rowSet()` method of `TestGenericReaderDeletes` is the `GenericRecord` class. Calling the `Assert.assertEquals()` method will compare the two `StructLikeSets`. If it is `DATE/TIMESTAMP` type of the data, it will throw an exception of `Not a instance of Integer/Long`, so it needs to be converted to the `InternalRecordWrapper` class for comparison




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714439493



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);
+
+    table2.newRowDelta()
+            .addDeletes(eqDeletes)
+            .commit();
+
+    StructLikeSet expected = rowSetWithoutIds2(1, 2, 3);
+    StructLikeSet expectedSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(expectedSet, expected.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));
+
+    StructLikeSet actual = rowSet(tableName2, table2, "*");
+    StructLikeSet actualSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(actualSet, actual.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));

Review comment:
       Maybe i should encapsulate the assertion method? Or overwrite rowSetWithoutIds() method for different wrapper?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r716277880



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -321,11 +411,12 @@ private StructLikeSet selectColumns(StructLikeSet rows, String... columns) {
     return set;
   }
 
-  private StructLikeSet rowSetWithoutIds(int... idsToRemove) {
+  private StructLikeSet rowSetWithoutIds(Table iTable, List<Record> recordList, int... idsToRemove) {

Review comment:
       Since you're passing the table in, can you make this a `static` method? You may also be able to use the name `table` instead of `iTable` if it is `static`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r712315687



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -51,18 +54,32 @@
       Types.NestedField.required(2, "data", Types.StringType.get())
   );
 
+  public static final Schema DATE_SCHEMA = new Schema(
+          Types.NestedField.required(1, "dt", Types.DateType.get()),
+          Types.NestedField.required(2, "data", Types.StringType.get()),
+          Types.NestedField.required(3, "id", Types.IntegerType.get())

Review comment:
       Indentation is off in this file. Please match the existing indentation, where continuation indents are 4 spaces.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r712313775



##########
File path: data/src/main/java/org/apache/iceberg/data/DeleteFilter.java
##########
@@ -134,9 +134,13 @@ protected long pos(T record) {
 
       Iterable<CloseableIterable<Record>> deleteRecords = Iterables.transform(deletes,
           delete -> openDeletes(delete, deleteSchema));
-      StructLikeSet deleteSet = Deletes.toEqualitySet(
-          // copy the delete records because they will be held in a set
-          CloseableIterable.transform(CloseableIterable.concat(deleteRecords), Record::copy),
+
+      // copy the delete records because they will be held in a set
+      CloseableIterable<Record> records = CloseableIterable.transform(CloseableIterable.concat(deleteRecords),
+          Record::copy);

Review comment:
       Nit: it would be more readable to add a newline so that the `concat` argument and `Record::copy` are indented the same.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r711552241



##########
File path: data/src/main/java/org/apache/iceberg/data/DeleteFilter.java
##########
@@ -134,9 +134,12 @@ protected long pos(T record) {
 
       Iterable<CloseableIterable<Record>> deleteRecords = Iterables.transform(deletes,
           delete -> openDeletes(delete, deleteSchema));
-      StructLikeSet deleteSet = Deletes.toEqualitySet(
-          // copy the delete records because they will be held in a set

Review comment:
       fix done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r712323642



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);

Review comment:
       I don't see how this delete file would be applied. The partition you're writing it to is `Row.of(0)`, which ends up being `1970-01-01`. That should prevent the delete file from being found when planning, so none of the deletes would be applied.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714193467



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -330,6 +420,15 @@ private StructLikeSet rowSetWithoutIds(int... idsToRemove) {
     return set;
   }
 
+  private StructLikeSet rowSetWithoutIds(Set<Integer> idSet, Table iTable, List<Record> recordList) {

Review comment:
       Nit: the name `iTable` is odd. Why not just `table`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714497856



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);
+
+    table2.newRowDelta()
+            .addDeletes(eqDeletes)
+            .commit();
+
+    StructLikeSet expected = rowSetWithoutIds2(1, 2, 3);
+    StructLikeSet expectedSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(expectedSet, expected.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));
+
+    StructLikeSet actual = rowSet(tableName2, table2, "*");
+    StructLikeSet actualSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(actualSet, actual.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));

Review comment:
       I think either the conversion at the `assertion` or the conversion at `rowSetWithoutIds()` and `rowSet()` is fine, but it must be converted.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714436234



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);
+
+    table2.newRowDelta()
+            .addDeletes(eqDeletes)
+            .commit();
+
+    StructLikeSet expected = rowSetWithoutIds2(1, 2, 3);
+    StructLikeSet expectedSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(expectedSet, expected.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));
+
+    StructLikeSet actual = rowSet(tableName2, table2, "*");
+    StructLikeSet actualSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(actualSet, actual.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));

Review comment:
       @rdblue OK. The data returned by the original `rowSetWithoutIds()`method of `DeleteReadTests` and the original `rowSet()` method of `TestGenericReaderDeletes` returns the `GenericRecord` class. In `TestFlinkInputFormatReaderDeletes` the `rowSet()` method returns `RowDataWrapper` class, and in `TestSparkReaderDeletes` the `rowSet()` method returns `SparkStructLike` class.Calling the `Assert.assertEquals()` method will compare the two `StructLikeSets`. If it is `DATE/TIMESTAMP` type of the data of `GenericRecord`, it will compare(GenericRecord, GenericRecord) and throw an exception of `Not a instance of Integer/Long`. If actual data class is  `RowDataWrapper` or `SparkStructLike`, it will compare(GenericRecord, RowDataWrapper/SparkStructLike) and their class mismatch. so they both need to be converted to the `InternalRecordWrapper` class for comparison




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714436234



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);
+
+    table2.newRowDelta()
+            .addDeletes(eqDeletes)
+            .commit();
+
+    StructLikeSet expected = rowSetWithoutIds2(1, 2, 3);
+    StructLikeSet expectedSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(expectedSet, expected.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));
+
+    StructLikeSet actual = rowSet(tableName2, table2, "*");
+    StructLikeSet actualSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(actualSet, actual.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));

Review comment:
       @rdblue OK. The data returned by the original `rowSetWithoutIds()`method of `DeleteReadTests` and the original `rowSet()` method of `TestGenericReaderDeletes` returns the `GenericRecord` class. In `TestFlinkInputFormatReaderDeletes` the `rowSet()` method returns `RowDataWrapper` class, and in `TestSparkReaderDeletes` the `rowSet()` method returns `SparkStructLike` class.Calling the `Assert.assertEquals()` method will compare the two `StructLikeSets`. If it is `DATE/TIMESTAMP` type of the data of `GenericRecord`, it will `compare(GenericRecord, GenericRecord)` and throw an exception of `Not a instance of Integer/Long`. If actual data class is  `RowDataWrapper` or `SparkStructLike`, it will `compare(GenericRecord, RowDataWrapper/SparkStructLike)` and their class mismatch. so they both need to be converted to the `InternalRecordWrapper` class for comparison




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#issuecomment-923253801


   Thanks, @xloya! The implementation looks correct to me, but there are a few things to fix in the tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714429343



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -330,6 +420,15 @@ private StructLikeSet rowSetWithoutIds(int... idsToRemove) {
     return set;
   }
 
+  private StructLikeSet rowSetWithoutIds(Set<Integer> idSet, Table iTable, List<Record> recordList) {

Review comment:
       I originally used table, but code style check will report hidden field errors.Maybe there is a conflict with a property called table. iTable means iceberg table.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r716312372



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -321,11 +411,12 @@ private StructLikeSet selectColumns(StructLikeSet rows, String... columns) {
     return set;
   }
 
-  private StructLikeSet rowSetWithoutIds(int... idsToRemove) {
+  private StructLikeSet rowSetWithoutIds(Table iTable, List<Record> recordList, int... idsToRemove) {

Review comment:
       A helpful suggestion, it works, thanks!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714190945



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);
+
+    table2.newRowDelta()
+            .addDeletes(eqDeletes)
+            .commit();
+
+    StructLikeSet expected = rowSetWithoutIds2(1, 2, 3);
+    StructLikeSet expectedSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(expectedSet, expected.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));
+
+    StructLikeSet actual = rowSet(tableName2, table2, "*");
+    StructLikeSet actualSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(actualSet, actual.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));

Review comment:
       @xloya, can you briefly describe what you did to address this?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya removed a comment on pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya removed a comment on pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#issuecomment-922814287


   @rdblue Hi, I had fixed the unit-test, could you please run the CI again? thx!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714436234



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);
+
+    table2.newRowDelta()
+            .addDeletes(eqDeletes)
+            .commit();
+
+    StructLikeSet expected = rowSetWithoutIds2(1, 2, 3);
+    StructLikeSet expectedSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(expectedSet, expected.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));
+
+    StructLikeSet actual = rowSet(tableName2, table2, "*");
+    StructLikeSet actualSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(actualSet, actual.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));

Review comment:
       @rdblue OK. The original `rowSetWithoutIds()`method of `DeleteReadTests` and the original `rowSet()` method of `TestGenericReaderDeletes` returns both the `GenericRecord` class's data. In `TestFlinkInputFormatReaderDeletes` the `rowSet()` method returns `RowDataWrapper` class's data, and in `TestSparkReaderDeletes` the `rowSet()` method returns `SparkStructLike` class's data.Calling the `Assert.assertEquals()` method will compare the two `StructLikeSets`. If it is `DATE/TIMESTAMP` type of the data of `GenericRecord`, it will `compare(GenericRecord, GenericRecord)` and throw an exception of `Not a instance of Integer/Long`. If actual data class is  `RowDataWrapper` or `SparkStructLike`, it will `compare(GenericRecord, RowDataWrapper/SparkStructLike)` and their class mismatch. so they both need to be converted to the `InternalRecordWrapper` class for comparison




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#issuecomment-929746947


   Thanks for your time, @rdblue !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714439493



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);
+
+    table2.newRowDelta()
+            .addDeletes(eqDeletes)
+            .commit();
+
+    StructLikeSet expected = rowSetWithoutIds2(1, 2, 3);
+    StructLikeSet expectedSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(expectedSet, expected.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));
+
+    StructLikeSet actual = rowSet(tableName2, table2, "*");
+    StructLikeSet actualSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(actualSet, actual.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));

Review comment:
       Maybe i should encapsulate the assertion method?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r716277794



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);

Review comment:
       Makes sense. Thanks!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#issuecomment-929746947


   Thanks for your time, @rdblue !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714439493



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);
+
+    table2.newRowDelta()
+            .addDeletes(eqDeletes)
+            .commit();
+
+    StructLikeSet expected = rowSetWithoutIds2(1, 2, 3);
+    StructLikeSet expectedSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(expectedSet, expected.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));
+
+    StructLikeSet actual = rowSet(tableName2, table2, "*");
+    StructLikeSet actualSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(actualSet, actual.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));

Review comment:
       Maybe i should encapsulate the assertion method? Or overwrite rowSetWithoutIds() method for different wrapper?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714427583



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);

Review comment:
       @rdblue Previously, the data file was written using `Row.of(0)`, which means that all data is written to the `1970-01-01` partition. So all the equality delete data before is also written to `Row.of(0)`. Now it is changed that the data is written to the corresponding date time partition, and then the equality delete data is also written to the corresponding partition.
   If keep the data file written to the partition as `1970-01-01`, but the partition where the equality delete data is actually written is changed to `2021-09-01`, it will not work.
   Found spec on official website:
   [http://iceberg.apache.org/spec/](url)
   `Like data files, delete files are tracked by partition. In general, a delete file must be applied to older data files with the same partition; see Scan Planning for details. Column metrics can be used to determine whether a delete file’s rows overlap the contents of a data file or a scan range.`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#issuecomment-922503139


   Running CI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jiasheng55 commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
jiasheng55 commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r711542561



##########
File path: data/src/main/java/org/apache/iceberg/data/DeleteFilter.java
##########
@@ -134,9 +134,12 @@ protected long pos(T record) {
 
       Iterable<CloseableIterable<Record>> deleteRecords = Iterables.transform(deletes,
           delete -> openDeletes(delete, deleteSchema));
-      StructLikeSet deleteSet = Deletes.toEqualitySet(
-          // copy the delete records because they will be held in a set

Review comment:
       This comment is missing.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya edited a comment on pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya edited a comment on pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#issuecomment-923611185


   Thanks @rdblue for your particular review! I will fix them according to your suggestion these days


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714429343



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -330,6 +420,15 @@ private StructLikeSet rowSetWithoutIds(int... idsToRemove) {
     return set;
   }
 
+  private StructLikeSet rowSetWithoutIds(Set<Integer> idSet, Table iTable, List<Record> recordList) {

Review comment:
       I originally used table, but code style check would report a hidden field error.Maybe there is a conflict with a property called table. iTable means iceberg table.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714436234



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);
+
+    table2.newRowDelta()
+            .addDeletes(eqDeletes)
+            .commit();
+
+    StructLikeSet expected = rowSetWithoutIds2(1, 2, 3);
+    StructLikeSet expectedSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(expectedSet, expected.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));
+
+    StructLikeSet actual = rowSet(tableName2, table2, "*");
+    StructLikeSet actualSet = StructLikeSet.create(table2.schema().asStruct());
+
+    Iterables.addAll(actualSet, actual.stream()
+            .map(record -> new InternalRecordWrapper(table2.schema().asStruct()).wrap(record))
+            .collect(Collectors.toList()));

Review comment:
       @rdblue OK. The data returned by the original `rowSetWithoutIds()`method of `DeleteReadTests` and the original `rowSet()` method of `TestGenericReaderDeletes` is the `GenericRecord` class. And in `TestFlinkInputFormatReaderDeletes` the `rowSet()` method returns `RowDataWrapper`, in `TestSparkReaderDeletes` the `rowSet()` method returns `SparkStructLike`.Calling the `Assert.assertEquals()` method will compare the two `StructLikeSets`. If it is `DATE/TIMESTAMP` type of the data, it will throw an exception of `Not a instance of Integer/Long`, so it needs to be converted to the `InternalRecordWrapper` class for comparison




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r712313320



##########
File path: data/src/main/java/org/apache/iceberg/data/DeleteFilter.java
##########
@@ -134,9 +134,13 @@ protected long pos(T record) {
 
       Iterable<CloseableIterable<Record>> deleteRecords = Iterables.transform(deletes,
           delete -> openDeletes(delete, deleteSchema));
-      StructLikeSet deleteSet = Deletes.toEqualitySet(
-          // copy the delete records because they will be held in a set
-          CloseableIterable.transform(CloseableIterable.concat(deleteRecords), Record::copy),
+
+      // copy the delete records because they will be held in a set
+      CloseableIterable<Record> records = CloseableIterable.transform(CloseableIterable.concat(deleteRecords),
+          Record::copy);
+
+      StructLikeSet deleteSet = Deletes.toEqualitySet(CloseableIterable.transform(records,
+          record -> new InternalRecordWrapper(deleteSchema.asStruct()).wrap(record)),

Review comment:
       We prefer line wrapping that keeps arguments to the same method aligned rather than aligning arguments to different levels. Here, the lambda to create an `InternalRecordWrapper` is an argument to `transform`, but it is aligned with `deleteSchema.asStruct()` that is an argument to the outer `toEqualitySet` call.
   
   Instead, can you add a newline for each argument to `toEqualitySet`? If the line with the lambda is too long, then you can also add a newline for it that it indented from the start of the line with `transform`, so it is clear that it is an argument to `transform` and not `toEqualitySet`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r712316670



##########
File path: flink/src/test/java/org/apache/iceberg/flink/SimpleDataUtil.java
##########
@@ -78,6 +80,11 @@ private SimpleDataUtil() {
       Types.NestedField.optional(2, "data", Types.StringType.get())
   );
 
+  public static final Schema DATE_SCHEMA = new Schema(
+          Types.NestedField.required(1, "dt", Types.DateType.get()),
+          Types.NestedField.optional(2, "data", Types.StringType.get())

Review comment:
       Indentation is off here as well.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#issuecomment-929392594






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#issuecomment-929392594


   Looks good to me. Running CI.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#issuecomment-929498841


   Thanks, @xloya! Merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#issuecomment-923611185


   Thanks @rdblue for your particularly review! I will fix them according to your suggestion these days


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#issuecomment-922814287


   @rdblue Hi, I had fixed the unit-test, could you please run the CI again? thx!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714427583



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);

Review comment:
       @rdblue Previously, the data file was written using Row.of(0), which means that all data is written to the 1970-01-01 partition. So all the equal delete data before is also written to Row.of(0). Now it is changed that the data is written to the corresponding date time partition, and then the equality delete data is also written to the corresponding partition.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714427583



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);

Review comment:
       @rdblue Previously, the data file was written using `Row.of(0)`, which means that all data is written to the `1970-01-01` partition. So all the equal delete data before is also written to `Row.of(0)`. Now it is changed that the data is written to the corresponding date time partition, and then the equality delete data is also written to the corresponding partition.
   If keep the data file written to the partition as `1970-01-01`, but the partition where the equality delete data is actually written is changed to `2021-09-01`, it will not work.
   Found spec on official website:
   [http://iceberg.apache.org/spec/](url)
   `Like data files, delete files are tracked by partition. In general, a delete file must be applied to older data files with the same partition; see Scan Planning for details. Column metrics can be used to determine whether a delete file’s rows overlap the contents of a data file or a scan range.`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] xloya commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
xloya commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714439735



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -330,6 +420,15 @@ private StructLikeSet rowSetWithoutIds(int... idsToRemove) {
     return set;
   }
 
+  private StructLikeSet rowSetWithoutIds(Set<Integer> idSet, Table iTable, List<Record> recordList) {
+    StructLikeSet set = StructLikeSet.create(iTable.schema().asStruct());
+    recordList.stream()
+        .filter(row -> !idSet.contains(row.getField("id")))
+        .map(record -> new InternalRecordWrapper(iTable.schema().asStruct()).wrap(record))
+        .forEach(set::add);
+    return set;
+  }

Review comment:
       ok i'll rewrite the origin rowSetWithoutIds() method




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3135: Data: Fix equality delete set save DATE/TIMESTAMP type of data throw IllegalStateException

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3135:
URL: https://github.com/apache/iceberg/pull/3135#discussion_r714190622



##########
File path: data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java
##########
@@ -125,6 +166,42 @@ public void testEqualityDeletes() throws IOException {
     Assert.assertEquals("Table should contain expected rows", expected, actual);
   }
 
+  @Test
+  public void testEqualityDateDeletes() throws IOException {
+    initTable2();
+
+    Schema deleteRowSchema = table2.schema().select("*");
+    Record dataDelete = GenericRecord.create(deleteRowSchema);
+    List<Record> dataDeletes = Lists.newArrayList(
+            dataDelete.copy("dt", LocalDate.parse("2021-09-01"), "data", "a", "id", 1),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-02"), "data", "b", "id", 2),
+            dataDelete.copy("dt", LocalDate.parse("2021-09-03"), "data", "c", "id", 3)
+    );
+
+    DeleteFile eqDeletes = FileHelpers.writeDeleteFile(
+            table2, Files.localOutput(temp.newFile()), Row.of(0), dataDeletes, deleteRowSchema);

Review comment:
       @xloya, what was the resolution to this? Were the deletes not applied? Was the test incorrect and not working?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org