Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/09/16 13:02:52 UTC

[GitHub] [hive] pvary commented on a change in pull request #2644: HIVE-25529: Test reading/writing V2 tables with delete files

pvary commented on a change in pull request #2644:
URL: https://github.com/apache/hive/pull/2644#discussion_r710095882



##########
File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/HiveIcebergTestUtils.java
##########
@@ -299,4 +311,68 @@ public static void validateDataWithSQL(TestHiveShell shell, String tableName, Li
       }
     }
   }
+
+  /**
+   * @param table The table to create the delete file for
+   * @param deleteFilePath The path where the delete file should be created, relative to the table location root
+   * @param equalityFields List of field names that should play a role in the equality check
+   * @param fileFormat The file format that should be used for writing out the delete file
+   * @param rowsToDelete The rows that should be deleted. It's enough to fill out the fields that are relevant for the
+   *                     equality check, as listed in equalityFields; the rest of the fields are ignored
+   * @return The DeleteFile created
+   * @throws IOException If there is an error during DeleteFile write
+   */
+  public static DeleteFile createEqualityDeleteFile(Table table, String deleteFilePath, List<String> equalityFields,
+      FileFormat fileFormat, List<Record> rowsToDelete) throws IOException {
+    List<Integer> equalityFieldIds = equalityFields.stream()
+        .map(name -> table.schema().findField(name).fieldId())
+        .collect(Collectors.toList());
+    Schema eqDeleteRowSchema = table.schema().select(equalityFields.toArray(new String[]{}));
+
+    FileAppenderFactory<Record> appenderFactory = new GenericAppenderFactory(table.schema(), table.spec(),
+        ArrayUtil.toIntArray(equalityFieldIds), eqDeleteRowSchema, null);
+    EncryptedOutputFile outputFile = table.encryption().encrypt(HadoopOutputFile.fromPath(
+        new org.apache.hadoop.fs.Path(table.location(), deleteFilePath), new Configuration()));
+
+    PartitionKey part = new PartitionKey(table.spec(), eqDeleteRowSchema);
+    part.partition(rowsToDelete.get(0));
+    EqualityDeleteWriter<Record> eqWriter = appenderFactory.newEqDeleteWriter(outputFile, fileFormat, part);
+    try (EqualityDeleteWriter<Record> writer = eqWriter) {
+      writer.deleteAll(rowsToDelete);
+    }
+    return eqWriter.toDeleteFile();
+  }
+
+  /**
+   * @param table The table to create the delete file for
+   * @param deleteFilePath The path where the delete file should be created, relative to the table location root
+   * @param fileFormat The file format that should be used for writing out the delete file
+   * @param partitionValues A map of partition values (partitionKey=partitionVal, ...) to be used for the delete file
+   * @param deletes The list of position deletes, each containing the data file path, the position of the row in the
+   *                data file, and the row itself that should be deleted
+   * @return The DeleteFile created
+   * @throws IOException If there is an error during DeleteFile write
+   */
+  public static DeleteFile createPositionalDeleteFile(Table table, String deleteFilePath, FileFormat fileFormat,

Review comment:
       Thx for the investigation
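
For context, here is a minimal usage sketch of the equality-delete helper quoted above. It assumes an unpartitioned test table with a long "id" column; the table, column, and file path are hypothetical, not taken from the PR. Committing the resulting DeleteFile through a RowDelta is the standard Iceberg API for making the deletes visible to readers:

import java.util.Collections;
import org.apache.iceberg.DeleteFile;
import org.apache.iceberg.FileFormat;
import org.apache.iceberg.data.GenericRecord;
import org.apache.iceberg.data.Record;

// Hypothetical usage: only the equality fields ("id") need to be populated;
// the remaining columns of the record are ignored by the helper.
Record toDelete = GenericRecord.create(table.schema());
toDelete.setField("id", 42L);

DeleteFile deleteFile = HiveIcebergTestUtils.createEqualityDeleteFile(
    table, "data/eq-delete-file-1", Collections.singletonList("id"),
    FileFormat.PARQUET, Collections.singletonList(toDelete));

// Commit the delete file so subsequent scans filter out rows with id = 42.
table.newRowDelta().addDeletes(deleteFile).commit();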

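The body of createPositionalDeleteFile is cut off at the review anchor, so the snippet below is only an assumption about how such a writer is typically built with the standard Iceberg API (GenericAppenderFactory takes the row schema as its posDeleteRowSchema argument instead of the equality-field arguments), not the PR's actual implementation. dataFilePath, rowPosition, and rowToDelete are hypothetical placeholders:

import org.apache.iceberg.deletes.PositionDeleteWriter;

// Factory configured for position deletes: no equality fields, row schema last.
FileAppenderFactory<Record> factory = new GenericAppenderFactory(
    table.schema(), table.spec(), null, null, table.schema());
EncryptedOutputFile out = table.encryption().encrypt(HadoopOutputFile.fromPath(
    new org.apache.hadoop.fs.Path(table.location(), deleteFilePath), new Configuration()));

// Passing a null partition key assumes an unpartitioned table.
PositionDeleteWriter<Record> posWriter = factory.newPosDeleteWriter(out, fileFormat, null);
try (PositionDeleteWriter<Record> writer = posWriter) {
  // Each entry names the data file, the row's ordinal position in that file,
  // and (optionally, when a row schema was configured) the row content itself.
  writer.delete(dataFilePath, rowPosition, rowToDelete);
}
return posWriter.toDeleteFile();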



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


