You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/10/01 15:11:06 UTC

[GitHub] [hive] marton-bod opened a new pull request #2696: HIVE-25587: Disable Iceberg table migration for unsupported source fi…

marton-bod opened a new pull request #2696:
URL: https://github.com/apache/hive/pull/2696


   Throw exception early for unsupported source table file formats.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvary commented on a change in pull request #2696: HIVE-25587: Disable Iceberg table migration for unsupported source fi…

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2696:
URL: https://github.com/apache/hive/pull/2696#discussion_r721048383



##########
File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergMigration.java
##########
@@ -133,6 +135,39 @@ public void testRollbackMigratePartitionedBucketedHiveTableToIceberg() throws TE
     validateMigrationRollback(tableName);
   }
 
+  @Test
+  public void testMigrationFailsForUnsupportedSourceFileFormat() {
+    // enough to test once
+    Assume.assumeTrue(fileFormat == FileFormat.ORC && isVectorized &&
+        testTableType == TestTables.TestTableType.HIVE_CATALOG);
+    String tableName = "tbl_unsupported";
+    List<String> formats = ImmutableList.of("TEXTFILE", "JSONFILE", "RCFILE", "SEQUENCEFILE");
+    formats.forEach(format -> {
+      shell.executeStatement("CREATE EXTERNAL TABLE " +  tableName + " (a int) STORED AS " + format + " " +
+          testTables.locationForCreateTableSQL(TableIdentifier.of("default", tableName)));
+      shell.executeStatement("INSERT INTO " + tableName + " VALUES (1), (2), (3)");
+      AssertHelpers.assertThrows("Migrating a " + format + " table to Iceberg should have thrown an exception.",
+          IllegalArgumentException.class, "Cannot convert hive table to iceberg with input format: ",
+          () -> shell.executeStatement("ALTER TABLE " + tableName + " SET TBLPROPERTIES " +
+              "('storage_handler'='org.apache.iceberg.mr.hive.HiveIcebergStorageHandler')"));
+      shell.executeStatement("DROP TABLE " + tableName);
+    });
+  }
+
+  @Test
+  public void testMigrationFailsForManagedTable() {
+    // enough to test once
+    Assume.assumeTrue(fileFormat == FileFormat.ORC && isVectorized &&

Review comment:
       Do we want to have a test for unsupported fileformat? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] marton-bod commented on a change in pull request #2696: HIVE-25587: Disable Iceberg table migration for unsupported source fi…

Posted by GitBox <gi...@apache.org>.
marton-bod commented on a change in pull request #2696:
URL: https://github.com/apache/hive/pull/2696#discussion_r721108666



##########
File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergMigration.java
##########
@@ -133,6 +135,39 @@ public void testRollbackMigratePartitionedBucketedHiveTableToIceberg() throws TE
     validateMigrationRollback(tableName);
   }
 
+  @Test
+  public void testMigrationFailsForUnsupportedSourceFileFormat() {
+    // enough to test once
+    Assume.assumeTrue(fileFormat == FileFormat.ORC && isVectorized &&
+        testTableType == TestTables.TestTableType.HIVE_CATALOG);
+    String tableName = "tbl_unsupported";
+    List<String> formats = ImmutableList.of("TEXTFILE", "JSONFILE", "RCFILE", "SEQUENCEFILE");
+    formats.forEach(format -> {
+      shell.executeStatement("CREATE EXTERNAL TABLE " +  tableName + " (a int) STORED AS " + format + " " +
+          testTables.locationForCreateTableSQL(TableIdentifier.of("default", tableName)));
+      shell.executeStatement("INSERT INTO " + tableName + " VALUES (1), (2), (3)");
+      AssertHelpers.assertThrows("Migrating a " + format + " table to Iceberg should have thrown an exception.",
+          IllegalArgumentException.class, "Cannot convert hive table to iceberg with input format: ",
+          () -> shell.executeStatement("ALTER TABLE " + tableName + " SET TBLPROPERTIES " +
+              "('storage_handler'='org.apache.iceberg.mr.hive.HiveIcebergStorageHandler')"));
+      shell.executeStatement("DROP TABLE " + tableName);
+    });
+  }
+
+  @Test
+  public void testMigrationFailsForManagedTable() {
+    // enough to test once
+    Assume.assumeTrue(fileFormat == FileFormat.ORC && isVectorized &&

Review comment:
       I think `testMigrationFailsForUnsupportedSourceFileFormat` already provides tests for unsupported source table file formats. Or do you mean having a test where the source table is managed AND its format is unsupported at the same time?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvary commented on a change in pull request #2696: HIVE-25587: Disable Iceberg table migration for unsupported source fi…

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2696:
URL: https://github.com/apache/hive/pull/2696#discussion_r720433389



##########
File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##########
@@ -234,16 +240,16 @@ public void preAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTable, E
       icebergTable = IcebergTableUtil.getTable(conf, catalogProperties);
     } catch (NoSuchTableException nte) {
       context.getProperties().put(MIGRATE_HIVE_TO_ICEBERG, "true");
-      // If the iceberg table does not exist, and the hms table is external and not temporary and not acid
-      // we will create it in commitAlterTable
-      StorageDescriptor sd = hmsTable.getSd();
-      canMigrateHiveTable = MetaStoreUtils.isExternalTable(hmsTable) && !hmsTable.isTemporary() &&
-          !AcidUtils.isTransactionalTable(hmsTable);
-      if (!canMigrateHiveTable) {
-        throw new MetaException("Converting non-external, temporary or transactional hive table to iceberg " +
-            "table is not allowed.");
-      }
+      // If the iceberg table does not exist, and the hms table is:
+      // - external

Review comment:
       Maybe move this to the javadoc of the method 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] marton-bod commented on a change in pull request #2696: HIVE-25587: Disable Iceberg table migration for unsupported source fi…

Posted by GitBox <gi...@apache.org>.
marton-bod commented on a change in pull request #2696:
URL: https://github.com/apache/hive/pull/2696#discussion_r721291904



##########
File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##########
@@ -95,14 +98,17 @@
       AlterTableType.ADDCOLS, AlterTableType.REPLACE_COLUMNS, AlterTableType.RENAME_COLUMN,
       AlterTableType.ADDPROPS, AlterTableType.DROPPROPS, AlterTableType.SETPARTITIONSPEC,
       AlterTableType.UPDATE_COLUMNS);
+  private static final List<String> MIGRATION_ALLOWED_SOURCE_FORMATS = Arrays.stream(FileFormat.values())
+      .filter(f -> !f.equals(FileFormat.METADATA))
+      .map(FileFormat::name).map(String::toLowerCase).collect(Collectors.toList());

Review comment:
       Good point, will change this to list the 3 supported types only and explicitly




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pvary commented on a change in pull request #2696: HIVE-25587: Disable Iceberg table migration for unsupported source fi…

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2696:
URL: https://github.com/apache/hive/pull/2696#discussion_r721110740



##########
File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergMigration.java
##########
@@ -133,6 +135,39 @@ public void testRollbackMigratePartitionedBucketedHiveTableToIceberg() throws TE
     validateMigrationRollback(tableName);
   }
 
+  @Test
+  public void testMigrationFailsForUnsupportedSourceFileFormat() {
+    // enough to test once
+    Assume.assumeTrue(fileFormat == FileFormat.ORC && isVectorized &&
+        testTableType == TestTables.TestTableType.HIVE_CATALOG);
+    String tableName = "tbl_unsupported";
+    List<String> formats = ImmutableList.of("TEXTFILE", "JSONFILE", "RCFILE", "SEQUENCEFILE");
+    formats.forEach(format -> {
+      shell.executeStatement("CREATE EXTERNAL TABLE " +  tableName + " (a int) STORED AS " + format + " " +
+          testTables.locationForCreateTableSQL(TableIdentifier.of("default", tableName)));
+      shell.executeStatement("INSERT INTO " + tableName + " VALUES (1), (2), (3)");
+      AssertHelpers.assertThrows("Migrating a " + format + " table to Iceberg should have thrown an exception.",
+          IllegalArgumentException.class, "Cannot convert hive table to iceberg with input format: ",
+          () -> shell.executeStatement("ALTER TABLE " + tableName + " SET TBLPROPERTIES " +
+              "('storage_handler'='org.apache.iceberg.mr.hive.HiveIcebergStorageHandler')"));
+      shell.executeStatement("DROP TABLE " + tableName);
+    });
+  }
+
+  @Test
+  public void testMigrationFailsForManagedTable() {
+    // enough to test once
+    Assume.assumeTrue(fileFormat == FileFormat.ORC && isVectorized &&

Review comment:
       Nope - I did not know we have `testMigrationFailsForUnsupportedSourceFileFormat`. Please disregard my comment




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] szlta commented on a change in pull request #2696: HIVE-25587: Disable Iceberg table migration for unsupported source fi…

Posted by GitBox <gi...@apache.org>.
szlta commented on a change in pull request #2696:
URL: https://github.com/apache/hive/pull/2696#discussion_r721290259



##########
File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##########
@@ -95,14 +98,17 @@
       AlterTableType.ADDCOLS, AlterTableType.REPLACE_COLUMNS, AlterTableType.RENAME_COLUMN,
       AlterTableType.ADDPROPS, AlterTableType.DROPPROPS, AlterTableType.SETPARTITIONSPEC,
       AlterTableType.UPDATE_COLUMNS);
+  private static final List<String> MIGRATION_ALLOWED_SOURCE_FORMATS = Arrays.stream(FileFormat.values())
+      .filter(f -> !f.equals(FileFormat.METADATA))
+      .map(FileFormat::name).map(String::toLowerCase).collect(Collectors.toList());

Review comment:
       I think it would probably be better to list the formats instead i.e. Lists.newArrayList(FileFormat.PARQUET, .. etc). That way it's more readable and safer from a very unlikely scenario a new enum value is added in the future.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] marton-bod commented on a change in pull request #2696: HIVE-25587: Disable Iceberg table migration for unsupported source fi…

Posted by GitBox <gi...@apache.org>.
marton-bod commented on a change in pull request #2696:
URL: https://github.com/apache/hive/pull/2696#discussion_r720709682



##########
File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##########
@@ -234,16 +240,16 @@ public void preAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTable, E
       icebergTable = IcebergTableUtil.getTable(conf, catalogProperties);
     } catch (NoSuchTableException nte) {
       context.getProperties().put(MIGRATE_HIVE_TO_ICEBERG, "true");
-      // If the iceberg table does not exist, and the hms table is external and not temporary and not acid
-      // we will create it in commitAlterTable
-      StorageDescriptor sd = hmsTable.getSd();
-      canMigrateHiveTable = MetaStoreUtils.isExternalTable(hmsTable) && !hmsTable.isTemporary() &&
-          !AcidUtils.isTransactionalTable(hmsTable);
-      if (!canMigrateHiveTable) {
-        throw new MetaException("Converting non-external, temporary or transactional hive table to iceberg " +
-            "table is not allowed.");
-      }
+      // If the iceberg table does not exist, and the hms table is:
+      // - external

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] marton-bod merged pull request #2696: HIVE-25587: Disable Iceberg table migration for unsupported source fi…

Posted by GitBox <gi...@apache.org>.
marton-bod merged pull request #2696:
URL: https://github.com/apache/hive/pull/2696


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org