Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/01/04 15:23:58 UTC

[GitHub] [iceberg] marton-bod opened a new pull request #2025: Set Input and OutputFormat in the hive meta hook

marton-bod opened a new pull request #2025:
URL: https://github.com/apache/iceberg/pull/2025


   Impala retrieves the Input and OutputFormat classes directly from the HMS properties, not from the StorageHandler. Therefore, we need to set these values explicitly during the Hive metahook operations, so that they are present even when the table is created using Hive DDL. The analogous change in the Iceberg API was added in this commit: https://github.com/apache/iceberg/commit/45caac4927fedc8313aba79d49f5911eccee2474
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #2025: Set Input and OutputFormat in the hive meta hook

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #2025:
URL: https://github.com/apache/iceberg/pull/2025#issuecomment-754823123


   Thanks for solving this, @marton-bod and @pvary!




[GitHub] [iceberg] rdblue merged pull request #2025: Set Input and OutputFormat in the hive meta hook

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #2025:
URL: https://github.com/apache/iceberg/pull/2025


   




[GitHub] [iceberg] pvary commented on a change in pull request #2025: Set Input and OutputFormat in the hive meta hook

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2025:
URL: https://github.com/apache/iceberg/pull/2025#discussion_r551849275



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##########
@@ -114,6 +114,10 @@ public void preCreateTable(org.apache.hadoop.hive.metastore.api.Table hmsTable)
       hmsTable.getParameters().put(InputFormatConfig.EXTERNAL_TABLE_PURGE, "TRUE");
     }
 
+    // For a table created by Hive DDL to be readable by Impala, we need the Input and OutputFormat set explicitly
+    hmsTable.getSd().setInputFormat(HiveIcebergInputFormat.class.getCanonicalName());
+    hmsTable.getSd().setOutputFormat(HiveIcebergOutputFormat.class.getCanonicalName());
+

Review comment:
       Discussed this with @marton-bod.
   
   After @boroknagyz's PR (#1751), `hive.engine.enabled` drives the value of these properties for HiveCatalog tables.
   
   For Hive tables on top of Iceberg tables stored in other Iceberg Catalogs, these values are still not set. So it might be worth moving this change to the non-HiveCatalog codepath here:
   https://github.com/apache/iceberg/blob/05752965f5eb453d895db44fb2d072d270646644/mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java#L78-L81
   
   This would allow Impala to read tables created through Hive but stored in non-HiveCatalogs.
   
   For tables created from Hive and stored in HiveCatalog, we set `hive.engine.enabled` to `true` here:
   https://github.com/apache/iceberg/blob/05752965f5eb453d895db44fb2d072d270646644/mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java#L133-L137
   
   Setting `hive.engine.enabled` for other Iceberg Catalogs does not have any effect, so I think we are good there.
   
   Thanks,
   Peter
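
The split described in this comment can be sketched roughly as follows. This is a simplified model, not the code in `HiveIcebergMetaHook`: plain maps stand in for the HMS table parameters and the storage descriptor, the class `MetaHookSketch` and the `storedInHiveCatalog` argument are hypothetical, the property key is spelled as in the comment (the exact key in Iceberg may differ), and the fully qualified format class names are assumed from the file path in the diff.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the two codepaths discussed in the review.
class MetaHookSketch {
  // Key as written in the review comment; verify the real key in Iceberg before relying on it.
  static final String ENGINE_FLAG = "hive.engine.enabled";
  // Class names assumed from the package of HiveIcebergMetaHook shown in the diff.
  static final String INPUT_FORMAT = "org.apache.iceberg.mr.hive.HiveIcebergInputFormat";
  static final String OUTPUT_FORMAT = "org.apache.iceberg.mr.hive.HiveIcebergOutputFormat";

  static void preCreateTable(boolean storedInHiveCatalog,
                             Map<String, String> parameters,
                             Map<String, String> storageDescriptor) {
    if (storedInHiveCatalog) {
      // HiveCatalog path: only set the flag; the formats are derived from it later.
      parameters.put(ENGINE_FLAG, "true");
    } else {
      // Other catalogs: the flag has no effect, so set the formats explicitly.
      storageDescriptor.put("inputFormat", INPUT_FORMAT);
      storageDescriptor.put("outputFormat", OUTPUT_FORMAT);
    }
  }

  public static void main(String[] args) {
    Map<String, String> parameters = new HashMap<>();
    Map<String, String> storageDescriptor = new HashMap<>();
    preCreateTable(false, parameters, storageDescriptor);
    System.out.println(storageDescriptor.get("inputFormat"));
  }
}
```

With this split, a non-HiveCatalog table gets both formats written into HMS at creation time, which is what Impala needs, while a HiveCatalog table carries only the flag.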






[GitHub] [iceberg] rdblue commented on a change in pull request #2025: Set Input and OutputFormat in the hive meta hook

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2025:
URL: https://github.com/apache/iceberg/pull/2025#discussion_r551458894



##########
File path: mr/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##########
@@ -114,6 +114,10 @@ public void preCreateTable(org.apache.hadoop.hive.metastore.api.Table hmsTable)
       hmsTable.getParameters().put(InputFormatConfig.EXTERNAL_TABLE_PURGE, "TRUE");
     }
 
+    // For a table created by Hive DDL to be readable by Impala, we need the Input and OutputFormat set explicitly
+    hmsTable.getSd().setInputFormat(HiveIcebergInputFormat.class.getCanonicalName());
+    hmsTable.getSd().setOutputFormat(HiveIcebergOutputFormat.class.getCanonicalName());
+

Review comment:
       These are going to be overwritten by `HiveTableOperations` depending on whether Hive is enabled. I think the right fix is to set the property to enable Hive on the table when creating an Iceberg table with the meta hook. @pvary, what do you think?
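
A minimal sketch of why a value set in the meta hook would not survive for HiveCatalog tables, assuming the commit step always rewrites the formats from the flag. Everything here is illustrative: `OverwriteSketch` and `commit` are hypothetical stand-ins for what `HiveTableOperations` does, the property key is spelled as elsewhere in this thread, and the generic fallback format name is an assumed placeholder.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: the commit step derives the format from the flag,
// ignoring whatever the meta hook wrote into the storage descriptor.
class OverwriteSketch {
  static final String ENGINE_FLAG = "hive.engine.enabled"; // key as spelled in this thread
  static final String ICEBERG_INPUT = "org.apache.iceberg.mr.hive.HiveIcebergInputFormat";
  static final String GENERIC_INPUT = "org.apache.hadoop.mapred.FileInputFormat"; // assumed fallback

  // Stand-in for the commit performed by the table operations.
  static void commit(Map<String, String> parameters, Map<String, String> storageDescriptor) {
    boolean hiveEnabled = Boolean.parseBoolean(parameters.getOrDefault(ENGINE_FLAG, "false"));
    storageDescriptor.put("inputFormat", hiveEnabled ? ICEBERG_INPUT : GENERIC_INPUT);
  }

  public static void main(String[] args) {
    Map<String, String> parameters = new HashMap<>();
    Map<String, String> storageDescriptor = new HashMap<>();

    // The meta hook sets the format explicitly, but the flag is not set.
    storageDescriptor.put("inputFormat", ICEBERG_INPUT);
    commit(parameters, storageDescriptor);
    System.out.println("without flag: " + storageDescriptor.get("inputFormat"));

    // Setting the flag instead makes the commit step choose the Iceberg format.
    parameters.put(ENGINE_FLAG, "true");
    commit(parameters, storageDescriptor);
    System.out.println("with flag:    " + storageDescriptor.get("inputFormat"));
  }
}
```

Under these assumptions, enabling the flag at table creation is the change that sticks, which matches the fix proposed in the comment above.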



