Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/04/01 14:40:02 UTC

[GitHub] [iceberg] marton-bod opened a new pull request #2407: Hive: Synchronize equivalent HMS and Iceberg properties

marton-bod opened a new pull request #2407:
URL: https://github.com/apache/iceberg/pull/2407


   This is a PR for the proposal outlined in: https://github.com/apache/iceberg/pull/2367
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] marton-bod commented on a change in pull request #2407: Hive: Synchronize equivalent HMS and Iceberg properties

Posted by GitBox <gi...@apache.org>.
marton-bod commented on a change in pull request #2407:
URL: https://github.com/apache/iceberg/pull/2407#discussion_r608586651



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java
##########
@@ -86,6 +90,28 @@
       .impl(HiveMetaStoreClient.class, "alter_table",
           String.class, String.class, Table.class, EnvironmentContext.class)
       .build();
+  private static final BiMap<String, String> ICEBERG_TO_HMS_TRANSLATION = ImmutableBiMap.of(
+      // gc.enabled in Iceberg and external.table.purge in Hive are meant to do the same things but with different names
+      GC_ENABLED, "external.table.purge"
+  );
+
+
+  /**
+   * Provides key translation where necessary between Iceberg and HMS props. This translation is needed because some
+   * properties control the same behaviour but are named differently in Iceberg and Hive. Therefore changes to these
+   * property pairs should be synchronized.
+   *
+   * Example: Deleting data files upon DROP TABLE is enabled using gc.enabled=true in Iceberg and
+   * external.table.purge=true in Hive. Hive and Iceberg users are unaware of each other's control flags, therefore
+   * inconsistent behaviour can occur from e.g. a Hive user's point of view if external.table.purge=true is set on the
+   * HMS table but gc.enabled=false is set on the Iceberg table, resulting in no data file deletion.
+   *
+   * @param hmsProp The HMS property that should be translated to Iceberg property
+   * @return Iceberg property equivalent to the hmsProp. If no such translation exists, the original hmsProp is returned
+   */
+  public static String translateToIcebergProp(String hmsProp) {
+    return ICEBERG_TO_HMS_TRANSLATION.inverse().getOrDefault(hmsProp, hmsProp);

Review comment:
       It does not really have any cost. The first call to `inverse()` creates the `Inverse` object and caches it; later calls return the cached instance. The `Inverse` object is just a wrapper around the original data structure that reimplements the `get()` method to query the same data in the opposite direction.
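
       For illustration only, here is a minimal plain-Java sketch of the two-way key translation discussed above. It is not the PR's code and does not use Guava: it assumes a reverse map built once at class-load time stands in for the cached `inverse()` view, the class name `PropTranslationSketch` and the method `translateToHmsProp` are hypothetical, and the literal "gc.enabled" stands in for the `GC_ENABLED` constant.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the BiMap-based translation in the diff above.
class PropTranslationSketch {
  // Forward direction: Iceberg property name -> HMS property name.
  private static final Map<String, String> ICEBERG_TO_HMS = new HashMap<>();
  // Reverse direction, built once so every later lookup is a plain map get.
  private static final Map<String, String> HMS_TO_ICEBERG = new HashMap<>();

  static {
    // Assumption: "gc.enabled" is the value behind the GC_ENABLED constant.
    ICEBERG_TO_HMS.put("gc.enabled", "external.table.purge");
    for (Map.Entry<String, String> entry : ICEBERG_TO_HMS.entrySet()) {
      HMS_TO_ICEBERG.put(entry.getValue(), entry.getKey());
    }
  }

  // Returns the Iceberg name for an HMS property, or the input if unmapped.
  static String translateToIcebergProp(String hmsProp) {
    return HMS_TO_ICEBERG.getOrDefault(hmsProp, hmsProp);
  }

  // Hypothetical counterpart: HMS name for an Iceberg property, or the input.
  static String translateToHmsProp(String icebergProp) {
    return ICEBERG_TO_HMS.getOrDefault(icebergProp, icebergProp);
  }

  public static void main(String[] args) {
    System.out.println(translateToIcebergProp("external.table.purge"));
    System.out.println(translateToHmsProp("gc.enabled"));
    System.out.println(translateToIcebergProp("some.other.prop"));
  }
}
```

       In this sketch the reverse lookup is paid for once up front, whereas the BiMap in the PR creates its inverse view lazily on first use. In both cases repeated translations cost no more than an ordinary map lookup.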






[GitHub] [iceberg] pvary merged pull request #2407: Hive: Synchronize equivalent HMS and Iceberg properties

Posted by GitBox <gi...@apache.org>.
pvary merged pull request #2407:
URL: https://github.com/apache/iceberg/pull/2407


   




[GitHub] [iceberg] pvary commented on pull request #2407: Hive: Synchronize equivalent HMS and Iceberg properties

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #2407:
URL: https://github.com/apache/iceberg/pull/2407#issuecomment-816646939


   Thanks for the PR @marton-bod!




[GitHub] [iceberg] pvary commented on a change in pull request #2407: Hive: Synchronize equivalent HMS and Iceberg properties

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2407:
URL: https://github.com/apache/iceberg/pull/2407#discussion_r608361767



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java
##########
@@ -86,6 +90,28 @@
       .impl(HiveMetaStoreClient.class, "alter_table",
           String.class, String.class, Table.class, EnvironmentContext.class)
       .build();
+  private static final BiMap<String, String> ICEBERG_TO_HMS_TRANSLATION = ImmutableBiMap.of(
+      // gc.enabled in Iceberg and external.table.purge in Hive are meant to do the same things but with different names
+      GC_ENABLED, "external.table.purge"
+  );
+
+
+  /**
+   * Provides key translation where necessary between Iceberg and HMS props. This translation is needed because some
+   * properties control the same behaviour but are named differently in Iceberg and Hive. Therefore changes to these
+   * property pairs should be synchronized.
+   *
+   * Example: Deleting data files upon DROP TABLE is enabled using gc.enabled=true in Iceberg and
+   * external.table.purge=true in Hive. Hive and Iceberg users are unaware of each other's control flags, therefore
+   * inconsistent behaviour can occur from e.g. a Hive user's point of view if external.table.purge=true is set on the
+   * HMS table but gc.enabled=false is set on the Iceberg table, resulting in no data file deletion.
+   *
+   * @param hmsProp The HMS property that should be translated to Iceberg property
+   * @return Iceberg property equivalent to the hmsProp. If no such translation exists, the original hmsProp is returned
+   */
+  public static String translateToIcebergProp(String hmsProp) {
+    return ICEBERG_TO_HMS_TRANSLATION.inverse().getOrDefault(hmsProp, hmsProp);

Review comment:
       How costly is the `inverse()` call?






[GitHub] [iceberg] marton-bod commented on pull request #2407: Hive: Synchronize equivalent HMS and Iceberg properties

Posted by GitBox <gi...@apache.org>.
marton-bod commented on pull request #2407:
URL: https://github.com/apache/iceberg/pull/2407#issuecomment-811955112


   @aokolnychyi @rdblue @pvary Can you please review this?

