You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/09/14 13:26:42 UTC

[GitHub] [iceberg] gaborkaszab opened a new pull request, #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

gaborkaszab opened a new pull request, #5763:
URL: https://github.com/apache/iceberg/pull/5763

   When HiveCatalog is used for creating a table the table owner property is taken from a system property of the process running Iceberg. However, in some clients (e.g. Apache Impala) it's not the owner of the process who runs the create table operation and as a result not the desired owner gets written to HMS.
   For instance, in Impala's case the owner of Catalogd (the process that makes the Iceberg API calls) was set as the table owner and not the user who ran the CREATE TABLE statement.
   
   I changed HiveCatalog to accept an owner parameter from the clients. Note, other Catalog implementations aren't affected because the issue is related to HMS only.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] gaborkaszab commented on pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
gaborkaszab commented on PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#issuecomment-1276248993

   > Is there a way to do this without a change in the core?
   
   Thanks for taking a look, @pvary ! After changing my patch as Fokko suggested there is still a line of change in core: Adding the constant for the new tableproperty. Do you think this can remain or should I find another place in hive-metastore?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
Fokko commented on PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#issuecomment-1265518178

   In PyIceberg we had a similar concern. We ended up by fetching username from the properties: https://github.com/apache/iceberg/blob/master/python/pyiceberg/catalog/hive.py#L317 Similar to what we do for location (`METADATA_LOCATION_PROP`).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
pvary commented on PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#issuecomment-1286484365

   Thanks for the PR @gaborkaszab! 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a diff in pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
pvary commented on code in PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#discussion_r996999861


##########
hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java:
##########
@@ -247,12 +241,29 @@ public void testReplaceTxnBuilder() throws Exception {
     }
   }
 
+  @Test
+  public void testCreateTableWithOwner() throws Exception {
+    Schema schema = getTestSchema();
+    PartitionSpec spec = PartitionSpec.builderFor(schema).bucket("data", 16).build();
+    TableIdentifier tableIdent = TableIdentifier.of(DB_NAME, "tbl");
+    String location = temp.newFolder("tbl").toString();
+    String owner = "some_owner";
+    ImmutableMap<String, String> properties =

Review Comment:
   Does this also write the username to the HMS table property list?
   Is this what we expect?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] gaborkaszab commented on pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
gaborkaszab commented on PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#issuecomment-1285361971

   > > I think the broken build here is due to a flaky test in TestSnapshotUtil. [62eb74c](https://github.com/apache/iceberg/commit/62eb74c5571c856bae5b32cacb6fb1ec4f380da9) this is meant to fix it but wasn't part of this build. Is there a way to re-execute the tests on this PR? (without pushing any new code to this review)
   > 
   > I would just change the comment of the last commit and force push to the branch
   
   thx. a rebase and a re-run helped.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] gaborkaszab commented on a diff in pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
gaborkaszab commented on code in PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#discussion_r998231170


##########
hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java:
##########
@@ -247,12 +241,29 @@ public void testReplaceTxnBuilder() throws Exception {
     }
   }
 
+  @Test
+  public void testCreateTableWithOwner() throws Exception {
+    Schema schema = getTestSchema();
+    PartitionSpec spec = PartitionSpec.builderFor(schema).bucket("data", 16).build();
+    TableIdentifier tableIdent = TableIdentifier.of(DB_NAME, "tbl");
+    String location = temp.newFolder("tbl").toString();
+    String owner = "some_owner";
+    ImmutableMap<String, String> properties =

Review Comment:
   > Does this also write the username to the HMS table property list? Is this what we expect?
   
   Good point! This was also added ad an HMS table property but I think it's not needed. I changed the code to only set the owner of the table but not to add the owner field as a table property to HMS.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] gaborkaszab commented on pull request #5763: Hive: Set the Table owner on table creation

Posted by GitBox <gi...@apache.org>.
gaborkaszab commented on PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#issuecomment-1286625972

   Thanks for reviewing @pvary and @Fokko!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] gaborkaszab commented on pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
gaborkaszab commented on PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#issuecomment-1276245441

   > In PyIceberg we had a similar concern. We ended up by fetching username from the properties: https://github.com/apache/iceberg/blob/master/python/pyiceberg/catalog/hive.py#L317 Similar to what we do for location (`METADATA_LOCATION_PROP`).
   
   Thanks for taking a look, @Fokko ! For some reason I figured that using a table property to provide the owner is not the proper way to go. However, I changed my patch to use a table property as you suggested. This frankly seems a way simpler implementation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
pvary commented on PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#issuecomment-1285082782

   > I think the broken build here is due to a flaky test in TestSnapshotUtil. [62eb74c](https://github.com/apache/iceberg/commit/62eb74c5571c856bae5b32cacb6fb1ec4f380da9) this is meant to fix it but wasn't part of this build. Is there a way to re-execute the tests on this PR? (without pushing any new code to this review)
   
   I would just change the comment of the last commit and force push to the branch


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] gaborkaszab commented on a diff in pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
gaborkaszab commented on code in PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#discussion_r999451973


##########
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java:
##########
@@ -460,6 +462,11 @@ private void setHmsTableParameters(
               String hmsKey = ICEBERG_TO_HMS_TRANSLATION.getOrDefault(key, key);
               parameters.put(hmsKey, value);
             });
+
+    // 'hms_table_owner' is only used for setting the owner of the table during table creation. No
+    // need to add it as a table property to HMS.
+    parameters.remove(TableProperties.HMS_TABLE_OWNER);

Review Comment:
   Nothing specific. Moved this to a filter() before the foreach()



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary merged pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
pvary merged PR #5763:
URL: https://github.com/apache/iceberg/pull/5763


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
pvary commented on PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#issuecomment-1263119026

   Is there a way to do this without a change in the core?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] gaborkaszab commented on pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
gaborkaszab commented on PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#issuecomment-1262383290

   This didn't get much attention I'm afraid. @pvary would you mind taking a quick look?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a diff in pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
pvary commented on code in PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#discussion_r999336319


##########
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java:
##########
@@ -460,6 +462,11 @@ private void setHmsTableParameters(
               String hmsKey = ICEBERG_TO_HMS_TRANSLATION.getOrDefault(key, key);
               parameters.put(hmsKey, value);
             });
+
+    // 'hms_table_owner' is only used for setting the owner of the table during table creation. No
+    // need to add it as a table property to HMS.
+    parameters.remove(TableProperties.HMS_TABLE_OWNER);

Review Comment:
   Any reason why you did not put this into the previous foreach?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] gaborkaszab commented on pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
gaborkaszab commented on PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#issuecomment-1284114749

   I think the broken build here is due to a flaky test in TestSnapshotUtil. https://github.com/apache/iceberg/commit/62eb74c5571c856bae5b32cacb6fb1ec4f380da9 this is meant to fix it but wasn't part of this build.
   Is there a way to re-execute the tests on this PR? (without pushing any new code to this review)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #5763: Core, Hive-metastore: Table owner can be incorrect in HMS

Posted by GitBox <gi...@apache.org>.
pvary commented on PR #5763:
URL: https://github.com/apache/iceberg/pull/5763#issuecomment-1285079618

   @gaborkaszab: Could you please check the failed test? Otherwise LGTM +1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org