You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/19 12:27:20 UTC

[GitHub] [iceberg] grbinho opened a new pull request, #6223: Use provided glue catalog id in defaultWarehouseLocation

grbinho opened a new pull request, #6223:
URL: https://github.com/apache/iceberg/pull/6223

   `defaultWarehouseLocation` call to `glue.getDatabase` did not set catalog id while building a request.  
   This results in `software.amazon.awssdk.services.glue.model.EntityNotFoundException: Database my_database not found.` error when table is being created in another AWS account.
   
   More details are in https://github.com/apache/iceberg/issues/6215  
   
   @singhpk234 
   If you can take a look, it would be appreciated.  
   I did not add any tests for it since I saw that similar tests for other methods do not exist.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] grbinho commented on pull request #6223: AWS: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
grbinho commented on PR #6223:
URL: https://github.com/apache/iceberg/pull/6223#issuecomment-1321539524

   @ajantha-bhat Can you please do another quick check. I changed how `AwsProperties` are initialized in the test. New way looks cleaner to me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6223: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on code in PR #6223:
URL: https://github.com/apache/iceberg/pull/6223#discussion_r1027619399


##########
aws/src/test/java/org/apache/iceberg/aws/glue/TestGlueCatalog.java:
##########
@@ -144,6 +144,32 @@ public void testDefaultWarehouseLocationDbUri() {
     Assert.assertEquals("s3://bucket2/db/table", location);
   }
 
+  @Test
+  public void testDefaultWarehouseLocationCustomCatalogId() {

Review Comment:
   I presume it is a single account. 
   The Integration testcase can be executed by exporting the mentioned environment variables (https://github.com/apache/iceberg/pull/4855)
   
   Thanks for the clarification. may be no need of catalog id for these tests. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] grbinho commented on pull request #6223: AWS: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
grbinho commented on PR #6223:
URL: https://github.com/apache/iceberg/pull/6223#issuecomment-1323723030

   @ajantha-bhat @singhpk234 
   Do you need anything else from me to get this merged?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #6223: AWS: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #6223:
URL: https://github.com/apache/iceberg/pull/6223#issuecomment-1356906835

   Running CI.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] ajantha-bhat commented on pull request #6223: AWS: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on PR #6223:
URL: https://github.com/apache/iceberg/pull/6223#issuecomment-1321577620

   > @ajantha-bhat Can you please do another quick check. I changed how AwsProperties are initialized in the test. New way looks cleaner to me.
   
   LGTM


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #6223: AWS: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
amogh-jahagirdar commented on code in PR #6223:
URL: https://github.com/apache/iceberg/pull/6223#discussion_r1031954363


##########
aws/src/test/java/org/apache/iceberg/aws/glue/TestGlueCatalog.java:
##########
@@ -144,6 +144,32 @@ public void testDefaultWarehouseLocationDbUri() {
     Assert.assertEquals("s3://bucket2/db/table", location);
   }
 
+  @Test
+  public void testDefaultWarehouseLocationCustomCatalogId() {
+    GlueCatalog catalogWithCustomCatalogId = new GlueCatalog();
+    String catalogId = "myCatalogId";
+    AwsProperties awsProperties = new AwsProperties();
+    awsProperties.setGlueCatalogId(catalogId);
+    catalogWithCustomCatalogId.initialize(
+        CATALOG_NAME,
+        WAREHOUSE_PATH + "/",
+        awsProperties,
+        glue,
+        LockManagers.defaultLockManager(),
+        null,
+        ImmutableMap.of());
+
+    Mockito.doReturn(
+            GetDatabaseResponse.builder()
+                .database(Database.builder().name("db").locationUri("s3://bucket2/db").build())
+                .build())
+        .when(glue)
+        .getDatabase(Mockito.any(GetDatabaseRequest.class));
+    catalogWithCustomCatalogId.defaultWarehouseLocation(TableIdentifier.of("db", "table"));
+    Mockito.verify(glue)
+        .getDatabase(Mockito.argThat((GetDatabaseRequest req) -> req.catalogId() == catalogId));

Review Comment:
   Since catalogId is a string can we do the equality check via .equals?
   
   ```
   Mockito.argThat((GetDatabaseRequest req) -> catalogId.equals(req.catalogId()))
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] grbinho commented on pull request #6223: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
grbinho commented on PR #6223:
URL: https://github.com/apache/iceberg/pull/6223#issuecomment-1321089167

   > LGTM ! Thanks @grbinho .
   > 
   > You can add a UT for now it by mocking GetDatabase request and making sure catalogId passed in it. can ref this [existing test](https://github.com/apache/iceberg/blob/master/aws/src/test/java/org/apache/iceberg/aws/glue/TestGlueCatalog.java#L136)
   > 
   > cc @jackye1995
   
   I added a unit test with `AwsProperties.GLUE_CATALOG_ID` property set in `AwsProperties`.   
   This seemed like the way to go with this one since test is supposed to validate that property is actually used in the method.  
   I'm not that experienced with Java and Mockito, so feedback on the correct way to do this is welcome!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] grbinho commented on a diff in pull request #6223: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
grbinho commented on code in PR #6223:
URL: https://github.com/apache/iceberg/pull/6223#discussion_r1027618072


##########
aws/src/test/java/org/apache/iceberg/aws/glue/TestGlueCatalog.java:
##########
@@ -144,6 +144,32 @@ public void testDefaultWarehouseLocationDbUri() {
     Assert.assertEquals("s3://bucket2/db/table", location);
   }
 
+  @Test
+  public void testDefaultWarehouseLocationCustomCatalogId() {

Review Comment:
   Are there multiple AWS accounts available for integration tests?  
   I could not find that setup either in `AwsIntegTestUtil` or `TestAssumeRoleAwsClientFactory` (as and obvious potential cross account example). 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] grbinho commented on pull request #6223: AWS: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
grbinho commented on PR #6223:
URL: https://github.com/apache/iceberg/pull/6223#issuecomment-1328579748

   > I wonder if we shall also add catalogId to these two requests (may be in another PR)
   
   For `createGlueTempTableIfNecessary` I'm not sure that is necessary since it's talking about Lake Formation.  
   Usually with Lake Formation cross account access is handled by Lake Formation (and resource shares).
   In normal deployments tables would always be created in your "local" catalog and then shared with others and/or central catalog.
   
   That said, I'm not that familiar with the code base enough to be sure it's unnecessary.  
   
   Maybe @amogh-jahagirdar knows?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] grbinho commented on pull request #6223: AWS: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
grbinho commented on PR #6223:
URL: https://github.com/apache/iceberg/pull/6223#issuecomment-1352975547

   @JonasJ-ap @ajantha-bhat 
   Hi guys, can you advise how we can move this MR forward?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #6223: AWS: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #6223:
URL: https://github.com/apache/iceberg/pull/6223#issuecomment-1358021713

   Thanks, @grbinho!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #6223: AWS: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
rdblue merged PR #6223:
URL: https://github.com/apache/iceberg/pull/6223


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] ajantha-bhat commented on a diff in pull request #6223: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on code in PR #6223:
URL: https://github.com/apache/iceberg/pull/6223#discussion_r1027585464


##########
aws/src/test/java/org/apache/iceberg/aws/glue/TestGlueCatalog.java:
##########
@@ -144,6 +144,32 @@ public void testDefaultWarehouseLocationDbUri() {
     Assert.assertEquals("s3://bucket2/db/table", location);
   }
 
+  @Test
+  public void testDefaultWarehouseLocationCustomCatalogId() {

Review Comment:
   The integration tests in `TestGlueCatalogNamespace` also don't set the catalogId, do we need to set it in those files also? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] ajantha-bhat commented on pull request #6223: AWS: Use provided glue catalog id in defaultWarehouseLocation

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on PR #6223:
URL: https://github.com/apache/iceberg/pull/6223#issuecomment-1323728710

   cc: @Fokko, @rdblue, @RussellSpitzer   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org