You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/01/28 23:24:52 UTC

[GitHub] [iceberg] rdblue opened a new pull request #2180: Fix flaky delete test

rdblue opened a new pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180


   I think this should fix the Hive metastore errors that have made `TestCopyOnWriteDelete` flaky. I think this was accidentally caused by https://github.com/apache/iceberg/commit/6731211eaf746c6a4abfe71386ef53172a3fc137. That commit stopped using the constructors for the Hive and Hadoop catalogs and started loading them dynamically. It also didn't use the same default pool size, 2, and added a hard-coded 5.
   
   I think the test is flaky because the first Hive catalog test runs with 5 clients and shares the same metastore instance with the later catalog test for `spark_catalog`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-771605789






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-770062242


   > Or even better can we make the tests/code resilient to changes in pool size?
   
   The simplest solution is to increase the max number of connections in the test metastore but we decided not to do so that we can catch any potential connection leaks (it did help a couple of times).
   
   > I wonder if we should change HiveClientPool to use the default too?
   
   Well, I am not sure we should default `HiveClientPool` with values from `CatalogProperties` as the pool does not technically belong to the catalog. That said, I think we should fix `HiveCatalog` constructor that just accepts a Hadoop conf. Also, I would remove the constructor from `HiveClientPool` that does not accept the pool size (it is package private and only used in one place I am suggesting to update). That way, we ensure `HiveCatalog` will always set a correct default pool size.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi merged pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
aokolnychyi merged pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-771098371


   > I don't think that's the relevant change. The delete tests use a registered catalog through SQL and don't use the `IcebergSource`. The problem was a different pool size because of the way the catalog is now constructed.
   
   Yeah, you are right. The `IcebergSource` is not used there.
   
   The key change in the cache still might be relevant in tests/uses where `IcebergSource` is used, but definitely not here.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-770066258


   > I think the main difference is that HiveCatalogs uses an internal CATALOG_CACHE so if you try to get the Catalog for the same URI multiple times you will get back the same instance
   
   I don't think that the changes to 2.4 apply to this path. In Spark 3, catalog instances have always been created once per session and are managed by Spark. Spark is the catalog "caching" in this case, and this hasn't changed because it doesn't go through the `IcebergSource`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-770064458


   @pvary, could you elaborate a bit more? I am not sure I got it. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-770062383


   What do you think, @rymurr?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-771034458


   I don't think that's the relevant change. The delete tests use a registered catalog through SQL and don't use the `IcebergSource`. The problem was a different pool size because of the way the catalog is now constructed.
   
   Also, `CustomCatalogs` has its own catalog cache so that catalogs are not loaded more than once by name.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-770687222


   > @pvary, could you elaborate a bit more? I am not sure I got it.
   
   By my understanding the relevant change in #1875 is [this](https://github.com/apache/iceberg/commit/6731211eaf746c6a4abfe71386ef53172a3fc137#diff-e3082ef0c407c6d7644f453cd24a426eedc21fcdcb1fc501beeffb07b3070135L141-R138):
   ```
   -      HiveCatalog hiveCatalog = HiveCatalogs.loadCatalog(conf);
   -      TableIdentifier tableIdentifier = TableIdentifier.parse(path.get());
   -      return hiveCatalog.loadTable(tableIdentifier);
   +      return CustomCatalogs.table(lazySparkSession(), path.get());
   ```
   
   Am I right when assuming that this change replaces `HiveCatalogs.loadCatalog(conf)` with `CatalogUtil.buildIcebergCatalog(name, options, conf)`, which ultimately calls `CatalogUtil.loadCatalog(catalogImpl, name, options, conf)`?
   
   One important difference between the 2 methods is that `HiveCatalogs.loadCatalog(conf)` comes with a `CATALOG_CACHE`, so if you call loadCatalog with the same thrift uri then we reuse the previously created HiveCatalog instance. This is good as this reuses the HiveClientPool instance in that catalog, and also could be problematic since if there is any difference in the configuration then it will not be applied to the new pool. OTOH `CatalogUtil` always creates a new HiveCatalog instance and with it a HiveClientPool instance.
   
   My assumption was that the tests are creating multiple Catalog instances or they are not closing the Catalog instances correctly and this is the problem which causes the exhaustion of the Metastore thread pool.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-771605789


   
   > That said, I think we should fix `HiveCatalog` constructor that just accepts a Hadoop conf. 
   
   agreed, will take a look at that
   
   > Also, I would remove the constructor from `HiveClientPool` that does not accept the pool size (it is package private and only used in one place I am suggesting to update). 
   
   also agreed


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-769529102


   Thanks for taking the time to investigate, @rdblue! 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-769503822


   Tests passed the first time through. Re-running.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-771611181


   Maybe this question is for a new issue but: What do people think about the `HiveCatalog` lifecycle in general? It has been bothering me after I ran into an issue where `HiveCatalog` was being GC-ed, hence closing the `HiveClientPool` and breaking some cached `HiveTableOperations`. I've looked into a more formal resource ownership model but it breaks a lot of unit tests as the lifetime of `HiveCatalog` isn't consistent across tests. If it is valuable to standardise the lifetime of a `HiveCatalog` in a unit test and define the lifetime in a typical user application we can discuss and I post a patch. Thoughts?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-769701554


   I think the main difference is that `HiveCatalogs` uses an internal `CATALOG_CACHE` so if you try to get the Catalog for the same URI multiple times you will get back the same instance. OTOH with the new implementation there is no cache, and if you request a catalog you will get back a new instance every time - the question is if this is an intended behavior or we can share the instances.
   
   CC: @lcspinter


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on pull request #2180: Fix flaky delete test

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #2180:
URL: https://github.com/apache/iceberg/pull/2180#issuecomment-769679259


   Thanks a lot for fixing this @rdblue . I just went over the original patch to try and figure out what I missed. It looks like I took the `5` from: https://github.com/apache/iceberg/blob/master/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveClientPool.java#L42 rather than ` CatalogProperties.CLIENT_POOL_SIZE_DEFAULT`. 
   
   I wonder if we should change `HiveClientPool` to use the default too? Or even better can we make the tests/code resilient to changes in pool size?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org