You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by "frankliee (via GitHub)" <gi...@apache.org> on 2023/04/13 12:15:59 UTC

[GitHub] [iceberg] frankliee commented on pull request #7310: Hive: Clean up expired metastore clients

frankliee commented on PR #7310:
URL: https://github.com/apache/iceberg/pull/7310#issuecomment-1506863502

   > > Caffeine cache will not always call removelistener by default, so an extra scheduler is required to invoke removelistener.
   > 
   > From how I understand how Caffeine does this is that it performs the cleanup whenever the cache is being accessed or modified. Below is an excerpt from the Javadoc:
   > 
   > > If expireAfter, expireAfterWrite, or expireAfterAccess is requested then entries may be evicted on each cache modification, on occasional cache accesses, or on calls to Cache.cleanUp. A scheduler(Scheduler) may be specified to provide prompt removal of expired entries rather than waiting until activity triggers the periodic maintenance. Expired entries may be counted by Cache.estimatedSize(), but will never be visible to read or write operations.
   > 
   > That means the removal listener will be called. And this can be seen by modifying the test to
   > 
   > ```
   >     Awaitility.await()
   >         .atMost(10, TimeUnit.SECONDS)
   >         .untilAsserted(
   >             () -> {
   >               Assert.assertTrue(clientPool1.isClosed());
   >               Assert.assertTrue(clientPool2.isClosed());
   >             });
   > ```
   > 
   > The question for this PR should rather be how **critical** it is that the removal is performed immediately vs waiting until activity triggers the periodic maintenance.
   
   
   
   I understand your concerns, and this PR chooses to call removal immediately for the following two reasons.
   
   1) Caffeine's default stratey will postpone removal as much as possible to reduce overhead.  In this Iceberg, removal will be call for each 5 min by default, so the overhead is acceptable. 
   
   2) Iceberg has a conf called "client.pool.cache.eviction-interval-ms" to ensure the time of eviction, but the uncertainty of Caffeine default stratey will make it strange to predict. This PR makes it meaningful.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org