You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/03 02:33:15 UTC

[GitHub] [iceberg] lvyanquan opened a new pull request, #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

lvyanquan opened a new pull request, #6111:
URL: https://github.com/apache/iceberg/pull/6111

   As what https://github.com/apache/iceberg/pull/3543 has introduced, 'cache.expiration-interval-ms' option allows user to configure caching policy.
   But this parameter has not been added in FlinkCatalog yet, This pr is aiming to do that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] lvyanquan commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
lvyanquan commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1014937530


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;

Review Comment:
   address it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] lvyanquan commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
lvyanquan commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1013585878


##########
flink/v1.14/flink/src/test/java/org/apache/iceberg/flink/TestFlinkCatalogTablePartitions.java:
##########
@@ -18,14 +18,13 @@
  */
 package org.apache.iceberg.flink;
 
-import static org.apache.iceberg.flink.FlinkCatalogFactory.CACHE_ENABLED;

Review Comment:
   Because CACHE_ENABLED of FlinkCatalogFactory was removed, and this line used string from org.apache.iceberg.CatalogProperties.
   ```
   config.put(CatalogProperties.CACHE_ENABLED, String.valueOf(cacheEnabled));
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1020938017


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   > But Flink has a separate [configuration page](https://iceberg.apache.org/docs/latest/flink/#catalog-configuration), so redefining is also acceptable.
   
   I had the same thought. Here is what is more intuitive to me.
   * if the boolean flag is false, caching is disabled
   * if the boolean flag is true
     * if  cacheExpirationIntervalMs > 0, caching automatically expired after the value
     * if <=0, caching never expire automatically. this is the difference. don'y disable caching with value 0 because boolean flag serves the purpose already.
   
   What are you takes? @lvyanquan @hililiwei @pvary 
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1014763819


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;

Review Comment:
   we can use `PropertyUtil` here too, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1016130047


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   This PR replicates the SparkCatalog behavior, which is fine by me.
   
   > - If cache.expiration-interval-ms is zero, caching will be turned off entirely (irrespective of the cache-enabled flag)
   > - If users pass in a value <= -1, (with cache-enabled=true), then caching will be enabled and cache expiration will never happen (i.e. entries will persist unless explicitly refreshed, original behavior).
   
   I find it a little weird that zero and negative numbers have different implications. why distinguish those two? @kbendick can you share some context?
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1013583775


##########
flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalog.java:
##########
@@ -102,14 +102,18 @@ public FlinkCatalog(
       String defaultDatabase,
       Namespace baseNamespace,
       CatalogLoader catalogLoader,
-      boolean cacheEnabled) {
+      boolean cacheEnabled,
+      long cacheExpirationIntervalMs) {
     super(catalogName, defaultDatabase);
     this.catalogLoader = catalogLoader;
     this.baseNamespace = baseNamespace;
     this.cacheEnabled = cacheEnabled;
 
     Catalog originalCatalog = catalogLoader.loadCatalog();
-    icebergCatalog = cacheEnabled ? CachingCatalog.wrap(originalCatalog) : originalCatalog;
+    icebergCatalog =
+        cacheEnabled

Review Comment:
   this breaks the compatibility. I feel the benefit probably doesn't justify the breakage. we should keep the boolean config and just support the additional expire interval.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1016049656


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   @lvyanquan because we retained the boolean flag for enabling caching, I thought the refresh interval should just be positive. But apparently, I didn't get the implication of negative values. So forget about this comment.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] lvyanquan commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
lvyanquan commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1017299727


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   In my opinion, these two parameters actually have some overlap.
   If we introduce a value of 0, we actually don't need ‘cache-enabled’, which is exactly what @hililiwei said before.
   ***
   Can we drop cacheEnabled and just use cacheExpirationIntervalMs? When cacheExpirationIntervalMs is zero, it is disabled.https://github.com/apache/iceberg/pull/6111#discussion_r1012530287
   ***
   Here is just for compatibility with the previous configuration.
   Or we can remove this restriction:
   ```
      if (cacheExpirationIntervalMs == 0) {
         cacheEnabled = false;
       }
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1022200851


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   latest code looks good to me. Will merge tomorrow unless there are other comments.
   
   We are on the same page regarding backward compatibility for keeping the boolean flag.
   
   > cacheExpirationIntervalMs is not allowed to be 0
   
   agree with the Precondition.  on the other hand, I think it is is a bit unnecessary for the CachingCatalog. it could have treated `<=0` as no expiration.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu merged pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu merged PR #6111:
URL: https://github.com/apache/iceberg/pull/6111


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1014763947


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   we should just do a `Preconditions` check here that this is a positive number.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
hililiwei commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1015087242


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   > * If cache.expiration-interval-ms is zero, caching will be turned off entirely (irrespective of the cache-enabled flag)
   > * If cache.expiration-interval-ms is positive (with cache-enabled=true), then caching will be on and entries will expire if not accessed in that period of time.
   > * If users pass in a value <= -1, (with cache-enabled=true), then caching will be enabled and cache expiration will never happen (i.e. entries will persist unless explicitly refreshed, original behavior).
   
   detail ref:  https://github.com/apache/iceberg/pull/3543#issue-1052281636



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] lvyanquan commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
lvyanquan commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1021333953


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   Addressed for these suggestions:
   >   * if `cacheExpirationIntervalMs` is unset then never expire the element to keep the backwards compatibility.    
   
   1. Use CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_OFF as default value to keep the same behavior.
   
   >   * if <=0, caching never expire automatically. this is the difference. don'y disable caching with value 0 because boolean flag serves the purpose already.
   
   2. Howerver, 0 value is excluded in the constructor of [CachingCatalog](https://github.com/apache/iceberg/blob/dbb8a404f6632a55acb36e949f0e7b84b643cede/core/src/main/java/org/apache/iceberg/CachingCatalog.java) as the following code:
   ```
     protected CachingCatalog(
         Catalog catalog, boolean caseSensitive, long expirationIntervalMillis, Ticker ticker) {
       Preconditions.checkArgument(
           expirationIntervalMillis != 0,
           "When %s is set to 0, the catalog cache should be disabled. This indicates a bug.",
           CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS);
       this.catalog = catalog;
       this.caseSensitive = caseSensitive;
       this.expirationIntervalMillis = expirationIntervalMillis;
       this.tableCache = createTableCache(ticker);
     }
   ```
   So cacheExpirationIntervalMs is not allowed to be 0, add this restriction to Precondition and Document too.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] lvyanquan commented on pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
lvyanquan commented on PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#issuecomment-1301972492

   I wonder whether should I add doc about it since https://iceberg.apache.org/docs/latest/configuration/#catalog-properties has introduced it. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1013586945


##########
flink/v1.14/flink/src/test/java/org/apache/iceberg/flink/TestFlinkCatalogTablePartitions.java:
##########
@@ -18,14 +18,13 @@
  */
 package org.apache.iceberg.flink;
 
-import static org.apache.iceberg.flink.FlinkCatalogFactory.CACHE_ENABLED;

Review Comment:
   got it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] lvyanquan commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
lvyanquan commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1019761610


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   I think the key decision is whether to keep the same [parameter definitions](https://iceberg.apache.org/docs/latest/configuration/#catalog-properties) as before or to remove and verify unnecessary 0 values in Flink. I agree with the same configuration as Spark, because the parameter is public. 
   But Flink has a separate [configuration page](https://iceberg.apache.org/docs/latest/flink/#catalog-configuration), so redefining is also acceptable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
hililiwei commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1012530287


##########
flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalog.java:
##########
@@ -102,14 +102,18 @@ public FlinkCatalog(
       String defaultDatabase,
       Namespace baseNamespace,
       CatalogLoader catalogLoader,
-      boolean cacheEnabled) {
+      boolean cacheEnabled,
+      long cacheExpirationIntervalMs) {
     super(catalogName, defaultDatabase);
     this.catalogLoader = catalogLoader;
     this.baseNamespace = baseNamespace;
     this.cacheEnabled = cacheEnabled;
 
     Catalog originalCatalog = catalogLoader.loadCatalog();
-    icebergCatalog = cacheEnabled ? CachingCatalog.wrap(originalCatalog) : originalCatalog;
+    icebergCatalog =
+        cacheEnabled

Review Comment:
   Can we drop `cacheEnabled` and just use `cacheExpirationIntervalMs`?  When `cacheExpirationIntervalMs` is zero, it is disabled.



##########
flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,26 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled =
+        Boolean.parseBoolean(properties.getOrDefault(CatalogProperties.CACHE_ENABLED, "true"));

Review Comment:
   Now that you have introduced `PropertyUtil`, can we unify the style?
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] lvyanquan commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
lvyanquan commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1012786416


##########
flink/v1.15/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalog.java:
##########
@@ -102,14 +102,18 @@ public FlinkCatalog(
       String defaultDatabase,
       Namespace baseNamespace,
       CatalogLoader catalogLoader,
-      boolean cacheEnabled) {
+      boolean cacheEnabled,
+      long cacheExpirationIntervalMs) {
     super(catalogName, defaultDatabase);
     this.catalogLoader = catalogLoader;
     this.baseNamespace = baseNamespace;
     this.cacheEnabled = cacheEnabled;
 
     Catalog originalCatalog = catalogLoader.loadCatalog();
-    icebergCatalog = cacheEnabled ? CachingCatalog.wrap(originalCatalog) : originalCatalog;
+    icebergCatalog =
+        cacheEnabled

Review Comment:
   thanks for pointing, address it. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1013582106


##########
flink/v1.14/flink/src/test/java/org/apache/iceberg/flink/TestFlinkCatalogTablePartitions.java:
##########
@@ -18,14 +18,13 @@
  */
 package org.apache.iceberg.flink;
 
-import static org.apache.iceberg.flink.FlinkCatalogFactory.CACHE_ENABLED;

Review Comment:
   why this change? is it from spotless formatting?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#issuecomment-1315590390

   thanks @lvyanquan for the contribution and @hililiwei and @pvary for the reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] lvyanquan commented on pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
lvyanquan commented on PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#issuecomment-1303144296

   Resubmitted to port the change to 1.16 module. 
   "cache-enabled" is reserved now, since users who set "cache-enabled" to "false" before would need to add property "cache.expiration-interval-ms = -1" to an existed catalog, It's extra work for users.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1020938017


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   > But Flink has a separate [configuration page](https://iceberg.apache.org/docs/latest/flink/#catalog-configuration), so redefining is also acceptable.
   
   I had the same thought. Here is what is intuitive to me.
   * if the boolean flag is false, caching is disabled
   * if the boolean flag is true
     * if  cacheExpirationIntervalMs > 0, caching automatically expired after the value
     * if <=0, caching never expire automatically
   
   What are you takes? @lvyanquan @hililiwei @pvary 
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1016788395


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   if we look at the code of `CachingCatalog`, it only check positive values. Hence don't understand why we need value zero to disable caching since we already have the boolean flag. `positive` refresh interval would be caching with expiration. otherwise, it is caching without expiration (until refreshed).
   
   ```
     private Cache<TableIdentifier, Table> createTableCache(Ticker ticker) {
       Caffeine<Object, Object> cacheBuilder = Caffeine.newBuilder().softValues();
   
       if (expirationIntervalMillis > 0) {
         return cacheBuilder
             .removalListener(new MetadataTableInvalidatingRemovalListener())
             .executor(Runnable::run) // Makes the callbacks to removal listener synchronous
             .expireAfterAccess(Duration.ofMillis(expirationIntervalMillis))
             .ticker(ticker)
             .build();
       }
   
       return cacheBuilder.build();
     }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] lvyanquan commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
lvyanquan commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1017299727


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   In my opinion, these two parameters actually have some overlap.
   If we introduce a value of 0, we actually don't need ‘cache-enabled’, which is exactly what @hililiwei said before.
   ***
   Can we drop cacheEnabled and just use cacheExpirationIntervalMs? When cacheExpirationIntervalMs is zero, it is disabled.https://github.com/apache/iceberg/pull/6111#discussion_r1012530287
   ***
   Here is just for compatibility with the previous configuration.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1017328011


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   I understood the overlap btw the boolean flag and the expire value at 0. But dropping `cacheEnabled ` would be a breaking change, which we should try to avoid



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1013583303


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,14 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+
+    return new FlinkCatalog(
+        name, defaultDatabase, baseNamespace, catalogLoader, cacheExpirationIntervalMs);

Review Comment:
   this breaks the compatibility. I feel the benefit probably doesn't justify the breakage. we should keep the boolean config and just support the additional expire interval.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
stevenzwu commented on PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#issuecomment-1302927593

   @hililiwei 's contribution of Flink 1.16 was just merged. @lvyanquan can you also port the change to 1.16 module?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] lvyanquan commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
lvyanquan commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1014938540


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   Not sure why limit to positive numbers, since negative values mean "cache expiration is turned off and entries expire only on refresh etc", users can set a negative values.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a diff in pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
pvary commented on code in PR #6111:
URL: https://github.com/apache/iceberg/pull/6111#discussion_r1021208714


##########
flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java:
##########
@@ -145,8 +145,27 @@ protected Catalog createCatalog(
       baseNamespace = Namespace.of(properties.get(BASE_NAMESPACE).split("\\."));
     }
 
-    boolean cacheEnabled = Boolean.parseBoolean(properties.getOrDefault(CACHE_ENABLED, "true"));
-    return new FlinkCatalog(name, defaultDatabase, baseNamespace, catalogLoader, cacheEnabled);
+    boolean cacheEnabled = CatalogProperties.CACHE_ENABLED_DEFAULT;
+    if (properties.containsKey(CatalogProperties.CACHE_ENABLED)) {
+      cacheEnabled = Boolean.parseBoolean(properties.get(CatalogProperties.CACHE_ENABLED));
+    }
+
+    long cacheExpirationIntervalMs =
+        PropertyUtil.propertyAsLong(
+            properties,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS,
+            CatalogProperties.CACHE_EXPIRATION_INTERVAL_MS_DEFAULT);
+    if (cacheExpirationIntervalMs == 0) {

Review Comment:
   Coming from a world where we were a platform for multiple companies, I would chose the more conservative approach and try to avoid breaking changes unless it is strictly necessary.
   
   Also adding a new configuration which just provides another way to archive the same function seems something which will lead to confusion down the line, so I would opt for:
   - keep the  `cache-enabled` boolean flag
   - if the `cache-enabled` is `true` find meaningful values for the `cacheExpirationIntervalMs`, or throw an error message if they do not make sense, so:
      - if `cacheExpirationIntervalMs` is > 0 then expire the the element
      - if `cacheExpirationIntervalMs` is <=0 then never expire the element
      - if `cacheExpirationIntervalMs` is unset then never expire the element to keep the backwards compatibility
   - if the `cache-enabled` is `false` then ignore the `cacheExpirationIntervalMs` or throw an exception if it is set.
   
   Just my 2 cents



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] lvyanquan closed pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog

Posted by GitBox <gi...@apache.org>.
lvyanquan closed pull request #6111: Flink: Add 'cache.expiration-interval-ms' option to FlinkCatalog
URL: https://github.com/apache/iceberg/pull/6111


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org