You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/05/28 05:29:45 UTC

[GitHub] [iceberg] StefanXiepj opened a new pull request #2649: Hive: CachedClientPool connection leak

StefanXiepj opened a new pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] StefanXiepj commented on a change in pull request #2649: Hive: CachedClientPool connection leak

Posted by GitBox <gi...@apache.org>.
StefanXiepj commented on a change in pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649#discussion_r641778545



##########
File path: core/src/main/java/org/apache/iceberg/ClientPoolImpl.java
##########
@@ -140,4 +148,12 @@ private void release(C client) {
   public int poolSize() {
     return poolSize;
   }
+
+  public static AtomicInteger openCount() {
+    return openCount;
+  }
+
+  public static AtomicInteger closeCount() {
+    return closeCount;

Review comment:
       Thanks,your suggestion is very helpful, i'll do it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rymurr commented on a change in pull request #2649: Hive: CachedClientPool connection leak

Posted by GitBox <gi...@apache.org>.
rymurr commented on a change in pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649#discussion_r641526618



##########
File path: core/src/main/java/org/apache/iceberg/ClientPoolImpl.java
##########
@@ -36,6 +38,10 @@
   private volatile int currentSize;
   private boolean closed;
 
+  private static volatile AtomicInteger openCount = new AtomicInteger(0);

Review comment:
       I am not sure if I understand the purpose of these three variables

##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/CachedClientPool.java
##########
@@ -65,6 +65,8 @@ private synchronized void init() {
               .removalListener((key, value, cause) -> ((HiveClientPool) value).close())
               .build();
     }
+    // cleanup cache before jvm exit, avoid some clients may not be closed.
+    Runtime.getRuntime().addShutdownHook(new Thread(CachedClientPool::cleanupCache));

Review comment:
       What is the downside of not closing the clients when the JVM exits? IMHO the lifetime of the client pool should be tied to the owning resources and be cleaned up through a `Closeable` interface rather than at system exit




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] southernriver commented on a change in pull request #2649: Hive: CachedClientPool connection leak

Posted by GitBox <gi...@apache.org>.
southernriver commented on a change in pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649#discussion_r641463827



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/CachedClientPool.java
##########
@@ -65,6 +65,8 @@ private synchronized void init() {
               .removalListener((key, value, cause) -> ((HiveClientPool) value).close())
               .build();
     }
+    // cleanup cache before jvm exit, avoid some clients may not be closed.
+    Runtime.getRuntime().addShutdownHook(new Thread(CachedClientPool::cleanupCache));

Review comment:
       We should close the HiveClientPool  first~
   
   ```
   CachedClientPool.clientPoolCache().getIfPresent(metastoreUri).close();;
   CachedClientPool.clientPoolCache().invalidateAll();
   ```
   

##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/CachedClientPool.java
##########
@@ -65,6 +65,8 @@ private synchronized void init() {
               .removalListener((key, value, cause) -> ((HiveClientPool) value).close())
               .build();
     }
+    // cleanup cache before jvm exit, avoid some clients may not be closed.
+    Runtime.getRuntime().addShutdownHook(new Thread(CachedClientPool::cleanupCache));

Review comment:
       We should close the HiveClientPool  first~
   
   ```
   CachedClientPool.clientPoolCache().getIfPresent(metastoreUri).close();
   CachedClientPool.clientPoolCache().invalidateAll();
   ```
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] southernriver commented on a change in pull request #2649: Hive: CachedClientPool connection leak

Posted by GitBox <gi...@apache.org>.
southernriver commented on a change in pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649#discussion_r641463827



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/CachedClientPool.java
##########
@@ -65,6 +65,8 @@ private synchronized void init() {
               .removalListener((key, value, cause) -> ((HiveClientPool) value).close())
               .build();
     }
+    // cleanup cache before jvm exit, avoid some clients may not be closed.
+    Runtime.getRuntime().addShutdownHook(new Thread(CachedClientPool::cleanupCache));

Review comment:
       We should close the HiveClientPool  first~
   
   ```
   CachedClientPool.clientPoolCache().getIfPresent(metastoreUri);
   CachedClientPool.clientPoolCache().invalidateAll();
   ```
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] StefanXiepj closed pull request #2649: Hive: CachedClientPool connection leak

Posted by GitBox <gi...@apache.org>.
StefanXiepj closed pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] StefanXiepj commented on a change in pull request #2649: Hive: CachedClientPool connection leak

Posted by GitBox <gi...@apache.org>.
StefanXiepj commented on a change in pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649#discussion_r641778727



##########
File path: core/src/main/java/org/apache/iceberg/ClientPoolImpl.java
##########
@@ -36,6 +38,10 @@
   private volatile int currentSize;
   private boolean closed;
 
+  private static volatile AtomicInteger openCount = new AtomicInteger(0);

Review comment:
       Thanks for your time, yes your right, it's the total number of connections ever opened and the total number of connections closed,just for unit test, maybe the log is also useful.
   As @southernriver  suggested,  closed directly would be better. it is no need to wait.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] StefanXiepj commented on a change in pull request #2649: Hive: CachedClientPool connection leak

Posted by GitBox <gi...@apache.org>.
StefanXiepj commented on a change in pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649#discussion_r641778727



##########
File path: core/src/main/java/org/apache/iceberg/ClientPoolImpl.java
##########
@@ -36,6 +38,10 @@
   private volatile int currentSize;
   private boolean closed;
 
+  private static volatile AtomicInteger openCount = new AtomicInteger(0);

Review comment:
       Thanks for your time, yes your right, it's the total number of connections ever opened and the total number of connections closed,just for unit test, maybe the log is also useful.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] StefanXiepj commented on pull request #2649: Hive: CachedClientPool connection leak

Posted by GitBox <gi...@apache.org>.
StefanXiepj commented on pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649#issuecomment-851154326


   @rymurr @kbendick @southernriver thanks, i will close this pr.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] southernriver commented on pull request #2649: Hive: CachedClientPool connection leak

Posted by GitBox <gi...@apache.org>.
southernriver commented on pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649#issuecomment-874790164


   @StefanXiepj   I  reopened this and do some fix in another pr [#2785](https://github.com/apache/iceberg/pull/2785)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on a change in pull request #2649: Hive: CachedClientPool connection leak

Posted by GitBox <gi...@apache.org>.
kbendick commented on a change in pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649#discussion_r641731970



##########
File path: core/src/main/java/org/apache/iceberg/ClientPoolImpl.java
##########
@@ -140,4 +148,12 @@ private void release(C client) {
   public int poolSize() {
     return poolSize;
   }
+
+  public static AtomicInteger openCount() {
+    return openCount;
+  }
+
+  public static AtomicInteger closeCount() {
+    return closeCount;

Review comment:
       Would it make more sense to call `.get()` in the return statement and just return an `Integer`?
   
   It seems the only places in the code that these functions are called, `.get()` is called immediately. It seems to me that calling `.get()` would provide the benefit of a defensive copy to truly leave these private.

##########
File path: core/src/main/java/org/apache/iceberg/ClientPoolImpl.java
##########
@@ -36,6 +38,10 @@
   private volatile int currentSize;
   private boolean closed;
 
+  private static volatile AtomicInteger openCount = new AtomicInteger(0);

Review comment:
       I'm a little confused on the variables as well.
   
   If I'm not mistaken, it's the total number of connections ever opened and the total number of connections closed? Looking at the declaration, I would have thought that it was the current number of open connections, not the number of connections ever opened.
   
   My only indication that it might be otherwise came from the unit test - https://github.com/apache/iceberg/pull/2649/files#diff-8b2035adec2ca6d247f624aeb6dbb2cc3789bd188541c500de4319b969813758R78-R80
   
   ```java
     private void waitingForCleanup() {
        long startTime = System.currentTimeMillis();
        while ((ClientPoolImpl.openCount().get() != ClientPoolImpl.closeCount().get()) 
   ```
   
   Maybe you could add a small comment?
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] StefanXiepj commented on a change in pull request #2649: Hive: CachedClientPool connection leak

Posted by GitBox <gi...@apache.org>.
StefanXiepj commented on a change in pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649#discussion_r641532345



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/CachedClientPool.java
##########
@@ -65,6 +65,8 @@ private synchronized void init() {
               .removalListener((key, value, cause) -> ((HiveClientPool) value).close())
               .build();
     }
+    // cleanup cache before jvm exit, avoid some clients may not be closed.
+    Runtime.getRuntime().addShutdownHook(new Thread(CachedClientPool::cleanupCache));

Review comment:
       If a large number of tasks exit, many connections may not be closed. But as you said, this is not a serious problem.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] StefanXiepj commented on a change in pull request #2649: Hive: CachedClientPool connection leak

Posted by GitBox <gi...@apache.org>.
StefanXiepj commented on a change in pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649#discussion_r641533908



##########
File path: core/src/main/java/org/apache/iceberg/ClientPoolImpl.java
##########
@@ -36,6 +38,10 @@
   private volatile int currentSize;
   private boolean closed;
 
+  private static volatile AtomicInteger openCount = new AtomicInteger(0);

Review comment:
       This is just for counting the number of connections,maybe it can be used as a metric in the future.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] StefanXiepj commented on pull request #2649: Hive: CachedClientPool connection leak

Posted by GitBox <gi...@apache.org>.
StefanXiepj commented on pull request #2649:
URL: https://github.com/apache/iceberg/pull/2649#issuecomment-850260775


   @rymurr @kbendick  would you mind reviewing this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org