You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/09/10 17:14:56 UTC

[GitHub] [iceberg] szehon-ho opened a new pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

szehon-ho opened a new pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099


   The options are: implement it in ClientPoolImpl, or use RetryingMetaStoreClient.  From the initial discussion, leaning towards option 2. Some justifications below :
   
   * RetryingMetaStoreClient is used today in Hive and Spark, and is more battle-tested. HiveClientPool will have to catch up to all the Hive exception types to retry: https://github.com/apache/iceberg/pull/2844 for some missing exceptions 
   * Handles UGI impersonation logic for reconnect, which is missing HiveClientPool (needed in Kerberized environments)
   * ClientPoolImpl does not support any configuration of retry and retry-backoff. It hard-codes to 1 retry and no backoff, (the default in RetryingMetaStoreClient is 1s backoff for instance)
   * Re-using RetryingMetaStoreClient can unify all hive configs for the execution engine, instead of having the configs per catalog. For instance in Spark, setting 'spark.hadoop.hive.metastore.client.connect.retry.delay' will set it for all Hive connections (for Iceberg and non-Iceberg tables)
   
   
   Implementation details: adding RetryingMetaStoreClient will make redundant (and harmful) the ClientPoolImpl retry, so disable it for this case, but not remove it to preserve for other client pools.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-945894573


   Thanks for getting this in, @pvary and @szehon-ho!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r722205879



##########
File path: hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveClientPool.java
##########
@@ -111,7 +111,6 @@ public void testGetTablesFailsForNonReconnectableException() throws Exception {
             "Another meta exception", () -> clients.run(client -> client.getTables("default", "t")));
   }
 
-  @Test

Review comment:
       Is it intentional that we removed this annotation?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a change in pull request #3099: Core: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r715797076



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveClientPool.java
##########
@@ -21,22 +21,24 @@
 
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hive.conf.HiveConf;
-import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.RetryingMetaStoreClient;
 import org.apache.hadoop.hive.metastore.api.MetaException;
 import org.apache.iceberg.ClientPoolImpl;
-import org.apache.iceberg.common.DynConstructors;
+import org.apache.iceberg.common.DynMethods;
 import org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting;
 import org.apache.thrift.TException;
 import org.apache.thrift.transport.TTransportException;
 
-public class HiveClientPool extends ClientPoolImpl<HiveMetaStoreClient, TException> {
+public class HiveClientPool extends ClientPoolImpl<IMetaStoreClient, TException> {
 
-  // use appropriate ctor depending on whether we're working with Hive2 or Hive3 dependencies
-  // we need to do this because there is a breaking API change between Hive2 and Hive3
-  private static final DynConstructors.Ctor<HiveMetaStoreClient> CLIENT_CTOR = DynConstructors.builder()
-          .impl(HiveMetaStoreClient.class, HiveConf.class)
-          .impl(HiveMetaStoreClient.class, Configuration.class)
-          .build();
+  // use appropriate ctor depending on whether we're working with Hive1, Hive2, or Hive3 dependencies
+  // we need to do this because there is a breaking API change between Hive1, Hive2, and Hive3
+  private static final DynMethods.StaticMethod CLIENT_CTOR = DynMethods.builder("getProxy")
+      .impl(RetryingMetaStoreClient.class, HiveConf.class)
+      .impl(RetryingMetaStoreClient.class, HiveConf.class, Boolean.TYPE)
+      .impl(RetryingMetaStoreClient.class, Configuration.class, Boolean.TYPE)

Review comment:
       Yes, thanks again @jackye1995  for the pointers.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-924646575


   1. the title does not include the fact that we are switching to use `RetryingMetaStoreClient`, can we add that information in the title?
   2. I have not checked yet, but you mentioned about the fact that some configs can be used for both Iceberg and non-Iceberg tables. Does that mean on Iceberg side some properties has to be re-mapped back to Hive configs after the switch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r720109246



##########
File path: hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveClientPool.java
##########
@@ -110,57 +103,4 @@ public void testGetTablesFailsForNonReconnectableException() throws Exception {
     AssertHelpers.assertThrows("Should throw exception", MetaException.class,
             "Another meta exception", () -> clients.run(client -> client.getTables("default", "t")));
   }
-
-  @Test
-  public void testConnectionFailureRestoreForMetaException() throws Exception {

Review comment:
       Maybe we could make Hive lock() SafeToRetry?
   I have to think this through, but if we release the enqueued lock in case of unexpected error inside implementation, and also delete the previous version of the lock if someone already created a lock for this transaction.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r711826950



##########
File path: hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveClientPool.java
##########
@@ -110,57 +103,4 @@ public void testGetTablesFailsForNonReconnectableException() throws Exception {
     AssertHelpers.assertThrows("Should throw exception", MetaException.class,
             "Another meta exception", () -> clients.run(client -> client.getTables("default", "t")));
   }
-
-  @Test
-  public void testConnectionFailureRestoreForMetaException() throws Exception {

Review comment:
       These tests seem valuable if the client doesn't retry on its own. What about passing in an option to turn on retries and then using that to test retries here?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-921194958


   By the way, update, the issue for us about lock was actually a deterministic one:  https://issues.apache.org/jira/browse/HIVE-25522. (a corrupted HMS instance due to bad startup).  So not sure how big of an issue that is anymore (RetryingMetaStoreClient not retrying lock() calls for safety reason). 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-918656206


   Retriggering.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-918751593


   @pvary and @rdblue  proposal as we discussed.
   
   One side effect (not what I expected), is that due to https://issues.apache.org/jira/browse/HIVE-13014, lock() is now explicitly not allowed to retry.  On the one hand, as correct-ness wise it makes sense to avoid retry (its possible the server actually got the lock but does not return to the client, leaving a dangling lock).   On the other hand, it's a bit of a bummer as lock() is called quite frequently on Iceberg side on every commit and is a bit prone to random failures, ie https://issues.apache.org/jira/browse/HIVE-23052


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on a change in pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r713644090



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveClientPool.java
##########
@@ -21,22 +21,24 @@
 
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hive.conf.HiveConf;
-import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.RetryingMetaStoreClient;
 import org.apache.hadoop.hive.metastore.api.MetaException;
 import org.apache.iceberg.ClientPoolImpl;
-import org.apache.iceberg.common.DynConstructors;
+import org.apache.iceberg.common.DynMethods;
 import org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting;
 import org.apache.thrift.TException;
 import org.apache.thrift.transport.TTransportException;
 
-public class HiveClientPool extends ClientPoolImpl<HiveMetaStoreClient, TException> {
+public class HiveClientPool extends ClientPoolImpl<IMetaStoreClient, TException> {
 
-  // use appropriate ctor depending on whether we're working with Hive2 or Hive3 dependencies
-  // we need to do this because there is a breaking API change between Hive2 and Hive3
-  private static final DynConstructors.Ctor<HiveMetaStoreClient> CLIENT_CTOR = DynConstructors.builder()
-          .impl(HiveMetaStoreClient.class, HiveConf.class)
-          .impl(HiveMetaStoreClient.class, Configuration.class)
-          .build();
+  // use appropriate ctor depending on whether we're working with Hive1, Hive2, or Hive3 dependencies
+  // we need to do this because there is a breaking API change between Hive1, Hive2, and Hive3
+  private static final DynMethods.StaticMethod CLIENT_CTOR = DynMethods.builder("getProxy")
+      .impl(RetryingMetaStoreClient.class, HiveConf.class)
+      .impl(RetryingMetaStoreClient.class, HiveConf.class, Boolean.TYPE)
+      .impl(RetryingMetaStoreClient.class, Configuration.class, Boolean.TYPE)

Review comment:
       It seems fine, newer versions don't even have the first constructor:
   
   https://hive.apache.org/javadocs/r3.1.2/api/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.html
   
   https://hive.apache.org/javadocs/r2.1.1/api/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.html




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary merged pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
pvary merged pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r722204645



##########
File path: core/src/main/java/org/apache/iceberg/ClientPoolImpl.java
##########
@@ -33,25 +33,32 @@
   private final Deque<C> clients;
   private final Class<? extends E> reconnectExc;
   private final Object signal = new Object();
+  private final boolean retryByDefault;
   private volatile int currentSize;
   private boolean closed;
 
-  public ClientPoolImpl(int poolSize, Class<? extends E> reconnectExc) {
+  public ClientPoolImpl(int poolSize, Class<? extends E> reconnectExc, boolean retryByDefault) {
     this.poolSize = poolSize;
     this.reconnectExc = reconnectExc;
     this.clients = new ArrayDeque<>(poolSize);
     this.currentSize = 0;
     this.closed = false;
+    this.retryByDefault = retryByDefault;
   }
 
   @Override
   public <R> R run(Action<R, C, E> action) throws E, InterruptedException {
+    return run(action, retryByDefault);
+  }
+
+  @Override
+  public <R> R run(Action<R, C, E> action, boolean retry) throws E, InterruptedException {
     C client = get();
     try {
       return action.run(client);
 
     } catch (Exception exc) {
-      if (isConnectionException(exc)) {
+      if (isConnectionException(exc) && retry) {

Review comment:
       nit: maybe a different order of checks for efficiency?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on a change in pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r713647520



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveClientPool.java
##########
@@ -47,10 +49,15 @@ public HiveClientPool(int poolSize, Configuration conf) {
   }
 
   @Override
-  protected HiveMetaStoreClient newClient()  {
+  protected boolean shouldRetry() {
+    return false; // Already use RetryingMetaStoreClient
+  }
+
+  @Override
+  protected IMetaStoreClient newClient()  {
     try {
       try {
-        return CLIENT_CTOR.newInstance(hiveConf);
+        return CLIENT_CTOR.invoke(hiveConf, true);

Review comment:
       it's not about retry, but to configure `allowEmbedded` for the `HiveMetaStoreClient` initialized internally in the `RetryingMetaStoreClient`:
   
   https://github.com/apache/hive/blob/9f03e3c416834b18faf7a85338b130ed0742275f/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java#L165-L167




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho edited a comment on pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
szehon-ho edited a comment on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-918656206


   Retriggering, looks like past hang due to: https://github.com/apache/iceberg/pull/3110


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a change in pull request #3099: Core: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r715796353



##########
File path: hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveClientPool.java
##########
@@ -110,57 +103,4 @@ public void testGetTablesFailsForNonReconnectableException() throws Exception {
     AssertHelpers.assertThrows("Should throw exception", MetaException.class,
             "Another meta exception", () -> clients.run(client -> client.getTables("default", "t")));
   }
-
-  @Test
-  public void testConnectionFailureRestoreForMetaException() throws Exception {

Review comment:
       Good idea, done.  Put the test back.
   
   Side note: the flexibility of making it configurable is nice, and is a potential solution to the issue I mentioned, I am tempted to set it to true in HiveClientOperations::acquireLock in this line.
   
   ```
   LockResponse lockResponse = metaClients.run(client -> client.lock(lockRequest));
   ```
   
   I can see from Hive point of view why they don't want to auto-retry, but most exception scenarios are not going to be in the corner case of having succeeded in the backend and leaving the lock.  In all honesty, if the Iceberg job fails because lock() throws TTransportException, end user will just try again but it is a wasted effort.  Not sure if any thoughts on it? 
    cc @pvary 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho edited a comment on pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
szehon-ho edited a comment on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-918751593


   @pvary and @rdblue  the change as we discussed, if you guys want to take a look.
   
   After doing some tests with it, one unexpected side effect to call out is that due to https://issues.apache.org/jira/browse/HIVE-13014, lock() is now explicitly not allowed to retry.  On the one hand, correct-ness wise it makes sense to avoid retry (its possible the server actually got the lock but does not return to the client, leaving a dangling lock).   On the other hand, it's a bit of a bummer as lock() is called quite frequently on Iceberg side on every commit and is a bit prone to random failures, ie https://issues.apache.org/jira/browse/HIVE-23052


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho edited a comment on pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
szehon-ho edited a comment on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-921194958


   By the way, update, lock() is not particularly more suspectible to random failure.  The issue for us about lock was actually a deterministic one:  https://issues.apache.org/jira/browse/HIVE-25522. (a corrupted HMS instance due to bad startup).  So not sure how big of an issue that is anymore (RetryingMetaStoreClient not retrying lock() calls for safety reason). 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on a change in pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r716568468



##########
File path: hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveClientPool.java
##########
@@ -110,57 +103,4 @@ public void testGetTablesFailsForNonReconnectableException() throws Exception {
     AssertHelpers.assertThrows("Should throw exception", MetaException.class,
             "Another meta exception", () -> clients.run(client -> client.getTables("default", "t")));
   }
-
-  @Test
-  public void testConnectionFailureRestoreForMetaException() throws Exception {

Review comment:
       Having both Iceberg and Hive retry could also be problematic / surprising. We will just do NxM retries (2x2 in the default case).
   
   I see 2 typical error scenario in case of `lock`:
   
   1. Can not reach the HMS - connection is broken, or something like this - Retry is a perfect solution
   2. The locks are queued, but we have to wait and something happens in the meantime - Retry will be blocked forever (or until the lock timeout is reached for the previously queued locks) - Retry is problematic / unnecessary.
   
   How typical is that the first request on the connection is the `lock`? If it happens often then the 1st situation could happen often. If we usually get the metadata for the table and then try to lock then the 1st situation is unlikely to happen.
   
   I would prefer to keep the retry login in one place (`RetryingMetaStoreClient`) and simplify the Iceberg code by removing the retry altogether, but I can accept keeping the code as it is - we most probably will turn off the Iceberg retries in our deployment.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a change in pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r726568845



##########
File path: core/src/main/java/org/apache/iceberg/ClientPoolImpl.java
##########
@@ -33,25 +33,32 @@
   private final Deque<C> clients;
   private final Class<? extends E> reconnectExc;
   private final Object signal = new Object();
+  private final boolean retryByDefault;
   private volatile int currentSize;
   private boolean closed;
 
-  public ClientPoolImpl(int poolSize, Class<? extends E> reconnectExc) {
+  public ClientPoolImpl(int poolSize, Class<? extends E> reconnectExc, boolean retryByDefault) {
     this.poolSize = poolSize;
     this.reconnectExc = reconnectExc;
     this.clients = new ArrayDeque<>(poolSize);
     this.currentSize = 0;
     this.closed = false;
+    this.retryByDefault = retryByDefault;
   }
 
   @Override
   public <R> R run(Action<R, C, E> action) throws E, InterruptedException {
+    return run(action, retryByDefault);
+  }
+
+  @Override
+  public <R> R run(Action<R, C, E> action, boolean retry) throws E, InterruptedException {
     C client = get();
     try {
       return action.run(client);
 
     } catch (Exception exc) {
-      if (isConnectionException(exc)) {
+      if (isConnectionException(exc) && retry) {

Review comment:
       Makes sense, done

##########
File path: hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveClientPool.java
##########
@@ -111,7 +111,6 @@ public void testGetTablesFailsForNonReconnectableException() throws Exception {
             "Another meta exception", () -> clients.run(client -> client.getTables("default", "t")));
   }
 
-  @Test

Review comment:
       Whoops, put it back.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a change in pull request #3099: Core: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r715796353



##########
File path: hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveClientPool.java
##########
@@ -110,57 +103,4 @@ public void testGetTablesFailsForNonReconnectableException() throws Exception {
     AssertHelpers.assertThrows("Should throw exception", MetaException.class,
             "Another meta exception", () -> clients.run(client -> client.getTables("default", "t")));
   }
-
-  @Test
-  public void testConnectionFailureRestoreForMetaException() throws Exception {

Review comment:
       Good idea, done.  Put the test back.
   
   Side note: the flexibility of making it configurable is nice, and is a potential soution the issue I mentioned, I am tempted to set it to true in HiveClientOperations::acquireLock in this line to address the issue I mentioned above.
   
   ```
   LockResponse lockResponse = metaClients.run(client -> client.lock(lockRequest));
   ```
   
   I can see from Hive point of view why they don't want to auto-retry, but most exception scenarios are not going to be in the corner case of having succeeded in the backend and leaving the lock.  In all honesty, if the Iceberg job fails because lock() throws TTransportException, end user will just try again but it is a wasted effort.  Not sure if any thoughts on it? 
    cc @pvary 

##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveClientPool.java
##########
@@ -21,22 +21,24 @@
 
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hive.conf.HiveConf;
-import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.RetryingMetaStoreClient;
 import org.apache.hadoop.hive.metastore.api.MetaException;
 import org.apache.iceberg.ClientPoolImpl;
-import org.apache.iceberg.common.DynConstructors;
+import org.apache.iceberg.common.DynMethods;
 import org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting;
 import org.apache.thrift.TException;
 import org.apache.thrift.transport.TTransportException;
 
-public class HiveClientPool extends ClientPoolImpl<HiveMetaStoreClient, TException> {
+public class HiveClientPool extends ClientPoolImpl<IMetaStoreClient, TException> {
 
-  // use appropriate ctor depending on whether we're working with Hive2 or Hive3 dependencies
-  // we need to do this because there is a breaking API change between Hive2 and Hive3
-  private static final DynConstructors.Ctor<HiveMetaStoreClient> CLIENT_CTOR = DynConstructors.builder()
-          .impl(HiveMetaStoreClient.class, HiveConf.class)
-          .impl(HiveMetaStoreClient.class, Configuration.class)
-          .build();
+  // use appropriate ctor depending on whether we're working with Hive1, Hive2, or Hive3 dependencies
+  // we need to do this because there is a breaking API change between Hive1, Hive2, and Hive3
+  private static final DynMethods.StaticMethod CLIENT_CTOR = DynMethods.builder("getProxy")

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho closed pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
szehon-ho closed pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r711826452



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveClientPool.java
##########
@@ -47,10 +49,15 @@ public HiveClientPool(int poolSize, Configuration conf) {
   }
 
   @Override
-  protected HiveMetaStoreClient newClient()  {
+  protected boolean shouldRetry() {
+    return false; // Already use RetryingMetaStoreClient
+  }
+
+  @Override
+  protected IMetaStoreClient newClient()  {
     try {
       try {
-        return CLIENT_CTOR.newInstance(hiveConf);
+        return CLIENT_CTOR.invoke(hiveConf, true);

Review comment:
       Can you please add a comment that documents the `true` here? Something like this:
   
   ```
     return GET_CLIENT.invoke(hiveConf, true /* use retries */);
   ```
   
   I'm assuming that the `true` is retry related? If so, should this detect whether the Hive implementation will retry and turn retries back on?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho edited a comment on pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
szehon-ho edited a comment on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-918751593


   @pvary and @rdblue  the change as we discussed, if you guys want to take a look.
   
   After doing some tests with it, one unexpeted side effect to call out is that due to https://issues.apache.org/jira/browse/HIVE-13014, lock() is now explicitly not allowed to retry.  On the one hand, as correct-ness wise it makes sense to avoid retry (its possible the server actually got the lock but does not return to the client, leaving a dangling lock).   On the other hand, it's a bit of a bummer as lock() is called quite frequently on Iceberg side on every commit and is a bit prone to random failures, ie https://issues.apache.org/jira/browse/HIVE-23052


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-940816196


   @szehon-ho: This version is fine by me. Let's wait a little bit for the others to chime in, in case they have any more concerns. If there is no objections till next Monday, I will merge then, as I will be OOO at the end of this week. Please ping me if I forget to push.
   
   Thanks, Peter


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-941477137


   Thanks a lot Peter for following up.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho edited a comment on pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
szehon-ho edited a comment on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-918751593


   @pvary and @rdblue  the change as we discussed, if you guys want to take a look.
   
   One side effect (not what I expected), is that due to https://issues.apache.org/jira/browse/HIVE-13014, lock() is now explicitly not allowed to retry.  On the one hand, as correct-ness wise it makes sense to avoid retry (its possible the server actually got the lock but does not return to the client, leaving a dangling lock).   On the other hand, it's a bit of a bummer as lock() is called quite frequently on Iceberg side on every commit and is a bit prone to random failures, ie https://issues.apache.org/jira/browse/HIVE-23052


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a change in pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r715917213



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveClientPool.java
##########
@@ -21,22 +21,24 @@
 
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hive.conf.HiveConf;
-import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.RetryingMetaStoreClient;
 import org.apache.hadoop.hive.metastore.api.MetaException;
 import org.apache.iceberg.ClientPoolImpl;
-import org.apache.iceberg.common.DynConstructors;
+import org.apache.iceberg.common.DynMethods;
 import org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting;
 import org.apache.thrift.TException;
 import org.apache.thrift.transport.TTransportException;
 
-public class HiveClientPool extends ClientPoolImpl<HiveMetaStoreClient, TException> {
+public class HiveClientPool extends ClientPoolImpl<IMetaStoreClient, TException> {
 
-  // use appropriate ctor depending on whether we're working with Hive2 or Hive3 dependencies
-  // we need to do this because there is a breaking API change between Hive2 and Hive3
-  private static final DynConstructors.Ctor<HiveMetaStoreClient> CLIENT_CTOR = DynConstructors.builder()
-          .impl(HiveMetaStoreClient.class, HiveConf.class)
-          .impl(HiveMetaStoreClient.class, Configuration.class)
-          .build();
+  // use appropriate ctor depending on whether we're working with Hive1, Hive2, or Hive3 dependencies
+  // we need to do this because there is a breaking API change between Hive1, Hive2, and Hive3
+  private static final DynMethods.StaticMethod CLIENT_CTOR = DynMethods.builder("getProxy")
+      .impl(RetryingMetaStoreClient.class, HiveConf.class)
+      .impl(RetryingMetaStoreClient.class, HiveConf.class, Boolean.TYPE)
+      .impl(RetryingMetaStoreClient.class, Configuration.class, Boolean.TYPE)

Review comment:
       And the last link is Hive1 API (which gets pulled in in spark2 tests):  https://hive.apache.org/javadocs/r1.2.2/api/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.html




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a change in pull request #3099: Core: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r715796926



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveClientPool.java
##########
@@ -47,10 +49,15 @@ public HiveClientPool(int poolSize, Configuration conf) {
   }
 
   @Override
-  protected HiveMetaStoreClient newClient()  {
+  protected boolean shouldRetry() {
+    return false; // Already use RetryingMetaStoreClient
+  }
+
+  @Override
+  protected IMetaStoreClient newClient()  {
     try {
       try {
-        return CLIENT_CTOR.newInstance(hiveConf);
+        return CLIENT_CTOR.invoke(hiveConf, true);

Review comment:
       Yes, thanks @jackye1995  for the pointer.  From quick glance, seems a flag just intended to determine whether to throw an exception if "hive.metastore.uris" is not set, so not terribly useful to document, unless you feel otherwise.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho closed pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
szehon-ho closed pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho edited a comment on pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
szehon-ho edited a comment on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-918751593


   @pvary and @rdblue  the change as we discussed, if you guys want to take a look.
   
   After doing some tests with it, one unexpected side effect to call out for reviewers is that due to https://issues.apache.org/jira/browse/HIVE-13014, lock() is now explicitly not allowed to retry (ref:  https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java#L3413).  
   
   On the one hand, correct-ness wise it makes sense to avoid retry (its possible the server actually got the lock but does not return to the client, leaving a dangling lock).   On the other hand, it's a bit of a bummer as lock() is called quite frequently on Iceberg side on every commit and is a bit prone to random failures, ie https://issues.apache.org/jira/browse/HIVE-23052.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a change in pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r717837774



##########
File path: hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveClientPool.java
##########
@@ -110,57 +103,4 @@ public void testGetTablesFailsForNonReconnectableException() throws Exception {
     AssertHelpers.assertThrows("Should throw exception", MetaException.class,
             "Another meta exception", () -> clients.run(client -> client.getTables("default", "t")));
   }
-
-  @Test
-  public void testConnectionFailureRestoreForMetaException() throws Exception {

Review comment:
       @pvary yea this PR disables the retry for Hive case, but leaves an option for turning it on to a particular caller, to address your concern.  
   
   About lock(), thanks for the reply, maybe you are right that it is not typical case to have TTransportException especially if getMetadata has already succeeded on same connection.  
   
   The only thing bothering me is that there is no way for the client to distinguish case 1 and 2 based on exception type, and thus nothing they can do really except retry.  There is a ShowLock call, and potentially they can check if there is a queued lock from the failed call, is it it is their own , and if so then unlock it.  But I'm not sure how reliable this is.  Anyway its maybe for another review then.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a change in pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r717837774



##########
File path: hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveClientPool.java
##########
@@ -110,57 +103,4 @@ public void testGetTablesFailsForNonReconnectableException() throws Exception {
     AssertHelpers.assertThrows("Should throw exception", MetaException.class,
             "Another meta exception", () -> clients.run(client -> client.getTables("default", "t")));
   }
-
-  @Test
-  public void testConnectionFailureRestoreForMetaException() throws Exception {

Review comment:
       @pvary yea this PR disables the retry for Hive case, but leaves an option for turning it on to a particular caller, to address your concern.  
   
   About lock(), thanks for the reply, maybe you are right that it is not typical case to have TTransportException especially if getMetadata has already succeeded on same connection.  
   
   The only thing bothering me is that there is no way for the client to distinguish case 1 and 2 based on exception type, and thus nothing they can do really except retry.  There is a ShowLock call, and potentially they can check if there is a queued lock from the failed call, is it it is their own , and if so then unlock it.  But I'm not sure how reliable this is.  Anyway its maybe for another review then.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r711826555



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveClientPool.java
##########
@@ -21,22 +21,24 @@
 
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hive.conf.HiveConf;
-import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.RetryingMetaStoreClient;
 import org.apache.hadoop.hive.metastore.api.MetaException;
 import org.apache.iceberg.ClientPoolImpl;
-import org.apache.iceberg.common.DynConstructors;
+import org.apache.iceberg.common.DynMethods;
 import org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting;
 import org.apache.thrift.TException;
 import org.apache.thrift.transport.TTransportException;
 
-public class HiveClientPool extends ClientPoolImpl<HiveMetaStoreClient, TException> {
+public class HiveClientPool extends ClientPoolImpl<IMetaStoreClient, TException> {
 
-  // use appropriate ctor depending on whether we're working with Hive2 or Hive3 dependencies
-  // we need to do this because there is a breaking API change between Hive2 and Hive3
-  private static final DynConstructors.Ctor<HiveMetaStoreClient> CLIENT_CTOR = DynConstructors.builder()
-          .impl(HiveMetaStoreClient.class, HiveConf.class)
-          .impl(HiveMetaStoreClient.class, Configuration.class)
-          .build();
+  // use appropriate ctor depending on whether we're working with Hive1, Hive2, or Hive3 dependencies
+  // we need to do this because there is a breaking API change between Hive1, Hive2, and Hive3
+  private static final DynMethods.StaticMethod CLIENT_CTOR = DynMethods.builder("getProxy")

Review comment:
       This isn't really a client constructor anymore. Could you rename this `GET_CLIENT` or something?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #3099: Core: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-926810888


   Thanks for the review guys.  Addressed the feedback.
   
   For @jackye1995's questions, 
   
   1. Done
   2. I took a look but there should not be any existing Iceberg configs that need to be remapped.  Currently it is already taking all the hive client configurations from hiveconf object, and anything else is Iceberg-specific.  If i missed something, please do point out.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-940441345


   @pvary thanks for review, addressed comments, sorry for delay


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #3099: Implement retryDelays and retries in Iceberg Hive Client

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r711826266



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveClientPool.java
##########
@@ -21,22 +21,24 @@
 
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hive.conf.HiveConf;
-import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.RetryingMetaStoreClient;
 import org.apache.hadoop.hive.metastore.api.MetaException;
 import org.apache.iceberg.ClientPoolImpl;
-import org.apache.iceberg.common.DynConstructors;
+import org.apache.iceberg.common.DynMethods;
 import org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting;
 import org.apache.thrift.TException;
 import org.apache.thrift.transport.TTransportException;
 
-public class HiveClientPool extends ClientPoolImpl<HiveMetaStoreClient, TException> {
+public class HiveClientPool extends ClientPoolImpl<IMetaStoreClient, TException> {
 
-  // use appropriate ctor depending on whether we're working with Hive2 or Hive3 dependencies
-  // we need to do this because there is a breaking API change between Hive2 and Hive3
-  private static final DynConstructors.Ctor<HiveMetaStoreClient> CLIENT_CTOR = DynConstructors.builder()
-          .impl(HiveMetaStoreClient.class, HiveConf.class)
-          .impl(HiveMetaStoreClient.class, Configuration.class)
-          .build();
+  // use appropriate ctor depending on whether we're working with Hive1, Hive2, or Hive3 dependencies
+  // we need to do this because there is a breaking API change between Hive1, Hive2, and Hive3
+  private static final DynMethods.StaticMethod CLIENT_CTOR = DynMethods.builder("getProxy")
+      .impl(RetryingMetaStoreClient.class, HiveConf.class)
+      .impl(RetryingMetaStoreClient.class, HiveConf.class, Boolean.TYPE)
+      .impl(RetryingMetaStoreClient.class, Configuration.class, Boolean.TYPE)

Review comment:
       Order matters here. The first implementation found will be used, so this will use `getProxy(HiveConf)` before it will use `getProxy(HiveConf, Boolean)`. Is that intended?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-941477137


   Thanks a lot Peter for following up.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on a change in pull request #3099: Core: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#discussion_r715917213



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveClientPool.java
##########
@@ -21,22 +21,24 @@
 
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hive.conf.HiveConf;
-import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.RetryingMetaStoreClient;
 import org.apache.hadoop.hive.metastore.api.MetaException;
 import org.apache.iceberg.ClientPoolImpl;
-import org.apache.iceberg.common.DynConstructors;
+import org.apache.iceberg.common.DynMethods;
 import org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting;
 import org.apache.thrift.TException;
 import org.apache.thrift.transport.TTransportException;
 
-public class HiveClientPool extends ClientPoolImpl<HiveMetaStoreClient, TException> {
+public class HiveClientPool extends ClientPoolImpl<IMetaStoreClient, TException> {
 
-  // use appropriate ctor depending on whether we're working with Hive2 or Hive3 dependencies
-  // we need to do this because there is a breaking API change between Hive2 and Hive3
-  private static final DynConstructors.Ctor<HiveMetaStoreClient> CLIENT_CTOR = DynConstructors.builder()
-          .impl(HiveMetaStoreClient.class, HiveConf.class)
-          .impl(HiveMetaStoreClient.class, Configuration.class)
-          .build();
+  // use appropriate ctor depending on whether we're working with Hive1, Hive2, or Hive3 dependencies
+  // we need to do this because there is a breaking API change between Hive1, Hive2, and Hive3
+  private static final DynMethods.StaticMethod CLIENT_CTOR = DynMethods.builder("getProxy")
+      .impl(RetryingMetaStoreClient.class, HiveConf.class)
+      .impl(RetryingMetaStoreClient.class, HiveConf.class, Boolean.TYPE)
+      .impl(RetryingMetaStoreClient.class, Configuration.class, Boolean.TYPE)

Review comment:
       And the last link is 1 API (which gets pulled in in spark2 tests):  https://hive.apache.org/javadocs/r1.2.2/api/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.html




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #3099: Hive: Switch to RetryingHMSClient (allows configuration of retryDelays and retries)

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #3099:
URL: https://github.com/apache/iceberg/pull/3099#issuecomment-945676023


   Merged.
   Thanks for the PR @szehon-ho!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org