Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/12/28 07:41:08 UTC

[GitHub] [iceberg] Jecarm opened a new issue #1994: Failed to connect Hive Metastore using Hive client pool

Jecarm opened a new issue #1994:
URL: https://github.com/apache/iceberg/issues/1994


   ### Background
   We have started two Hive Metastore (HMS) servers, _Server1_ and _Server2_, and configured Iceberg 0.9.1 as follows:
   * spark.sql.catalog.hive_prod = org.apache.iceberg.spark.SparkCatalog
   * spark.sql.catalog.hive_prod.type = hive
   * spark.sql.catalog.hive_prod.uri = thrift://_Server1_, thrift://_Server2_
   * spark.sql.catalog.hive_prod.clients=2
   
   Assume that the client pool has already established two connections, one to _Server1_ and one to _Server2_.
   
   ### Operation and Problem
   1. We send a request to HMS through a pooled client connection, either a table listing or a namespace listing. In Spark SQL, this is `show tables in hive_prod.db` or `show namespaces in hive_prod`, and it returns the correct result.
   2. Stop HMS _Server1_ and send the same request. The expected result is the same as in step 1, because _Server2_ is still active, but the actual result is 'null'.
   3. In general, when one HMS stops and the other is still running, the client should automatically connect to the active one. However, in Iceberg, the client pool does not release failed connections.
   
   ### Analysis
   1. `ClientPool` is implemented as follows:
   ```
   public <R> R run(Action<R, C, E> action) throws E, InterruptedException {
       C client = get();
       try {
         return action.run(client);
   
       } catch (Exception exc) {
         if (reconnectExc.isInstance(exc)) {
           try {
             client = reconnect(client);
           } catch (Exception ignored) {
             // if reconnection throws any exception, rethrow the original failure
             throw reconnectExc.cast(exc);
           }
   
           return action.run(client);
         }
   
         throw exc;
   
       } finally {
         release(client);
       }
     }
   
   ```
   
   In `HiveClientPool`, `reconnectExc` is `TTransportException`, which extends `TException`. Therefore, when the client throws a `TTransportException`, the pool reconnects to the HMS.
   
   ```
   protected HiveMetaStoreClient reconnect(HiveMetaStoreClient client) {
       try {
         client.close();
         client.reconnect();
       } catch (MetaException e) {
         throw new RuntimeMetaException(e, "Failed to reconnect to Hive Metastore");
       }
       return client;
     }
   ```
   For any exception other than `TTransportException`, the previously established connection is not released.
   
   However, in the `HiveMetaStoreClient` class, some methods such as `getAllTables` and `getAllDatabases` throw a `MetaException`, which also extends `TException`, when an HMS is down.
   ```
   public List<String> getAllTables(String dbname) throws MetaException {
       try {
         return filterHook.filterTableNames(dbname, client.get_all_tables(dbname));
       } catch (Exception e) {
         MetaStoreUtils.logAndThrowMetaException(e);
       }
       return null;
     }
   
    /**
      * Catches exceptions that can't be handled and bundles them to MetaException
      *
      * @param e
      * @throws MetaException
      */
     static void logAndThrowMetaException(Exception e) throws MetaException {
       String exInfo = "Got exception: " + e.getClass().getName() + " "
           + e.getMessage();
       LOG.error(exInfo, e);
       LOG.error("Converting exception to MetaException");
       throw new MetaException(exInfo);
     }
   ```
   The `MetaException` replaces the original `TTransportException` and keeps only its class name and message as text, so the `reconnectExc` check does not match. As a result, the stale connection is not released and an error message is returned to the user.
   > ERROR - Got exception: org.apache.thrift.transport.TTransportException null
   > org.apache.thrift.transport.TTransportException: null
   > ...
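
The effect can be reproduced without Hive on the classpath. The sketch below uses stand-in exception classes (hypothetical, not the real Thrift or Hive classes) with the same inheritance shape as described above, a `wrap` helper that mimics what `logAndThrowMetaException` does, and a `triggersReconnect` helper that mimics the `reconnectExc.isInstance(exc)` check. All names are assumptions made for illustration.

```java
public class WrappedExceptionDemo {
  // Stand-ins for the Thrift/Hive hierarchy (hypothetical, not the real classes).
  static class TException extends Exception {
    TException(String message) { super(message); }
  }

  static class TTransportException extends TException {
    TTransportException(String message) { super(message); }
  }

  static class MetaException extends TException {
    MetaException(String message) { super(message); }
  }

  // Mimics logAndThrowMetaException: only class name and message survive, as text.
  static MetaException wrap(Exception e) {
    return new MetaException("Got exception: " + e.getClass().getName() + " " + e.getMessage());
  }

  // Mimics the pool's check: reconnectExc.isInstance(exc).
  static boolean triggersReconnect(Exception exc) {
    return TTransportException.class.isInstance(exc);
  }

  public static void main(String[] args) {
    Exception raw = new TTransportException(null);
    Exception wrapped = wrap(raw);

    System.out.println("raw triggers reconnect: " + triggersReconnect(raw));
    System.out.println("wrapped triggers reconnect: " + triggersReconnect(wrapped));
    // The original exception is not attached as a cause, so it cannot be unwrapped.
    System.out.println("wrapped cause: " + wrapped.getCause());
  }
}
```

Because the wrapper carries no cause, walking `getCause()` in the pool would not help either; only the message text still mentions the transport failure.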
   
   ### Solution
   The only solution we can think of at present is to add special detection for a `MetaException` whose error message contains `org.apache.thrift.transport.TTransportException`. Once such a message is detected, the current client connection is closed and re-established.
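
A minimal sketch of that detection, assuming the wrapped message always contains the fully qualified class name as in the log above. `WrappedTransportCheck` and `looksLikeTransportFailure` are hypothetical names, and a plain `Exception` stands in for `MetaException`; this is not existing Iceberg code.

```java
public class WrappedTransportCheck {
  // Marker text taken from the error message quoted in the log above.
  static final String MARKER = "org.apache.thrift.transport.TTransportException";

  /** Returns true when the exception message names a Thrift transport failure. */
  static boolean looksLikeTransportFailure(Exception exc) {
    String message = exc.getMessage();
    return message != null && message.contains(MARKER);
  }

  public static void main(String[] args) {
    // A plain Exception stands in for MetaException in this sketch.
    Exception wrapped = new Exception("Got exception: " + MARKER + " null");
    Exception other = new Exception("Database not found");

    System.out.println(looksLikeTransportFailure(wrapped));
    System.out.println(looksLikeTransportFailure(other));
  }
}
```

In the pool, such a check would sit next to the existing `reconnectExc.isInstance(exc)` test, so that either condition leads to `reconnect(client)`. It depends on the exact text produced by `logAndThrowMetaException`.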
   
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Jecarm commented on issue #1994: Failed to connect Hive Metastore using Hive client pool

Posted by GitBox <gi...@apache.org>.
Jecarm commented on issue #1994:
URL: https://github.com/apache/iceberg/issues/1994#issuecomment-757924934


   Sorry for the late reply, @pvary.
   Running the liveliness test on every established HMS Client is a solution.  
   I will try to implement it and submit a PR.




[GitHub] [iceberg] pvary commented on issue #1994: Failed to connect Hive Metastore using Hive client pool

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #1994:
URL: https://github.com/apache/iceberg/issues/1994#issuecomment-753970623


   Writing checks on exception message strings seems like bad practice to me.
   
   Maybe a liveliness check for the connections would be a better solution: we could check whether the HMS is active by calling a specific low-cost method that is not plagued by the exception wrapping you found above?
   
   What do you think @Jecarm?
   
   Thanks,
   Peter




[GitHub] [iceberg] pvary commented on issue #1994: Failed to connect Hive Metastore using Hive client pool

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #1994:
URL: https://github.com/apache/iceberg/issues/1994#issuecomment-757961378


   Thanks @Jecarm!




[GitHub] [iceberg] aokolnychyi commented on issue #1994: Failed to connect Hive Metastore using Hive client pool

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on issue #1994:
URL: https://github.com/apache/iceberg/issues/1994#issuecomment-805502489


   I think the issue has been fixed. Closing.




[GitHub] [iceberg] pvary commented on issue #1994: Failed to connect Hive Metastore using Hive client pool

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #1994:
URL: https://github.com/apache/iceberg/issues/1994#issuecomment-755329968


   Should we run the liveliness test on every established HMS Client? That would ensure that working clients are not restarted, while clients connected to a stopped HMS are closed and their connections re-established.
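
A rough sketch of such a sweep, under stated assumptions: `FakeClient`, `ping`, and `sweep` are hypothetical names, `ping` stands for whichever cheap HMS call turns out to be usable as a probe, and `reconnect` simulates failing over to a live server. None of this is the real `HiveMetaStoreClient` API.

```java
import java.util.ArrayList;
import java.util.List;

public class LivenessSweepDemo {
  /** Hypothetical stand-in for a pooled HMS client. */
  static class FakeClient {
    final String server;
    boolean serverUp = true;
    int reconnects = 0;

    FakeClient(String server) { this.server = server; }

    // Assumed cheap probe that reports failure directly instead of wrapping it.
    boolean ping() { return serverUp; }

    // Simulates closing the connection and failing over to a live server.
    void reconnect() {
      reconnects++;
      serverUp = true;
    }
  }

  /** Probes every client and reconnects only those whose probe fails. */
  static int sweep(List<FakeClient> clients) {
    int repaired = 0;
    for (FakeClient client : clients) {
      if (!client.ping()) {
        client.reconnect();
        repaired++;
      }
    }
    return repaired;
  }

  public static void main(String[] args) {
    FakeClient first = new FakeClient("Server1");
    FakeClient second = new FakeClient("Server2");
    first.serverUp = false; // Server1 has been stopped

    List<FakeClient> clients = new ArrayList<>();
    clients.add(first);
    clients.add(second);

    System.out.println("repaired: " + sweep(clients));
    System.out.println("reconnects: " + first.reconnects + ", " + second.reconnects);
  }
}
```

Only the client whose probe fails is touched, so the connection to the healthy server stays open.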




[GitHub] [iceberg] rdblue commented on issue #1994: Failed to connect Hive Metastore using Hive client pool

Posted by GitBox <gi...@apache.org>.
rdblue commented on issue #1994:
URL: https://github.com/apache/iceberg/issues/1994#issuecomment-784660095


   Looks like #2119 is in so I'll close this. Thanks, @Jecarm!




[GitHub] [iceberg] Jecarm commented on issue #1994: Failed to connect Hive Metastore using Hive client pool

Posted by GitBox <gi...@apache.org>.
Jecarm commented on issue #1994:
URL: https://github.com/apache/iceberg/issues/1994#issuecomment-755133939


   A problem occurs when the client pool has established multiple client connections. The liveliness check, which calls a specific low-cost method that is not wrapped by `MetaException`, may return OK because it runs on a different connection to an active HMS.  
   How can we ensure that the liveliness check uses the same connection as the previous request, @pvary?
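
One possible way to keep the probe on the same connection is to run it inside the pool's `run` method, on the client that just failed and before that client is released. The sketch below only illustrates that idea: `FakeClient`, `ping`, and this simplified `run` are hypothetical, and an unchecked exception stands in for the wrapped `MetaException`.

```java
public class SameClientCheckDemo {
  /** Hypothetical stand-in for a pooled HMS client. */
  static class FakeClient {
    boolean serverUp;
    int reconnects = 0;

    FakeClient(boolean serverUp) { this.serverUp = serverUp; }

    // Assumed cheap probe on this client's own connection.
    boolean ping() { return serverUp; }

    // Simulates closing the connection and failing over to a live server.
    void reconnect() {
      reconnects++;
      serverUp = true;
    }

    // Stands in for getAllTables; fails with a wrapped-style message when the server is down.
    String listTables() {
      if (!serverUp) {
        throw new IllegalStateException("Got exception: TTransportException null");
      }
      return "t1,t2";
    }
  }

  /** Runs the action; on failure, probes the same client before deciding to retry once. */
  static String run(FakeClient client) {
    try {
      return client.listTables();
    } catch (RuntimeException exc) {
      if (client.ping()) {
        throw exc; // the connection is healthy, so this is not a transport problem
      }
      client.reconnect();
      return client.listTables();
    }
  }

  public static void main(String[] args) {
    FakeClient dead = new FakeClient(false);
    System.out.println(run(dead) + " after " + dead.reconnects + " reconnect(s)");

    FakeClient healthy = new FakeClient(true);
    System.out.println(run(healthy) + " after " + healthy.reconnects + " reconnect(s)");
  }
}
```

Since the probe is called on the `client` variable already held by `run`, it cannot be answered by another pooled connection.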
    
   




[GitHub] [iceberg] aokolnychyi closed issue #1994: Failed to connect Hive Metastore using Hive client pool

Posted by GitBox <gi...@apache.org>.
aokolnychyi closed issue #1994:
URL: https://github.com/apache/iceberg/issues/1994


   

