Posted to issues@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2017/05/09 20:46:04 UTC

[jira] [Commented] (DRILL-5496) Must restart drillbits whenever a secure Hive metastore is restarted

    [ https://issues.apache.org/jira/browse/DRILL-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003524#comment-16003524 ] 

Paul Rogers commented on DRILL-5496:
------------------------------------

Full stack trace at failure:

{code}
2017-05-01 16:03:00,232 [26f86b8b-c25f-4593-99b6-03f1d927aeee:foreman] WARN  o.a.d.e.s.h.DrillHiveMetaStoreClient - Failure while attempting to get hive databases. Retries once.
org.apache.hadoop.hive.metastore.api.MetaException: Got exception: org.apache.thrift.transport.TTransportException null
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:1213) ~[hive-metastore-1.2.0-mapr-1608.jar:1.2.0-mapr-1608]
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:1033) ~[hive-metastore-1.2.0-mapr-1608.jar:1.2.0-mapr-1608]
        at org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient.getDatabasesHelper(DrillHiveMetaStoreClient.java:203) ~[drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient$DatabaseLoader.load(DrillHiveMetaStoreClient.java:505) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient$DatabaseLoader.load(DrillHiveMetaStoreClient.java:498) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527) [guava-18.0.jar:na]
        at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319) [guava-18.0.jar:na]
        at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282) [guava-18.0.jar:na]
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197) [guava-18.0.jar:na]
        at com.google.common.cache.LocalCache.get(LocalCache.java:3937) [guava-18.0.jar:na]
        at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) [guava-18.0.jar:na]
        at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) [guava-18.0.jar:na]
        at org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient$HiveClientWithAuthzWithCaching.getDatabases(DrillHiveMetaStoreClient.java:411) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.hive.schema.HiveSchemaFactory$HiveSchema.getSubSchema(HiveSchemaFactory.java:139) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.hive.schema.HiveSchemaFactory$HiveSchema.<init>(HiveSchemaFactory.java:133) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.hive.schema.HiveSchemaFactory.registerSchemas(HiveSchemaFactory.java:118) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.hive.HiveStoragePlugin.registerSchemas(HiveStoragePlugin.java:100) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.StoragePluginRegistryImpl$DrillSchemaFactory.registerSchemas(StoragePluginRegistryImpl.java:396) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.SchemaTreeProvider.createRootSchema(SchemaTreeProvider.java:110) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.SchemaTreeProvider.createRootSchema(SchemaTreeProvider.java:99) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.ops.QueryContext.getRootSchema(QueryContext.java:163) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.ops.QueryContext.getRootSchema(QueryContext.java:152) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.ops.QueryContext.getNewDefaultSchema(QueryContext.java:138) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.planner.sql.SqlConverter.<init>(SqlConverter.java:110) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:101) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:79) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:1050) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:281) [drill-java-exec-1.10.0.jar:1.10.0]
{code}

As it turns out, the code already attempts to retry the connection (added by DRILL-4964):

{code}
  protected static List<String> getDatabasesHelper(final IMetaStoreClient mClient) throws TException {
    try {
      return mClient.getAllDatabases();
    } catch (MetaException e) {
      /*
         HiveMetaStoreClient is encapsulating both the MetaException/TExceptions inside MetaException.
         Since we don't have good way to differentiate, we will close older connection and retry once.
         This is only applicable for getAllTables and getAllDatabases method since other methods are
         properly throwing correct exceptions.
      */
      logger.warn("Failure while attempting to get hive databases. Retries once.", e);
      try {
        mClient.close();
      } catch (Exception ex) {
        logger.warn("Failure while attempting to close existing hive metastore connection. May leak connection.", ex);
      }
      mClient.reconnect();
      return mClient.getAllDatabases();
    }
  }
{code}

The log says:

{code}
WARN  o.a.d.e.s.h.DrillHiveMetaStoreClient - Failure while attempting to get hive databases. Retries once.
{code}

So, we got as far as the line that emits the log message. That is, we caught the exception on the invalid connection and attempted to retry.

But the stack trace in the log says:

{code}
DrillHiveMetaStoreClient.getDatabasesHelper(DrillHiveMetaStoreClient.java:203)
{code}

[Line 203|https://github.com/apache/drill/blob/b657d44feb527c8e3d83c9996c9220ec4d50aaf3/contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/DrillHiveMetaStoreClient.java] is the first call to {{mClient.getAllDatabases()}}. This suggests that the retry was never actually performed.

Consider the code snippet shown earlier. Stepping through the failure scenario shows that the following line fails:

{code}
      mClient.reconnect();
{code}

Evidently, {{reconnect()}} fails before the second {{getAllDatabases()}} call is reached, so this retry code does not work for a secure connection.
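
To make the control flow concrete, here is a minimal, self-contained sketch (not the Drill code). It assumes, as the analysis above suggests, that {{reconnect()}} throws on a secure connection. {{FakeClient}}, {{StaleSecureClient}} and {{getDatabasesWithRetry}} are hypothetical names made up for this illustration, and plain {{Exception}} stands in for the Hive and Thrift exception types.

```java
import java.util.ArrayList;
import java.util.List;

class RetryFlowSketch {

  // Hypothetical stand-in for the metastore client; NOT the real IMetaStoreClient.
  interface FakeClient {
    List<String> getAllDatabases() throws Exception;
    void close();
    void reconnect() throws Exception;
  }

  // Simulates a client whose connection went stale after a metastore restart.
  // Assumption for illustration: reconnect() also fails, as suspected for secure setups.
  static final class StaleSecureClient implements FakeClient {
    final List<String> calls = new ArrayList<>();

    @Override
    public List<String> getAllDatabases() throws Exception {
      calls.add("getAllDatabases");
      throw new Exception("stale connection");
    }

    @Override
    public void close() {
      calls.add("close");
    }

    @Override
    public void reconnect() throws Exception {
      calls.add("reconnect");
      throw new Exception("reconnect failed");
    }
  }

  // Same shape as getDatabasesHelper(): try once, then close, reconnect and retry once.
  static List<String> getDatabasesWithRetry(FakeClient client) throws Exception {
    try {
      return client.getAllDatabases();
    } catch (Exception e) {
      try {
        client.close();
      } catch (Exception ignored) {
        // A failed close is tolerated, as in the original helper.
      }
      client.reconnect();               // If this throws, the next line never runs.
      return client.getAllDatabases();  // The actual retry.
    }
  }

  public static void main(String[] args) {
    StaleSecureClient client = new StaleSecureClient();
    try {
      getDatabasesWithRetry(client);
    } catch (Exception e) {
      System.out.println("failed with: " + e.getMessage());
    }
    // Recorded call order; the second getAllDatabases is absent.
    System.out.println(client.calls);
  }
}
```

In this sketch the recorded calls are getAllDatabases, close, reconnect. The second {{getAllDatabases()}} never runs, so the caller sees the reconnect failure instead of a retried result.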

> Must restart drillbits whenever a secure Hive metastore is restarted
> --------------------------------------------------------------------
>
>                 Key: DRILL-5496
>                 URL: https://issues.apache.org/jira/browse/DRILL-5496
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.11.0
>
>
> DRILL-4964: "Drill fails to connect to hive metastore after hive metastore is restarted unless drillbits are restarted also" attempted to fix a bug in Drill in which Drill hangs if Hive is restarted. Now, we see that all subsequent "show schemas" queries fail.
> Steps to repro:
> 1. Build a secure cluster (we used MapR)
> 2. Install Hive and Drill services
> 3. Configure Drill impersonation and authentication
> 4. Restart hivemeta service
> 5. Connect to Drill and execute a query involving Hive storage; the issue occurs
> 6. Restart the drillbit services and execute the query; the issue no longer occurs
> The problem occurs in the same place as the earlier fix, but might represent a slightly different use case: in this case the connection is secure.


