You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/19 00:22:00 UTC
[jira] [Work logged] (HIVE-24404) Hive getUserName close db makes client operations lost metaStoreClient connection

     [ https://issues.apache.org/jira/browse/HIVE-24404?focusedWorklogId=846326&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-846326 ]

ASF GitHub Bot logged work on HIVE-24404:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 19/Feb/23 00:21
            Start Date: 19/Feb/23 00:21
    Worklog Time Spent: 10m 
      Work Description: github-actions[bot] commented on PR #1685:
URL: https://github.com/apache/hive/pull/1685#issuecomment-1435799619

   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.




Issue Time Tracking
-------------------

    Worklog Id:     (was: 846326)
    Time Spent: 2h  (was: 1h 50m)

> Hive getUserName close db makes client operations lost metaStoreClient connection
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-24404
>                 URL: https://issues.apache.org/jira/browse/HIVE-24404
>             Project: Hive
>          Issue Type: Bug
>          Components: Clients
>    Affects Versions: 2.3.7, 3.1.3, 4.0.0-alpha-1
>         Environment: os: centos 7
> spark: 3.0.1
> hive: 2.3.7
>            Reporter: Lichuanliang
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> I'm using spark to execute a drop partition sql will always encounter a lost metastore connection warning.
>  Spark ql:
> {code:java}
> alter table mydb.some_table drop if exists partition(dt = '2020-11-12',hh = '17');
> {code}
> Execution log:
> {code:java}
> 20/11/12 19:37:57 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.20/11/12 19:37:57 WARN SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.20/11/12 19:37:57 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect (1 of 1) after 1s. listPartitionsWithAuthInfoorg.apache.thrift.transport.TTransportException: Cannot write to null outputStream at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142) at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:185) at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:116) at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:70) at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.send_get_partitions_ps_with_auth(ThriftHiveMetastore.java:2562) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions_ps_with_auth(ThriftHiveMetastore.java:2549) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsWithAuthInfo(HiveMetaStoreClient.java:1209) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173) at com.sun.proxy.$Proxy32.listPartitionsWithAuthInfo(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2336) at com.sun.proxy.$Proxy32.listPartitionsWithAuthInfo(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:2555) at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:2581) at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$dropPartitions$2(HiveClientImpl.scala:628) at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245) at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242) at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108) at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$dropPartitions$1(HiveClientImpl.scala:622) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:294) at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:227) at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:226) at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:276) at org.apache.spark.sql.hive.client.HiveClientImpl.dropPartitions(HiveClientImpl.scala:617) at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$dropPartitions$1(HiveExternalCatalog.scala:1018) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:103) at org.apache.spark.sql.hive.HiveExternalCatalog.dropPartitions(HiveExternalCatalog.scala:1015) at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.dropPartitions(ExternalCatalogWithListener.scala:211) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.dropPartitions(SessionCatalog.scala:988) at org.apache.spark.sql.execution.command.AlterTableDropPartitionCommand.run(ddl.scala:581) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:229) at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3618) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100) at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3616) at org.apache.spark.sql.Dataset.<init>(Dataset.scala:229) at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97) at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:607) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:602) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:650) at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:64) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:377) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:496) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1$adapted(SparkSQLCLIDriver.scala:490) at scala.collection.Iterator.foreach(Iterator.scala:941) at scala.collection.Iterator.foreach$(Iterator.scala:941) at scala.collection.AbstractIterator.foreach(Iterator.scala:1429) at scala.collection.IterableLike.foreach(IterableLike.scala:74) at scala.collection.IterableLike.foreach$(IterableLike.scala:73) at scala.collection.AbstractIterable.foreach(Iterable.scala:56) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:490) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336) at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:474) at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:490) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:208) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Response codeTime taken: 4.192 seconds
> {code}
> The problem raised from the getPartitions method, where the metastore client getMSC get will later been closed by the getUserName method. The getUserName method will close the current metastore in during setAuth and cause the underlying thrift transport closed.
> {code:java}
> public List<Partition> getPartitions(Table tbl, Map<String, String> partialPartSpec,
>     short limit)
> throws HiveException {
>   if (!tbl.isPartitioned()) {
>     throw new HiveException(ErrorMsg.TABLE_NOT_PARTITIONED, tbl.getTableName());
>   }
>   List<String> partialPvals = MetaStoreUtils.getPvals(tbl.getPartCols(), partialPartSpec);
>   List<org.apache.hadoop.hive.metastore.api.Partition> partitions = null;
>   try {
>     partitions = getMSC().listPartitionsWithAuthInfo(tbl.getDbName(), tbl.getTableName(),
>         partialPvals, limit, getUserName(), getGroupNames());
>   } catch (Exception e) {
>     throw new HiveException(e);
>   }
>   List<Partition> qlPartitions = new ArrayList<Partition>();
>   for (org.apache.hadoop.hive.metastore.api.Partition p : partitions) {
>     qlPartitions.add( new Partition(tbl, p));
>   }
>   return qlPartitions;
> }{code}
> I found another guy have raised the same issue for spark at a older version.
> https://issues.apache.org/jira/browse/SPARK-29409
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)