Posted to issues@spark.apache.org by "AnfengYuan (JIRA)" <ji...@apache.org> on 2017/07/18 07:54:00 UTC

[jira] [Updated] (SPARK-21452) SessionState in HiveClientImpl is never closed

     [ https://issues.apache.org/jira/browse/SPARK-21452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

AnfengYuan updated SPARK-21452:
-------------------------------
    Description: 
When I use beeline to connect to the Spark Thrift Server, two SessionStates are created. When the beeline connection is closed, only one of them is closed; the other is never closed until the Thrift Server itself shuts down.

Since each SessionState creates HDFS and local directories, there are now tens of thousands of leftover directories, and I have to delete them manually.

One SessionState is created by HiveSession and is closed by closeSession() in SessionManager; the other is created by HiveClientImpl, and I cannot find any place where it is closed.

How and where can I close this SessionState?

Correct me if I am wrong.
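
A sketch of the kind of cleanup hook I would expect (this is illustrative only: HiveClientImpl keeps its SessionState in a private field and does not currently expose any such method, so the names below other than SessionState.close() are hypothetical):

{code:title=sketch (hypothetical)|borderStyle=solid}
// Hypothetical hook, invoked from SessionManager.closeSession() after the
// HiveSession's own SessionState has been closed.
def closeClientSessionState(client: HiveClientImpl): Unit = {
  // state is the SessionState that HiveClientImpl created when the client
  // was constructed; SessionState.close() deletes the HDFS and local
  // scratch directories it created (the /tmp/hive/... paths in the log below).
  client.state.close()
}
{code}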

{code:title=spark.log|borderStyle=solid}
17/07/18 15:34:28.759 INFO ThriftCLIService: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V8
17/07/18 15:34:28.772 INFO SessionState: Created local directory: /tmp/288efea6-8cb4-4358-afbd-2663c42bfc6a_resources
17/07/18 15:34:28.774 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/288efea6-8cb4-4358-afbd-2663c42bfc6a
17/07/18 15:34:28.775 INFO SessionState: Created local directory: /tmp/hadoop/288efea6-8cb4-4358-afbd-2663c42bfc6a
17/07/18 15:34:28.777 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/288efea6-8cb4-4358-afbd-2663c42bfc6a/_tmp_space.db
17/07/18 15:34:28.779 INFO HiveSessionImpl: Operation log session directory is created: /tmp/hadoop/operation_logs/288efea6-8cb4-4358-afbd-2663c42bfc6a
17/07/18 15:34:28.862 WARN HiveConf: HiveConf of name hive.mapred.map.tasks.speculative.execution does not exist
17/07/18 15:34:28.862 WARN HiveConf: HiveConf of name hive.server2.thrift.http.max.worker.threads does not exist
17/07/18 15:34:28.868 INFO metastore: Trying to connect to metastore with URI thrift://xxx.xxx.xxx:9083
17/07/18 15:34:28.869 INFO metastore: Connected to metastore.
17/07/18 15:34:28.872 INFO SessionState: Created local directory: /tmp/76fe95c0-4698-4a2e-81a0-3fbf2e863d4d_resources
17/07/18 15:34:28.873 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/76fe95c0-4698-4a2e-81a0-3fbf2e863d4d
17/07/18 15:34:28.874 INFO SessionState: Created local directory: /tmp/hadoop/76fe95c0-4698-4a2e-81a0-3fbf2e863d4d
17/07/18 15:34:28.876 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/76fe95c0-4698-4a2e-81a0-3fbf2e863d4d/_tmp_space.db
17/07/18 15:34:28.877 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is hdfs://ns1/user/hive/warehouse
17/07/18 15:34:28.877 INFO SQLStdHiveAccessController: Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=76fe95c0-4698-4a2e-81a0-3fbf2e863d4d, clientType=HIVECLI]
17/07/18 15:34:28.878 INFO metastore: Mestastore configuration hive.metastore.filter.hook changed from org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
{code}

  was:
When I use beeline to connect to spark thriftserver, there are two SessionState created, but when the beeline connection is closed, only one SessionState is closed, the other will never be closed until the spark thriftserver is shutdown.

Since SessionState created hdfs and local directories, now there are tens of thousands  directories, and I have to delete them manually.

One SessionState is created by HiveSession, and is closed by closeSession() in SessionManager, the other one is created by HiveClientImpl, and I can't see any place where it is closed.

I want to ask how and where can I close this SessionState?

Correct me if I was wrong.

{code:title=spark.log|borderStyle=solid}
17/07/18 15:34:28.759 INFO ThriftCLIService: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V8
17/07/18 15:34:28.772 INFO SessionState: Created local directory: /tmp/288efea6-8cb4-4358-afbd-2663c42bfc6a_resources
17/07/18 15:34:28.774 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/288efea6-8cb4-4358-afbd-2663c42bfc6a
17/07/18 15:34:28.775 INFO SessionState: Created local directory: /tmp/hadoop/288efea6-8cb4-4358-afbd-2663c42bfc6a
17/07/18 15:34:28.777 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/288efea6-8cb4-4358-afbd-2663c42bfc6a/_tmp_space.db
17/07/18 15:34:28.779 INFO HiveSessionImpl: Operation log session directory is created: /tmp/hadoop/operation_logs/288efea6-8cb4-4358-afbd-2663c42bfc6a
17/07/18 15:34:28.862 WARN HiveConf: HiveConf of name hive.mapred.map.tasks.speculative.execution does not exist
17/07/18 15:34:28.862 WARN HiveConf: HiveConf of name hive.server2.thrift.http.max.worker.threads does not exist
17/07/18 15:34:28.868 INFO metastore: Trying to connect to metastore with URI thrift://a01-r15-2bb10-i28-160.jd.local:9083
17/07/18 15:34:28.869 INFO metastore: Connected to metastore.
17/07/18 15:34:28.872 INFO SessionState: Created local directory: /tmp/76fe95c0-4698-4a2e-81a0-3fbf2e863d4d_resources
17/07/18 15:34:28.873 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/76fe95c0-4698-4a2e-81a0-3fbf2e863d4d
17/07/18 15:34:28.874 INFO SessionState: Created local directory: /tmp/hadoop/76fe95c0-4698-4a2e-81a0-3fbf2e863d4d
17/07/18 15:34:28.876 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/76fe95c0-4698-4a2e-81a0-3fbf2e863d4d/_tmp_space.db
17/07/18 15:34:28.877 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is hdfs://ns1/user/hive/warehouse
17/07/18 15:34:28.877 INFO SQLStdHiveAccessController: Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=76fe95c0-4698-4a2e-81a0-3fbf2e863d4d, clientType=HIVECLI]
17/07/18 15:34:28.878 INFO metastore: Mestastore configuration hive.metastore.filter.hook changed from org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
{code}


> SessionState in HiveClientImpl is never closed
> ----------------------------------------------
>
>                 Key: SPARK-21452
>                 URL: https://issues.apache.org/jira/browse/SPARK-21452
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1, 2.2.0
>            Reporter: AnfengYuan
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
