Posted to issues@spark.apache.org by "Hu Liu, (JIRA)" <ji...@apache.org> on 2017/09/05 06:29:00 UTC

[jira] [Updated] (SPARK-21918) HiveClient shouldn't share Hive object between different thread

     [ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hu Liu, updated SPARK-21918:
----------------------------
    Description: 
While testing the Spark Thrift Server, I found that all DDL statements are run as user hive even when hive.server2.enable.doAs=true.
The root cause is that a single Hive object is shared between different threads in HiveClientImpl:
{code:java}
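  // Returns the cached Hive object if one exists, whichever thread cached it.
  private def client: Hive = {
    if (clientLoader.cachedHive != null) {
      clientLoader.cachedHive.asInstanceOf[Hive]
    } else {
      // First call: create the Hive object and cache it for all later callers.
      val c = Hive.get(conf)
      clientLoader.cachedHive = c
      c
    }
  }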
{code}
But in impersonation mode, the Hive object should be shared only within a thread, so that the metastore client in Hive is associated with the right user.

To fix it, we can pass the Hive object of the parent thread to the child thread when running the SQL.
I already have an initial patch ready for review, and I'm glad to work on this if someone could assign it to me.
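
To make the proposed hand-off concrete, here is a minimal sketch of the idea. It is not the actual patch and does not use the real Hive classes: FakeHive, HandOff, runInChild and userSeenByChild are hypothetical names invented for illustration. FakeHive only imitates a per-thread client that remembers which user it was created for, on the assumption that the real Hive object is looked up per thread in a similar way.

```scala
import java.util.concurrent.{Callable, Executors}

// Hypothetical stand-in for the Hive object: it only records the user
// its metastore client would be associated with.
final class FakeHive(val user: String)

object FakeHive {
  // One instance per thread (assumed to resemble how Hive.get behaves).
  private val current = new ThreadLocal[FakeHive]

  // Returns this thread's instance, creating one for `defaultUser` if absent.
  def get(defaultUser: String): FakeHive = {
    if (current.get() == null) current.set(new FakeHive(defaultUser))
    current.get()
  }

  // Installs an existing instance as this thread's client.
  def set(h: FakeHive): Unit = current.set(h)
}

object HandOff {
  // Runs `body` on a fresh worker thread. If `parent` is defined, the
  // parent's client is installed on the worker before `body` runs.
  def runInChild[T](parent: Option[FakeHive])(body: => T): T = {
    val pool = Executors.newSingleThreadExecutor()
    try {
      pool.submit(new Callable[T] {
        override def call(): T = {
          parent.foreach(h => FakeHive.set(h))
          body
        }
      }).get()
    } finally {
      pool.shutdown()
    }
  }

  // The user the child thread ends up with when the parent's client
  // belongs to `user`. Without the hand-off, the child falls back to
  // creating its own client as the service user "hive".
  def userSeenByChild(user: String, handOff: Boolean): String = {
    val parent = new FakeHive(user)
    val passed = if (handOff) Some(parent) else None
    runInChild(passed)(FakeHive.get("hive").user)
  }
}
```

With the hand-off, HandOff.userSeenByChild("alice", handOff = true) yields the parent's user; with handOff = false the child creates its own client and sees the service user instead, which is the symptom described above.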


  was:
While testing the Spark Thrift Server, I found that all DDL statements are run as user hive even when hive.server2.enable.doAs=true.
The root cause is that a single Hive object is shared between different threads in HiveClientImpl:
{code:java}
  private def client: Hive = {
    if (clientLoader.cachedHive != null) {
      clientLoader.cachedHive.asInstanceOf[Hive]
    } else {
      val c = Hive.get(conf)
      clientLoader.cachedHive = c
      c
    }
  }
{code}
But in impersonation mode, the Hive object should be shared only within a thread.

To fix it, we can pass the Hive object of the current thread to the new thread when running the SQL.
I already have an initial patch ready for review, and I'm glad to work on this if someone could assign it to me.



> HiveClient shouldn't share Hive object between different thread
> ---------------------------------------------------------------
>
>                 Key: SPARK-21918
>                 URL: https://issues.apache.org/jira/browse/SPARK-21918
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Hu Liu,


