You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by yaooqinn <gi...@git.apache.org> on 2017/07/16 16:03:24 UTC

[GitHub] spark pull request #18648: [SPARK-21428] Set IsolatedClientLoader off while ...

GitHub user yaooqinn opened a pull request:

    https://github.com/apache/spark/pull/18648

    [SPARK-21428] Set IsolatedClientLoader off while using builtin Hive jars for reusing CliSessionState

    ## What changes were proposed in this pull request?
    
    Set isolated to false while using builtin hive jars
    
    ## How was this patch tested?
    
    Manually verified: `hive.exec.strachdir` was only created once because of reusing cliSessionState 
    ```java
    ➜  spark git:(SPARK-21428) ✗ bin/spark-sql --conf spark.sql.hive.metastore.jars=builtin
    
    log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    17/07/16 23:59:27 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    17/07/16 23:59:27 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
    17/07/16 23:59:27 INFO ObjectStore: ObjectStore, initialize called
    17/07/16 23:59:28 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
    17/07/16 23:59:28 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
    17/07/16 23:59:29 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
    17/07/16 23:59:30 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
    17/07/16 23:59:30 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
    17/07/16 23:59:31 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
    17/07/16 23:59:31 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
    17/07/16 23:59:31 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
    17/07/16 23:59:31 INFO ObjectStore: Initialized ObjectStore
    17/07/16 23:59:31 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
    17/07/16 23:59:31 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
    17/07/16 23:59:32 INFO HiveMetaStore: Added admin role in metastore
    17/07/16 23:59:32 INFO HiveMetaStore: Added public role in metastore
    17/07/16 23:59:32 INFO HiveMetaStore: No user is added in admin role, since config is empty
    17/07/16 23:59:32 INFO HiveMetaStore: 0: get_all_databases
    17/07/16 23:59:32 INFO audit: ugi=Kent	ip=unknown-ip-addr	cmd=get_all_databases
    17/07/16 23:59:32 INFO HiveMetaStore: 0: get_functions: db=default pat=*
    17/07/16 23:59:32 INFO audit: ugi=Kent	ip=unknown-ip-addr	cmd=get_functions: db=default pat=*
    17/07/16 23:59:32 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
    17/07/16 23:59:32 INFO SessionState: Created local directory: /var/folders/k2/04p4k4ws73l6711h_mz2_tq00000gn/T/beea7261-221a-4711-89e8-8b12a9d37370_resources
    17/07/16 23:59:32 INFO SessionState: Created HDFS directory: /tmp/hive/Kent/beea7261-221a-4711-89e8-8b12a9d37370
    17/07/16 23:59:32 INFO SessionState: Created local directory: /var/folders/k2/04p4k4ws73l6711h_mz2_tq00000gn/T/Kent/beea7261-221a-4711-89e8-8b12a9d37370
    17/07/16 23:59:32 INFO SessionState: Created HDFS directory: /tmp/hive/Kent/beea7261-221a-4711-89e8-8b12a9d37370/_tmp_space.db
    17/07/16 23:59:32 INFO SparkContext: Running Spark version 2.3.0-SNAPSHOT
    17/07/16 23:59:32 INFO SparkContext: Submitted application: SparkSQL::10.0.0.8
    17/07/16 23:59:32 INFO SecurityManager: Changing view acls to: Kent
    17/07/16 23:59:32 INFO SecurityManager: Changing modify acls to: Kent
    17/07/16 23:59:32 INFO SecurityManager: Changing view acls groups to:
    17/07/16 23:59:32 INFO SecurityManager: Changing modify acls groups to:
    17/07/16 23:59:32 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(Kent); groups with view permissions: Set(); users  with modify permissions: Set(Kent); groups with modify permissions: Set()
    17/07/16 23:59:33 INFO Utils: Successfully started service 'sparkDriver' on port 51889.
    17/07/16 23:59:33 INFO SparkEnv: Registering MapOutputTracker
    17/07/16 23:59:33 INFO SparkEnv: Registering BlockManagerMaster
    17/07/16 23:59:33 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
    17/07/16 23:59:33 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
    17/07/16 23:59:33 INFO DiskBlockManager: Created local directory at /private/var/folders/k2/04p4k4ws73l6711h_mz2_tq00000gn/T/blockmgr-9cfae28a-01e9-4c73-a1f1-f76fa52fc7a5
    17/07/16 23:59:33 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
    17/07/16 23:59:33 INFO SparkEnv: Registering OutputCommitCoordinator
    17/07/16 23:59:33 INFO Utils: Successfully started service 'SparkUI' on port 4040.
    17/07/16 23:59:33 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.0.8:4040
    17/07/16 23:59:33 INFO Executor: Starting executor ID driver on host localhost
    17/07/16 23:59:33 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 51890.
    17/07/16 23:59:33 INFO NettyBlockTransferService: Server created on 10.0.0.8:51890
    17/07/16 23:59:33 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
    17/07/16 23:59:33 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.0.0.8, 51890, None)
    17/07/16 23:59:33 INFO BlockManagerMasterEndpoint: Registering block manager 10.0.0.8:51890 with 366.3 MB RAM, BlockManagerId(driver, 10.0.0.8, 51890, None)
    17/07/16 23:59:33 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.0.0.8, 51890, None)
    17/07/16 23:59:33 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.0.0.8, 51890, None)
    17/07/16 23:59:34 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/Users/Kent/Documents/spark/spark-warehouse').
    17/07/16 23:59:34 INFO SharedState: Warehouse path is 'file:/Users/Kent/Documents/spark/spark-warehouse'.
    17/07/16 23:59:34 INFO HiveUtils: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
    17/07/16 23:59:34 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.2) is /user/hive/warehouse
    17/07/16 23:59:34 INFO HiveMetaStore: 0: get_database: default
    17/07/16 23:59:34 INFO audit: ugi=Kent	ip=unknown-ip-addr	cmd=get_database: default
    17/07/16 23:59:34 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.2) is /user/hive/warehouse
    17/07/16 23:59:34 INFO HiveMetaStore: 0: get_database: global_temp
    17/07/16 23:59:34 INFO audit: ugi=Kent	ip=unknown-ip-addr	cmd=get_database: global_temp
    17/07/16 23:59:34 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
    17/07/16 23:59:34 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.2) is /user/hive/warehouse
    17/07/16 23:59:34 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
    spark-sql>
    
    ```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yaooqinn/spark SPARK-21428

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18648.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18648
    
----
commit 394b4716daefd76cf99b6eb0f90c37b23cfa12d1
Author: Kent Yao <ya...@hotmail.com>
Date:   2017-07-16T15:52:37Z

    set isolateOn to false while using builtin hive jars

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r131807040
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala ---
    @@ -312,7 +323,7 @@ private[spark] object HiveUtils extends Logging {
             hadoopConf = hadoopConf,
             execJars = jars.toSeq,
             config = configurations,
    -        isolationOn = true,
    +        isolationOn = isCliSessionState(),
    --- End diff --
    
    @cloud-fan According to [HiveClientImpl.scala#L140](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L140), the cliSessionState shall be reused. But because of IsolateClientClassloader, [HiveClientImpl.scala#L138](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L138) will be `null`. Then it always goes to the `else` branch to create and start a new `session.SessionState`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    The description is not clear, at least I get understood after diving into the code changes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132080912
  
    --- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveCliSessionStateSuite.scala ---
    @@ -0,0 +1,56 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.hive.thriftserver
    +
    +import org.apache.hadoop.hive.cli.CliSessionState
    +import org.apache.hadoop.hive.conf.HiveConf
    +import org.apache.hadoop.hive.ql.session.SessionState
    +
    +import org.apache.spark.{SparkConf, SparkFunSuite}
    +import org.apache.spark.deploy.SparkHadoopUtil
    +import org.apache.spark.sql.hive.HiveUtils
    +
    +class HiveCliSessionStateSuite extends SparkFunSuite {
    +
    +  test("CliSessionState will be reused") {
    +    val hiveConf = new HiveConf(classOf[SessionState])
    +    HiveUtils.newTemporaryConfiguration(useInMemoryDerby = false).foreach {
    +      case (key, value) => hiveConf.set(key, value)
    +    }
    +    val sessionState: SessionState = new CliSessionState(hiveConf)
    +    SessionState.start(sessionState)
    +    val s1 = SessionState.get
    +    val sparkConf = new SparkConf()
    +    val hadoopConf = SparkHadoopUtil.get.newConfiguration(sparkConf)
    +    val s2 = HiveUtils.newClientForMetadata(sparkConf, hadoopConf).getState
    --- End diff --
    
    with IsolateClientClassload, this seems to cause ClassCastException


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    @cloud-fan 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r131814534
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -269,6 +232,8 @@ private[hive] class HiveClientImpl(
         }
       }
     
    +  override def toString: String = state.toString
    --- End diff --
    
    This is not a reasonable `toString` implementation for `HiveClientImpl`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    ping @jiangxb1987 @cloud-fan again


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18648


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    ok to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    OK to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132190077
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -105,107 +105,69 @@ private[hive] class HiveClientImpl(
       // Create an internal session state for this HiveClientImpl.
       val state: SessionState = {
         val original = Thread.currentThread().getContextClassLoader
    -    // Switch to the initClassLoader.
    -    Thread.currentThread().setContextClassLoader(initClassLoader)
    -
    -    // Set up kerberos credentials for UserGroupInformation.loginUser within
    -    // current class loader
    -    if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
    -      val principalName = sparkConf.get("spark.yarn.principal")
    -      val keytabFileName = sparkConf.get("spark.yarn.keytab")
    -      if (!new File(keytabFileName).exists()) {
    -        throw new SparkException(s"Keytab file: ${keytabFileName}" +
    -          " specified in spark.yarn.keytab does not exist")
    -      } else {
    -        logInfo("Attempting to login to Kerberos" +
    -          s" using principal: ${principalName} and keytab: ${keytabFileName}")
    -        UserGroupInformation.loginUserFromKeytab(principalName, keytabFileName)
    -      }
    -    }
    -
    -    def isCliSessionState(state: SessionState): Boolean = {
    -      var temp: Class[_] = if (state != null) state.getClass else null
    -      var found = false
    -      while (temp != null && !found) {
    -        found = temp.getName == "org.apache.hadoop.hive.cli.CliSessionState"
    -        temp = temp.getSuperclass
    +    if (clientLoader.isolationOn) {
    +      // Switch to the initClassLoader.
    +      Thread.currentThread().setContextClassLoader(initClassLoader)
    --- End diff --
    
    If `SessionState.get()` is None, we should still call `newState()` and init from `initClassLoader`, should we also switch in that case?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    ping @jiangxb1987 @cloud-fan anymore suggestions?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132155005
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -105,107 +105,69 @@ private[hive] class HiveClientImpl(
       // Create an internal session state for this HiveClientImpl.
       val state: SessionState = {
         val original = Thread.currentThread().getContextClassLoader
    -    // Switch to the initClassLoader.
    -    Thread.currentThread().setContextClassLoader(initClassLoader)
    -
    -    // Set up kerberos credentials for UserGroupInformation.loginUser within
    -    // current class loader
    -    if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
    -      val principalName = sparkConf.get("spark.yarn.principal")
    -      val keytabFileName = sparkConf.get("spark.yarn.keytab")
    -      if (!new File(keytabFileName).exists()) {
    -        throw new SparkException(s"Keytab file: ${keytabFileName}" +
    -          " specified in spark.yarn.keytab does not exist")
    -      } else {
    -        logInfo("Attempting to login to Kerberos" +
    -          s" using principal: ${principalName} and keytab: ${keytabFileName}")
    -        UserGroupInformation.loginUserFromKeytab(principalName, keytabFileName)
    -      }
    -    }
    -
    -    def isCliSessionState(state: SessionState): Boolean = {
    -      var temp: Class[_] = if (state != null) state.getClass else null
    -      var found = false
    -      while (temp != null && !found) {
    -        found = temp.getName == "org.apache.hadoop.hive.cli.CliSessionState"
    -        temp = temp.getSuperclass
    +    if (clientLoader.isolationOn) {
    +      // Switch to the initClassLoader.
    +      Thread.currentThread().setContextClassLoader(initClassLoader)
    +      // Set up kerberos credentials for UserGroupInformation.loginUser within
    +      // current class loader
    +      if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
    +        val principal = sparkConf.get("spark.yarn.principal")
    +        val keytab = sparkConf.get("spark.yarn.keytab")
    +        SparkHadoopUtil.get.loginUserFromKeytab(principal, keytab)
           }
    -      found
    -    }
    -
    -    val ret = try {
    -      // originState will be created if not exists, will never be null
    -      val originalState = SessionState.get()
    -      if (isCliSessionState(originalState)) {
    -        // In `SparkSQLCLIDriver`, we have already started a `CliSessionState`,
    -        // which contains information like configurations from command line. Later
    -        // we call `SparkSQLEnv.init()` there, which would run into this part again.
    -        // so we should keep `conf` and reuse the existing instance of `CliSessionState`.
    -        originalState
    -      } else {
    -        val hiveConf = new HiveConf(classOf[SessionState])
    -        // 1: we set all confs in the hadoopConf to this hiveConf.
    -        // This hadoopConf contains user settings in Hadoop's core-site.xml file
    -        // and Hive's hive-site.xml file. Note, we load hive-site.xml file manually in
    -        // SharedState and put settings in this hadoopConf instead of relying on HiveConf
    -        // to load user settings. Otherwise, HiveConf's initialize method will override
    -        // settings in the hadoopConf. This issue only shows up when spark.sql.hive.metastore.jars
    -        // is not set to builtin. When spark.sql.hive.metastore.jars is builtin, the classpath
    -        // has hive-site.xml. So, HiveConf will use that to override its default values.
    -        hadoopConf.iterator().asScala.foreach { entry =>
    -          val key = entry.getKey
    -          val value = entry.getValue
    -          if (key.toLowerCase(Locale.ROOT).contains("password")) {
    -            logDebug(s"Applying Hadoop and Hive config to Hive Conf: $key=xxx")
    -          } else {
    -            logDebug(s"Applying Hadoop and Hive config to Hive Conf: $key=$value")
    -          }
    -          hiveConf.set(key, value)
    -        }
    -        // HiveConf is a Hadoop Configuration, which has a field of classLoader and
    -        // the initial value will be the current thread's context class loader
    -        // (i.e. initClassLoader at here).
    -        // We call initialConf.setClassLoader(initClassLoader) at here to make
    -        // this action explicit.
    -        hiveConf.setClassLoader(initClassLoader)
    -        // 2: we set all spark confs to this hiveConf.
    -        sparkConf.getAll.foreach { case (k, v) =>
    -          if (k.toLowerCase(Locale.ROOT).contains("password")) {
    -            logDebug(s"Applying Spark config to Hive Conf: $k=xxx")
    -          } else {
    -            logDebug(s"Applying Spark config to Hive Conf: $k=$v")
    -          }
    -          hiveConf.set(k, v)
    -        }
    -        // 3: we set all entries in config to this hiveConf.
    -        extraConfig.foreach { case (k, v) =>
    -          if (k.toLowerCase(Locale.ROOT).contains("password")) {
    -            logDebug(s"Applying extra config to HiveConf: $k=xxx")
    -          } else {
    -            logDebug(s"Applying extra config to HiveConf: $k=$v")
    -          }
    -          hiveConf.set(k, v)
    -        }
    -        val state = new SessionState(hiveConf)
    -        if (clientLoader.cachedHive != null) {
    -          Hive.set(clientLoader.cachedHive.asInstanceOf[Hive])
    -        }
    -        SessionState.start(state)
    -        state.out = new PrintStream(outputBuffer, true, "UTF-8")
    -        state.err = new PrintStream(outputBuffer, true, "UTF-8")
    -        state
    +      try {
    +        newState()
    +      } finally {
    +        Thread.currentThread().setContextClassLoader(original)
           }
    -    } finally {
    -      Thread.currentThread().setContextClassLoader(original)
    +    } else {
    +      Option(SessionState.get()).getOrElse(newState())
         }
    -    ret
       }
     
       // Log the default warehouse location.
       logInfo(
         s"Warehouse location for Hive client " +
           s"(version ${version.fullVersion}) is ${conf.get("hive.metastore.warehouse.dir")}")
     
    +  private def newState(): SessionState = {
    +    val hiveConf = new HiveConf(classOf[SessionState])
    +    // HiveConf is a Hadoop Configuration, which has a field of classLoader and
    +    // the initial value will be the current thread's context class loader
    +    // (i.e. initClassLoader at here).
    +    // We call initialConf.setClassLoader(initClassLoader) at here to make
    +    // this action explicit.
    +    hiveConf.setClassLoader(initClassLoader)
    +
    +    // 1: Take all from the hadoopConf to this hiveConf.
    +    // This hadoopConf contains user settings in Hadoop's core-site.xml file
    +    // and Hive's hive-site.xml file. Note, we load hive-site.xml file manually in
    +    // SharedState and put settings in this hadoopConf instead of relying on HiveConf
    +    // to load user settings. Otherwise, HiveConf's initialize method will override
    +    // settings in the hadoopConf. This issue only shows up when spark.sql.hive.metastore.jars
    +    // is not set to builtin. When spark.sql.hive.metastore.jars is builtin, the classpath
    +    // has hive-site.xml. So, HiveConf will use that to override its default values.
    +    // 2: we set all spark confs to this hiveConf.
    +    // 3: we set all entries in config to this hiveConf.
    +    (hadoopConf.iterator().asScala.map(kv => kv.getKey -> kv.getValue)
    +      ++ sparkConf.getAll.toMap ++ extraConfig).foreach { case (k, v) =>
    +      if (k.toLowerCase(Locale.ROOT).contains("password")) {
    +        logDebug(s"Applying Spark config to Hive Conf: $k=xxx")
    --- End diff --
    
    This may also be Hadoop/Hive or extra config.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r131886506
  
    --- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveCliSessionStateSuite.scala ---
    @@ -0,0 +1,56 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.hive.thriftserver
    +
    +import org.apache.hadoop.hive.cli.CliSessionState
    +import org.apache.hadoop.hive.conf.HiveConf
    +import org.apache.hadoop.hive.ql.session.SessionState
    +
    +import org.apache.spark.{SparkConf, SparkFunSuite}
    +import org.apache.spark.deploy.SparkHadoopUtil
    +import org.apache.spark.sql.hive.HiveUtils
    +
    +class HiveCliSessionStateSuite extends SparkFunSuite {
    +
    +  test("CliSessionState will be reused") {
    +    val hiveConf = new HiveConf(classOf[SessionState])
    +    HiveUtils.newTemporaryConfiguration(useInMemoryDerby = false).foreach {
    +      case (key, value) => hiveConf.set(key, value)
    +    }
    +    val sessionState: SessionState = new CliSessionState(hiveConf)
    +    SessionState.start(sessionState)
    +    val s1 = SessionState.get
    +    val sparkConf = new SparkConf()
    +    val hadoopConf = SparkHadoopUtil.get.newConfiguration(sparkConf)
    +    val s2 = HiveUtils.newClientForMetadata(sparkConf, hadoopConf).getState
    --- End diff --
    
    how about `HiveUtils.newClientForMetadata(sparkConf, hadoopConf).asInstanceOf[HiveClientImpl].state`? then we don't need to add `getState`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    **[Test build #80448 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80448/testReport)** for PR 18648 at commit [`51fac11`](https://github.com/apache/spark/commit/51fac11abf4726e2a67426e00104886f1072525c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r131703828
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala ---
    @@ -312,7 +323,7 @@ private[spark] object HiveUtils extends Logging {
             hadoopConf = hadoopConf,
             execJars = jars.toSeq,
             config = configurations,
    -        isolationOn = true,
    +        isolationOn = isCliSessionState(),
    --- End diff --
    
    can you explain more about this? Why do we need to do this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132201166
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -105,107 +105,69 @@ private[hive] class HiveClientImpl(
       // Create an internal session state for this HiveClientImpl.
       val state: SessionState = {
         val original = Thread.currentThread().getContextClassLoader
    -    // Switch to the initClassLoader.
    -    Thread.currentThread().setContextClassLoader(initClassLoader)
    -
    -    // Set up kerberos credentials for UserGroupInformation.loginUser within
    -    // current class loader
    -    if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
    -      val principalName = sparkConf.get("spark.yarn.principal")
    -      val keytabFileName = sparkConf.get("spark.yarn.keytab")
    -      if (!new File(keytabFileName).exists()) {
    -        throw new SparkException(s"Keytab file: ${keytabFileName}" +
    -          " specified in spark.yarn.keytab does not exist")
    -      } else {
    -        logInfo("Attempting to login to Kerberos" +
    -          s" using principal: ${principalName} and keytab: ${keytabFileName}")
    -        UserGroupInformation.loginUserFromKeytab(principalName, keytabFileName)
    -      }
    -    }
    -
    -    def isCliSessionState(state: SessionState): Boolean = {
    -      var temp: Class[_] = if (state != null) state.getClass else null
    -      var found = false
    -      while (temp != null && !found) {
    -        found = temp.getName == "org.apache.hadoop.hive.cli.CliSessionState"
    -        temp = temp.getSuperclass
    +    if (clientLoader.isolationOn) {
    +      // Switch to the initClassLoader.
    +      Thread.currentThread().setContextClassLoader(initClassLoader)
    --- End diff --
    
    A user app new an CliSessionState instance with built in hive jars to trigger isolate off, then it detach this state, and then new a hive client again, this time isolate off and `SessionState.get() ` will be None, `newState()` will be called without change classloader, I think this is OK, because we never create a isolate class loader from beginning to end.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132195399
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -105,107 +105,69 @@ private[hive] class HiveClientImpl(
       // Create an internal session state for this HiveClientImpl.
       val state: SessionState = {
         val original = Thread.currentThread().getContextClassLoader
    -    // Switch to the initClassLoader.
    -    Thread.currentThread().setContextClassLoader(initClassLoader)
    -
    -    // Set up kerberos credentials for UserGroupInformation.loginUser within
    -    // current class loader
    -    if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
    -      val principalName = sparkConf.get("spark.yarn.principal")
    -      val keytabFileName = sparkConf.get("spark.yarn.keytab")
    -      if (!new File(keytabFileName).exists()) {
    -        throw new SparkException(s"Keytab file: ${keytabFileName}" +
    -          " specified in spark.yarn.keytab does not exist")
    -      } else {
    -        logInfo("Attempting to login to Kerberos" +
    -          s" using principal: ${principalName} and keytab: ${keytabFileName}")
    -        UserGroupInformation.loginUserFromKeytab(principalName, keytabFileName)
    -      }
    -    }
    -
    -    def isCliSessionState(state: SessionState): Boolean = {
    -      var temp: Class[_] = if (state != null) state.getClass else null
    -      var found = false
    -      while (temp != null && !found) {
    -        found = temp.getName == "org.apache.hadoop.hive.cli.CliSessionState"
    -        temp = temp.getSuperclass
    +    if (clientLoader.isolationOn) {
    +      // Switch to the initClassLoader.
    +      Thread.currentThread().setContextClassLoader(initClassLoader)
    --- End diff --
    
    If `SessionState.get` be null, then the IsolateOn will be turned on always. Only if we call `SessionState.detachSession`, will this happens?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    ping @gatorsmile could you help to review this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    **[Test build #80458 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80458/testReport)** for PR 18648 at commit [`c50a32f`](https://github.com/apache/spark/commit/c50a32f0476d27ad7aed6e972ba9b6ffb5263aea).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80460/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    **[Test build #80460 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80460/testReport)** for PR 18648 at commit [`c6ed2d7`](https://github.com/apache/spark/commit/c6ed2d7d0c5a8f3dc1e09f0a7d9991ede4a60a76).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    @jiangxb1987 Could this pr be merged?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132189977
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala ---
    @@ -136,8 +136,15 @@ class SparkHadoopUtil extends Logging {
     
       def getSecretKeyFromUserCredentials(key: String): Array[Byte] = { null }
     
    -  def loginUserFromKeytab(principalName: String, keytabFilename: String) {
    -    UserGroupInformation.loginUserFromKeytab(principalName, keytabFilename)
    +  def loginUserFromKeytab(principalName: String, keytabFilename: String): Unit = {
    +    if (!new File(keytabFilename).exists()) {
    +      throw new SparkException(s"Keytab file: ${keytabFilename}" +
    +        " specified in spark.yarn.keytab does not exist")
    --- End diff --
    
    ok,notice that


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r131829415
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -269,6 +232,8 @@ private[hive] class HiveClientImpl(
         }
       }
     
    +  override def toString: String = state.toString
    --- End diff --
    
    May i add `def getState(): SessionState` to `HiveClientImpl`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r131814291
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -581,17 +581,15 @@ object SparkSubmit extends CommandLineUtils {
         if (clusterManager == YARN || clusterManager == LOCAL) {
           if (args.principal != null) {
             require(args.keytab != null, "Keytab must be specified when principal is specified")
    -        if (!new File(args.keytab).exists()) {
    -          throw new SparkException(s"Keytab file: ${args.keytab} does not exist")
    -        } else {
    -          // Add keytab and principal configurations in sysProps to make them available
    -          // for later use; e.g. in spark sql, the isolated class loader used to talk
    -          // to HiveMetastore will use these settings. They will be set as Java system
    -          // properties and then loaded by SparkConf
    -          sysProps.put("spark.yarn.keytab", args.keytab)
    -          sysProps.put("spark.yarn.principal", args.principal)
    -
    -          UserGroupInformation.loginUserFromKeytab(args.principal, args.keytab)
    +        Try { SparkHadoopUtil.get.loginUserFromKeytab(args.principal, args.keytab)} match {
    +          case Success(_) =>
    +            // Add keytab and principal configurations in sysProps to make them available
    +            // for later use; e.g. in spark sql, the isolated class loader used to talk
    +            // to HiveMetastore will use these settings. They will be set as Java system
    +            // properties and then loaded by SparkConf
    +            sysProps.put("spark.yarn.keytab", args.keytab)
    +            sysProps.put("spark.yarn.principal", args.principal)
    +          case Failure(exception) => throw exception
    --- End diff --
    
    we can just write
    ```
    SparkHadoopUtil.get.loginUserFromKeytab(args.principal, args.keytab)
    // the comments ...
    sysProps.put("spark.yarn.keytab", args.keytab)
    sysProps.put("spark.yarn.principal", args.principal)
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r131829565
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -581,17 +581,15 @@ object SparkSubmit extends CommandLineUtils {
         if (clusterManager == YARN || clusterManager == LOCAL) {
           if (args.principal != null) {
             require(args.keytab != null, "Keytab must be specified when principal is specified")
    -        if (!new File(args.keytab).exists()) {
    -          throw new SparkException(s"Keytab file: ${args.keytab} does not exist")
    -        } else {
    -          // Add keytab and principal configurations in sysProps to make them available
    -          // for later use; e.g. in spark sql, the isolated class loader used to talk
    -          // to HiveMetastore will use these settings. They will be set as Java system
    -          // properties and then loaded by SparkConf
    -          sysProps.put("spark.yarn.keytab", args.keytab)
    -          sysProps.put("spark.yarn.principal", args.principal)
    -
    -          UserGroupInformation.loginUserFromKeytab(args.principal, args.keytab)
    +        Try { SparkHadoopUtil.get.loginUserFromKeytab(args.principal, args.keytab)} match {
    +          case Success(_) =>
    +            // Add keytab and principal configurations in sysProps to make them available
    +            // for later use; e.g. in spark sql, the isolated class loader used to talk
    +            // to HiveMetastore will use these settings. They will be set as Java system
    +            // properties and then loaded by SparkConf
    +            sysProps.put("spark.yarn.keytab", args.keytab)
    +            sysProps.put("spark.yarn.principal", args.principal)
    +          case Failure(exception) => throw exception
    --- End diff --
    
    OK


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r131830784
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -269,6 +232,8 @@ private[hive] class HiveClientImpl(
         }
       }
     
    +  override def toString: String = state.toString
    --- End diff --
    
    ok


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    **[Test build #80448 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80448/testReport)** for PR 18648 at commit [`51fac11`](https://github.com/apache/spark/commit/51fac11abf4726e2a67426e00104886f1072525c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r131851577
  
    --- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveCliSessionStateSuite.scala ---
    @@ -0,0 +1,57 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.hive.thriftserver
    +
    +import org.apache.hadoop.hive.cli.CliSessionState
    +import org.apache.hadoop.hive.conf.HiveConf
    +import org.apache.hadoop.hive.ql.session.SessionState
    +
    +import org.apache.spark.{SparkConf, SparkFunSuite}
    +import org.apache.spark.deploy.SparkHadoopUtil
    +import org.apache.spark.sql.hive.HiveUtils
    +
    +class HiveCliSessionStateSuite extends SparkFunSuite {
    +
    +  test("CliSessionState will be reused") {
    +    val hiveConf = new HiveConf(classOf[SessionState])
    +    HiveUtils.newTemporaryConfiguration(useInMemoryDerby = false).foreach {
    +      case (key, value) => hiveConf.set(key, value)
    +    }
    +    val sessionState: SessionState = new CliSessionState(hiveConf)
    +    doTest(sessionState, true)
    +  }
    +
    +  test("SessionState will not be reused") {
    +    val hiveConf = new HiveConf(classOf[SessionState])
    +    HiveUtils.newTemporaryConfiguration(useInMemoryDerby = false).foreach {
    +      case (key, value) => hiveConf.set(key, value)
    +    }
    +    val sessionState: SessionState = new SessionState(hiveConf)
    +    doTest(sessionState, false)
    +  }
    +
    +  def doTest(state: SessionState, expected: Boolean): Unit = {
    +    SessionState.start(state)
    +    val s1 = SessionState.get
    +    val sparkConf = new SparkConf()
    +    val hadoopConf = SparkHadoopUtil.get.newConfiguration(sparkConf)
    +    val hiveClient = HiveUtils.newClientForMetadata(sparkConf, hadoopConf)
    +    assert((hiveClient.toString == s1.toString) === expected)
    --- End diff --
    
    updated


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132150474
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala ---
    @@ -136,8 +136,15 @@ class SparkHadoopUtil extends Logging {
     
       def getSecretKeyFromUserCredentials(key: String): Array[Byte] = { null }
     
    -  def loginUserFromKeytab(principalName: String, keytabFilename: String) {
    -    UserGroupInformation.loginUserFromKeytab(principalName, keytabFilename)
    +  def loginUserFromKeytab(principalName: String, keytabFilename: String): Unit = {
    +    if (!new File(keytabFilename).exists()) {
    +      throw new SparkException(s"Keytab file: ${keytabFilename}" +
    +        " specified in spark.yarn.keytab does not exist")
    --- End diff --
    
    nit: To be general, let's not mention the config name `spark.yarn.keytab` here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132156338
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala ---
    @@ -229,6 +230,17 @@ private[spark] object HiveUtils extends Logging {
         }.toMap
       }
     
    +  def isCliSessionState(): Boolean = {
    --- End diff --
    
    nit: Should add comment for this method.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132206147
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -105,107 +105,69 @@ private[hive] class HiveClientImpl(
       // Create an internal session state for this HiveClientImpl.
       val state: SessionState = {
         val original = Thread.currentThread().getContextClassLoader
    -    // Switch to the initClassLoader.
    -    Thread.currentThread().setContextClassLoader(initClassLoader)
    -
    -    // Set up kerberos credentials for UserGroupInformation.loginUser within
    -    // current class loader
    -    if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
    -      val principalName = sparkConf.get("spark.yarn.principal")
    -      val keytabFileName = sparkConf.get("spark.yarn.keytab")
    -      if (!new File(keytabFileName).exists()) {
    -        throw new SparkException(s"Keytab file: ${keytabFileName}" +
    -          " specified in spark.yarn.keytab does not exist")
    -      } else {
    -        logInfo("Attempting to login to Kerberos" +
    -          s" using principal: ${principalName} and keytab: ${keytabFileName}")
    -        UserGroupInformation.loginUserFromKeytab(principalName, keytabFileName)
    -      }
    -    }
    -
    -    def isCliSessionState(state: SessionState): Boolean = {
    -      var temp: Class[_] = if (state != null) state.getClass else null
    -      var found = false
    -      while (temp != null && !found) {
    -        found = temp.getName == "org.apache.hadoop.hive.cli.CliSessionState"
    -        temp = temp.getSuperclass
    +    if (clientLoader.isolationOn) {
    +      // Switch to the initClassLoader.
    +      Thread.currentThread().setContextClassLoader(initClassLoader)
    +      // Set up kerberos credentials for UserGroupInformation.loginUser within
    +      // current class loader
    +      if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
    +        val principal = sparkConf.get("spark.yarn.principal")
    +        val keytab = sparkConf.get("spark.yarn.keytab")
    +        SparkHadoopUtil.get.loginUserFromKeytab(principal, keytab)
           }
    -      found
    -    }
    -
    -    val ret = try {
    -      // originState will be created if not exists, will never be null
    -      val originalState = SessionState.get()
    -      if (isCliSessionState(originalState)) {
    -        // In `SparkSQLCLIDriver`, we have already started a `CliSessionState`,
    -        // which contains information like configurations from command line. Later
    -        // we call `SparkSQLEnv.init()` there, which would run into this part again.
    -        // so we should keep `conf` and reuse the existing instance of `CliSessionState`.
    -        originalState
    -      } else {
    -        val hiveConf = new HiveConf(classOf[SessionState])
    -        // 1: we set all confs in the hadoopConf to this hiveConf.
    -        // This hadoopConf contains user settings in Hadoop's core-site.xml file
    -        // and Hive's hive-site.xml file. Note, we load hive-site.xml file manually in
    -        // SharedState and put settings in this hadoopConf instead of relying on HiveConf
    -        // to load user settings. Otherwise, HiveConf's initialize method will override
    -        // settings in the hadoopConf. This issue only shows up when spark.sql.hive.metastore.jars
    -        // is not set to builtin. When spark.sql.hive.metastore.jars is builtin, the classpath
    -        // has hive-site.xml. So, HiveConf will use that to override its default values.
    -        hadoopConf.iterator().asScala.foreach { entry =>
    -          val key = entry.getKey
    -          val value = entry.getValue
    -          if (key.toLowerCase(Locale.ROOT).contains("password")) {
    -            logDebug(s"Applying Hadoop and Hive config to Hive Conf: $key=xxx")
    -          } else {
    -            logDebug(s"Applying Hadoop and Hive config to Hive Conf: $key=$value")
    -          }
    -          hiveConf.set(key, value)
    -        }
    -        // HiveConf is a Hadoop Configuration, which has a field of classLoader and
    -        // the initial value will be the current thread's context class loader
    -        // (i.e. initClassLoader at here).
    -        // We call initialConf.setClassLoader(initClassLoader) at here to make
    -        // this action explicit.
    -        hiveConf.setClassLoader(initClassLoader)
    -        // 2: we set all spark confs to this hiveConf.
    -        sparkConf.getAll.foreach { case (k, v) =>
    -          if (k.toLowerCase(Locale.ROOT).contains("password")) {
    -            logDebug(s"Applying Spark config to Hive Conf: $k=xxx")
    -          } else {
    -            logDebug(s"Applying Spark config to Hive Conf: $k=$v")
    -          }
    -          hiveConf.set(k, v)
    -        }
    -        // 3: we set all entries in config to this hiveConf.
    -        extraConfig.foreach { case (k, v) =>
    -          if (k.toLowerCase(Locale.ROOT).contains("password")) {
    -            logDebug(s"Applying extra config to HiveConf: $k=xxx")
    -          } else {
    -            logDebug(s"Applying extra config to HiveConf: $k=$v")
    -          }
    -          hiveConf.set(k, v)
    -        }
    -        val state = new SessionState(hiveConf)
    -        if (clientLoader.cachedHive != null) {
    -          Hive.set(clientLoader.cachedHive.asInstanceOf[Hive])
    -        }
    -        SessionState.start(state)
    -        state.out = new PrintStream(outputBuffer, true, "UTF-8")
    -        state.err = new PrintStream(outputBuffer, true, "UTF-8")
    -        state
    +      try {
    +        newState()
    +      } finally {
    +        Thread.currentThread().setContextClassLoader(original)
           }
    -    } finally {
    -      Thread.currentThread().setContextClassLoader(original)
    +    } else {
    +      Option(SessionState.get()).getOrElse(newState())
    --- End diff --
    
    In the condition I mentioned above, I think this should be kept


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132203185
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -105,107 +105,69 @@ private[hive] class HiveClientImpl(
       // Create an internal session state for this HiveClientImpl.
       val state: SessionState = {
         val original = Thread.currentThread().getContextClassLoader
    -    // Switch to the initClassLoader.
    -    Thread.currentThread().setContextClassLoader(initClassLoader)
    -
    -    // Set up kerberos credentials for UserGroupInformation.loginUser within
    -    // current class loader
    -    if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
    -      val principalName = sparkConf.get("spark.yarn.principal")
    -      val keytabFileName = sparkConf.get("spark.yarn.keytab")
    -      if (!new File(keytabFileName).exists()) {
    -        throw new SparkException(s"Keytab file: ${keytabFileName}" +
    -          " specified in spark.yarn.keytab does not exist")
    -      } else {
    -        logInfo("Attempting to login to Kerberos" +
    -          s" using principal: ${principalName} and keytab: ${keytabFileName}")
    -        UserGroupInformation.loginUserFromKeytab(principalName, keytabFileName)
    -      }
    -    }
    -
    -    def isCliSessionState(state: SessionState): Boolean = {
    -      var temp: Class[_] = if (state != null) state.getClass else null
    -      var found = false
    -      while (temp != null && !found) {
    -        found = temp.getName == "org.apache.hadoop.hive.cli.CliSessionState"
    -        temp = temp.getSuperclass
    +    if (clientLoader.isolationOn) {
    +      // Switch to the initClassLoader.
    +      Thread.currentThread().setContextClassLoader(initClassLoader)
    +      // Set up kerberos credentials for UserGroupInformation.loginUser within
    +      // current class loader
    +      if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
    +        val principal = sparkConf.get("spark.yarn.principal")
    +        val keytab = sparkConf.get("spark.yarn.keytab")
    +        SparkHadoopUtil.get.loginUserFromKeytab(principal, keytab)
           }
    -      found
    -    }
    -
    -    val ret = try {
    -      // originState will be created if not exists, will never be null
    -      val originalState = SessionState.get()
    -      if (isCliSessionState(originalState)) {
    -        // In `SparkSQLCLIDriver`, we have already started a `CliSessionState`,
    -        // which contains information like configurations from command line. Later
    -        // we call `SparkSQLEnv.init()` there, which would run into this part again.
    -        // so we should keep `conf` and reuse the existing instance of `CliSessionState`.
    -        originalState
    -      } else {
    -        val hiveConf = new HiveConf(classOf[SessionState])
    -        // 1: we set all confs in the hadoopConf to this hiveConf.
    -        // This hadoopConf contains user settings in Hadoop's core-site.xml file
    -        // and Hive's hive-site.xml file. Note, we load hive-site.xml file manually in
    -        // SharedState and put settings in this hadoopConf instead of relying on HiveConf
    -        // to load user settings. Otherwise, HiveConf's initialize method will override
    -        // settings in the hadoopConf. This issue only shows up when spark.sql.hive.metastore.jars
    -        // is not set to builtin. When spark.sql.hive.metastore.jars is builtin, the classpath
    -        // has hive-site.xml. So, HiveConf will use that to override its default values.
    -        hadoopConf.iterator().asScala.foreach { entry =>
    -          val key = entry.getKey
    -          val value = entry.getValue
    -          if (key.toLowerCase(Locale.ROOT).contains("password")) {
    -            logDebug(s"Applying Hadoop and Hive config to Hive Conf: $key=xxx")
    -          } else {
    -            logDebug(s"Applying Hadoop and Hive config to Hive Conf: $key=$value")
    -          }
    -          hiveConf.set(key, value)
    -        }
    -        // HiveConf is a Hadoop Configuration, which has a field of classLoader and
    -        // the initial value will be the current thread's context class loader
    -        // (i.e. initClassLoader at here).
    -        // We call initialConf.setClassLoader(initClassLoader) at here to make
    -        // this action explicit.
    -        hiveConf.setClassLoader(initClassLoader)
    -        // 2: we set all spark confs to this hiveConf.
    -        sparkConf.getAll.foreach { case (k, v) =>
    -          if (k.toLowerCase(Locale.ROOT).contains("password")) {
    -            logDebug(s"Applying Spark config to Hive Conf: $k=xxx")
    -          } else {
    -            logDebug(s"Applying Spark config to Hive Conf: $k=$v")
    -          }
    -          hiveConf.set(k, v)
    -        }
    -        // 3: we set all entries in config to this hiveConf.
    -        extraConfig.foreach { case (k, v) =>
    -          if (k.toLowerCase(Locale.ROOT).contains("password")) {
    -            logDebug(s"Applying extra config to HiveConf: $k=xxx")
    -          } else {
    -            logDebug(s"Applying extra config to HiveConf: $k=$v")
    -          }
    -          hiveConf.set(k, v)
    -        }
    -        val state = new SessionState(hiveConf)
    -        if (clientLoader.cachedHive != null) {
    -          Hive.set(clientLoader.cachedHive.asInstanceOf[Hive])
    -        }
    -        SessionState.start(state)
    -        state.out = new PrintStream(outputBuffer, true, "UTF-8")
    -        state.err = new PrintStream(outputBuffer, true, "UTF-8")
    -        state
    +      try {
    +        newState()
    +      } finally {
    +        Thread.currentThread().setContextClassLoader(original)
           }
    -    } finally {
    -      Thread.currentThread().setContextClassLoader(original)
    +    } else {
    +      Option(SessionState.get()).getOrElse(newState())
    --- End diff --
    
    Since `SessionState.get()` won't be None here, we can simplify the code to `SessionState.get()`, and add comment above to issue the reason of doing this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80442/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    **[Test build #80458 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80458/testReport)** for PR 18648 at commit [`c50a32f`](https://github.com/apache/spark/commit/c50a32f0476d27ad7aed6e972ba9b6ffb5263aea).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    **[Test build #80442 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80442/testReport)** for PR 18648 at commit [`6c0bf70`](https://github.com/apache/spark/commit/6c0bf709f95dec3ee3bce3e905e51f31fc5b0e64).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80458/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Set IsolatedClientLoader off while using b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    LGTM, merging to master!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132189055
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -105,107 +105,69 @@ private[hive] class HiveClientImpl(
       // Create an internal session state for this HiveClientImpl.
       val state: SessionState = {
         val original = Thread.currentThread().getContextClassLoader
    -    // Switch to the initClassLoader.
    -    Thread.currentThread().setContextClassLoader(initClassLoader)
    -
    -    // Set up kerberos credentials for UserGroupInformation.loginUser within
    -    // current class loader
    -    if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
    -      val principalName = sparkConf.get("spark.yarn.principal")
    -      val keytabFileName = sparkConf.get("spark.yarn.keytab")
    -      if (!new File(keytabFileName).exists()) {
    -        throw new SparkException(s"Keytab file: ${keytabFileName}" +
    -          " specified in spark.yarn.keytab does not exist")
    -      } else {
    -        logInfo("Attempting to login to Kerberos" +
    -          s" using principal: ${principalName} and keytab: ${keytabFileName}")
    -        UserGroupInformation.loginUserFromKeytab(principalName, keytabFileName)
    -      }
    -    }
    -
    -    def isCliSessionState(state: SessionState): Boolean = {
    -      var temp: Class[_] = if (state != null) state.getClass else null
    -      var found = false
    -      while (temp != null && !found) {
    -        found = temp.getName == "org.apache.hadoop.hive.cli.CliSessionState"
    -        temp = temp.getSuperclass
    +    if (clientLoader.isolationOn) {
    +      // Switch to the initClassLoader.
    +      Thread.currentThread().setContextClassLoader(initClassLoader)
    +      // Set up kerberos credentials for UserGroupInformation.loginUser within
    +      // current class loader
    +      if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
    +        val principal = sparkConf.get("spark.yarn.principal")
    +        val keytab = sparkConf.get("spark.yarn.keytab")
    +        SparkHadoopUtil.get.loginUserFromKeytab(principal, keytab)
           }
    -      found
    -    }
    -
    -    val ret = try {
    -      // originState will be created if not exists, will never be null
    -      val originalState = SessionState.get()
    -      if (isCliSessionState(originalState)) {
    -        // In `SparkSQLCLIDriver`, we have already started a `CliSessionState`,
    -        // which contains information like configurations from command line. Later
    -        // we call `SparkSQLEnv.init()` there, which would run into this part again.
    -        // so we should keep `conf` and reuse the existing instance of `CliSessionState`.
    -        originalState
    -      } else {
    -        val hiveConf = new HiveConf(classOf[SessionState])
    -        // 1: we set all confs in the hadoopConf to this hiveConf.
    -        // This hadoopConf contains user settings in Hadoop's core-site.xml file
    -        // and Hive's hive-site.xml file. Note, we load hive-site.xml file manually in
    -        // SharedState and put settings in this hadoopConf instead of relying on HiveConf
    -        // to load user settings. Otherwise, HiveConf's initialize method will override
    -        // settings in the hadoopConf. This issue only shows up when spark.sql.hive.metastore.jars
    -        // is not set to builtin. When spark.sql.hive.metastore.jars is builtin, the classpath
    -        // has hive-site.xml. So, HiveConf will use that to override its default values.
    -        hadoopConf.iterator().asScala.foreach { entry =>
    -          val key = entry.getKey
    -          val value = entry.getValue
    -          if (key.toLowerCase(Locale.ROOT).contains("password")) {
    -            logDebug(s"Applying Hadoop and Hive config to Hive Conf: $key=xxx")
    -          } else {
    -            logDebug(s"Applying Hadoop and Hive config to Hive Conf: $key=$value")
    -          }
    -          hiveConf.set(key, value)
    -        }
    -        // HiveConf is a Hadoop Configuration, which has a field of classLoader and
    -        // the initial value will be the current thread's context class loader
    -        // (i.e. initClassLoader at here).
    -        // We call initialConf.setClassLoader(initClassLoader) at here to make
    -        // this action explicit.
    -        hiveConf.setClassLoader(initClassLoader)
    -        // 2: we set all spark confs to this hiveConf.
    -        sparkConf.getAll.foreach { case (k, v) =>
    -          if (k.toLowerCase(Locale.ROOT).contains("password")) {
    -            logDebug(s"Applying Spark config to Hive Conf: $k=xxx")
    -          } else {
    -            logDebug(s"Applying Spark config to Hive Conf: $k=$v")
    -          }
    -          hiveConf.set(k, v)
    -        }
    -        // 3: we set all entries in config to this hiveConf.
    -        extraConfig.foreach { case (k, v) =>
    -          if (k.toLowerCase(Locale.ROOT).contains("password")) {
    -            logDebug(s"Applying extra config to HiveConf: $k=xxx")
    -          } else {
    -            logDebug(s"Applying extra config to HiveConf: $k=$v")
    -          }
    -          hiveConf.set(k, v)
    -        }
    -        val state = new SessionState(hiveConf)
    -        if (clientLoader.cachedHive != null) {
    -          Hive.set(clientLoader.cachedHive.asInstanceOf[Hive])
    -        }
    -        SessionState.start(state)
    -        state.out = new PrintStream(outputBuffer, true, "UTF-8")
    -        state.err = new PrintStream(outputBuffer, true, "UTF-8")
    -        state
    +      try {
    +        newState()
    +      } finally {
    +        Thread.currentThread().setContextClassLoader(original)
           }
    -    } finally {
    -      Thread.currentThread().setContextClassLoader(original)
    +    } else {
    +      Option(SessionState.get()).getOrElse(newState())
         }
    -    ret
       }
     
       // Log the default warehouse location.
       logInfo(
         s"Warehouse location for Hive client " +
           s"(version ${version.fullVersion}) is ${conf.get("hive.metastore.warehouse.dir")}")
     
    +  private def newState(): SessionState = {
    +    val hiveConf = new HiveConf(classOf[SessionState])
    +    // HiveConf is a Hadoop Configuration, which has a field of classLoader and
    +    // the initial value will be the current thread's context class loader
    +    // (i.e. initClassLoader at here).
    +    // We call initialConf.setClassLoader(initClassLoader) at here to make
    +    // this action explicit.
    +    hiveConf.setClassLoader(initClassLoader)
    +
    +    // 1: Take all from the hadoopConf to this hiveConf.
    +    // This hadoopConf contains user settings in Hadoop's core-site.xml file
    +    // and Hive's hive-site.xml file. Note, we load hive-site.xml file manually in
    +    // SharedState and put settings in this hadoopConf instead of relying on HiveConf
    +    // to load user settings. Otherwise, HiveConf's initialize method will override
    +    // settings in the hadoopConf. This issue only shows up when spark.sql.hive.metastore.jars
    +    // is not set to builtin. When spark.sql.hive.metastore.jars is builtin, the classpath
    +    // has hive-site.xml. So, HiveConf will use that to override its default values.
    +    // 2: we set all spark confs to this hiveConf.
    +    // 3: we set all entries in config to this hiveConf.
    +    (hadoopConf.iterator().asScala.map(kv => kv.getKey -> kv.getValue)
    +      ++ sparkConf.getAll.toMap ++ extraConfig).foreach { case (k, v) =>
    +      if (k.toLowerCase(Locale.ROOT).contains("password")) {
    +        logDebug(s"Applying Spark config to Hive Conf: $k=xxx")
    --- End diff --
    
    ok, thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132093341
  
    --- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveCliSessionStateSuite.scala ---
    @@ -0,0 +1,56 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.hive.thriftserver
    +
    +import org.apache.hadoop.hive.cli.CliSessionState
    +import org.apache.hadoop.hive.conf.HiveConf
    +import org.apache.hadoop.hive.ql.session.SessionState
    +
    +import org.apache.spark.{SparkConf, SparkFunSuite}
    +import org.apache.spark.deploy.SparkHadoopUtil
    +import org.apache.spark.sql.hive.HiveUtils
    +
    +class HiveCliSessionStateSuite extends SparkFunSuite {
    +
    +  test("CliSessionState will be reused") {
    +    val hiveConf = new HiveConf(classOf[SessionState])
    +    HiveUtils.newTemporaryConfiguration(useInMemoryDerby = false).foreach {
    +      case (key, value) => hiveConf.set(key, value)
    +    }
    +    val sessionState: SessionState = new CliSessionState(hiveConf)
    +    SessionState.start(sessionState)
    +    val s1 = SessionState.get
    +    val sparkConf = new SparkConf()
    +    val hadoopConf = SparkHadoopUtil.get.newConfiguration(sparkConf)
    +    val s2 = HiveUtils.newClientForMetadata(sparkConf, hadoopConf).getState
    --- End diff --
    
    weird, `HiveClientImpl` is the only implementation of the `HiveClient` interface.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r131830510
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -269,6 +232,8 @@ private[hive] class HiveClientImpl(
         }
       }
     
    +  override def toString: String = state.toString
    --- End diff --
    
    yup


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by jiangxb1987 <gi...@git.apache.org>.
Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132157043
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -105,107 +105,69 @@ private[hive] class HiveClientImpl(
       // Create an internal session state for this HiveClientImpl.
       val state: SessionState = {
         val original = Thread.currentThread().getContextClassLoader
    -    // Switch to the initClassLoader.
    -    Thread.currentThread().setContextClassLoader(initClassLoader)
    -
    -    // Set up kerberos credentials for UserGroupInformation.loginUser within
    -    // current class loader
    -    if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
    -      val principalName = sparkConf.get("spark.yarn.principal")
    -      val keytabFileName = sparkConf.get("spark.yarn.keytab")
    -      if (!new File(keytabFileName).exists()) {
    -        throw new SparkException(s"Keytab file: ${keytabFileName}" +
    -          " specified in spark.yarn.keytab does not exist")
    -      } else {
    -        logInfo("Attempting to login to Kerberos" +
    -          s" using principal: ${principalName} and keytab: ${keytabFileName}")
    -        UserGroupInformation.loginUserFromKeytab(principalName, keytabFileName)
    -      }
    -    }
    -
    -    def isCliSessionState(state: SessionState): Boolean = {
    -      var temp: Class[_] = if (state != null) state.getClass else null
    -      var found = false
    -      while (temp != null && !found) {
    -        found = temp.getName == "org.apache.hadoop.hive.cli.CliSessionState"
    -        temp = temp.getSuperclass
    +    if (clientLoader.isolationOn) {
    +      // Switch to the initClassLoader.
    +      Thread.currentThread().setContextClassLoader(initClassLoader)
    --- End diff --
    
    Is the behavior change safe here? Previously, we switch the context ClassLoader for both conditions, while in this PR we only do that if `isolationOn` is true.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r131814407
  
    --- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveCliSessionStateSuite.scala ---
    @@ -0,0 +1,57 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.hive.thriftserver
    +
    +import org.apache.hadoop.hive.cli.CliSessionState
    +import org.apache.hadoop.hive.conf.HiveConf
    +import org.apache.hadoop.hive.ql.session.SessionState
    +
    +import org.apache.spark.{SparkConf, SparkFunSuite}
    +import org.apache.spark.deploy.SparkHadoopUtil
    +import org.apache.spark.sql.hive.HiveUtils
    +
    +class HiveCliSessionStateSuite extends SparkFunSuite {
    +
    +  test("CliSessionState will be reused") {
    +    val hiveConf = new HiveConf(classOf[SessionState])
    +    HiveUtils.newTemporaryConfiguration(useInMemoryDerby = false).foreach {
    +      case (key, value) => hiveConf.set(key, value)
    +    }
    +    val sessionState: SessionState = new CliSessionState(hiveConf)
    +    doTest(sessionState, true)
    +  }
    +
    +  test("SessionState will not be reused") {
    +    val hiveConf = new HiveConf(classOf[SessionState])
    +    HiveUtils.newTemporaryConfiguration(useInMemoryDerby = false).foreach {
    +      case (key, value) => hiveConf.set(key, value)
    +    }
    +    val sessionState: SessionState = new SessionState(hiveConf)
    +    doTest(sessionState, false)
    +  }
    +
    +  def doTest(state: SessionState, expected: Boolean): Unit = {
    +    SessionState.start(state)
    +    val s1 = SessionState.get
    +    val sparkConf = new SparkConf()
    +    val hadoopConf = SparkHadoopUtil.get.newConfiguration(sparkConf)
    +    val hiveClient = HiveUtils.newClientForMetadata(sparkConf, hadoopConf)
    +    assert((hiveClient.toString == s1.toString) === expected)
    --- End diff --
    
    is it safe to just compare `toString` result?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    **[Test build #80442 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80442/testReport)** for PR 18648 at commit [`6c0bf70`](https://github.com/apache/spark/commit/6c0bf709f95dec3ee3bce3e905e51f31fc5b0e64).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    ping @cloud-fan would you take another look?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    **[Test build #80460 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80460/testReport)** for PR 18648 at commit [`c6ed2d7`](https://github.com/apache/spark/commit/c6ed2d7d0c5a8f3dc1e09f0a7d9991ede4a60a76).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18648: [SPARK-21428] Turn IsolatedClientLoader off while...

Posted by yaooqinn <gi...@git.apache.org>.
Github user yaooqinn commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18648#discussion_r132188293
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ---
    @@ -105,107 +105,69 @@ private[hive] class HiveClientImpl(
       // Create an internal session state for this HiveClientImpl.
       val state: SessionState = {
         val original = Thread.currentThread().getContextClassLoader
    -    // Switch to the initClassLoader.
    -    Thread.currentThread().setContextClassLoader(initClassLoader)
    -
    -    // Set up kerberos credentials for UserGroupInformation.loginUser within
    -    // current class loader
    -    if (sparkConf.contains("spark.yarn.principal") && sparkConf.contains("spark.yarn.keytab")) {
    -      val principalName = sparkConf.get("spark.yarn.principal")
    -      val keytabFileName = sparkConf.get("spark.yarn.keytab")
    -      if (!new File(keytabFileName).exists()) {
    -        throw new SparkException(s"Keytab file: ${keytabFileName}" +
    -          " specified in spark.yarn.keytab does not exist")
    -      } else {
    -        logInfo("Attempting to login to Kerberos" +
    -          s" using principal: ${principalName} and keytab: ${keytabFileName}")
    -        UserGroupInformation.loginUserFromKeytab(principalName, keytabFileName)
    -      }
    -    }
    -
    -    def isCliSessionState(state: SessionState): Boolean = {
    -      var temp: Class[_] = if (state != null) state.getClass else null
    -      var found = false
    -      while (temp != null && !found) {
    -        found = temp.getName == "org.apache.hadoop.hive.cli.CliSessionState"
    -        temp = temp.getSuperclass
    +    if (clientLoader.isolationOn) {
    +      // Switch to the initClassLoader.
    +      Thread.currentThread().setContextClassLoader(initClassLoader)
    --- End diff --
    
    when isolation Off, we just switch a classloader to itself


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18648
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80448/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org