Posted to issues@spark.apache.org by "Chester (JIRA)" <ji...@apache.org> on 2016/01/13 08:34:39 UTC

[jira] [Created] (SPARK-12800) Subtle bug on Spark Yarn Client under Kerberos Security Mode

Chester created SPARK-12800:
-------------------------------

             Summary: Subtle bug on Spark Yarn Client under Kerberos Security Mode
                 Key: SPARK-12800
                 URL: https://issues.apache.org/jira/browse/SPARK-12800
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.5.2, 1.5.1
            Reporter: Chester


Version used: Spark 1.5.1 (1.5.2-SNAPSHOT) 
Deployment Mode: Yarn-Cluster
Problem observed: 
  When running a Spark job directly through the YARN Client class (not via spark-submit; I have not verified whether spark-submit has the same issue) with Kerberos security enabled, the first run of the job in a fresh JVM always fails. The failure is due to Hadoop treating the job as being in SIMPLE authentication mode rather than Kerberos mode. However, running the same job again without shutting down the JVM succeeds; after restarting the JVM, the first run fails again.
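
For context, a rough sketch of the kind of direct submission involved (class and constructor names follow org.apache.spark.deploy.yarn in Spark 1.5; Client and ClientArguments are private[spark] in stock builds, and buildSecureHadoopConf() is a hypothetical helper standing in for however the caller assembles its Kerberos-aware Configuration):

import org.apache.hadoop.conf.Configuration
import org.apache.spark.SparkConf
import org.apache.spark.deploy.yarn.{Client, ClientArguments}

// hadoopConf already carries the cluster's Kerberos settings
// (hadoop.security.authentication=kerberos, principals, etc.).
val hadoopConf: Configuration = buildSecureHadoopConf()  // hypothetical helper
val sparkConf = new SparkConf()
val client = new Client(new ClientArguments(args, sparkConf), hadoopConf, sparkConf)
client.run()  // first run in a fresh JVM fails with SIMPLE-auth errors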

The cause: 
  Tracking down the source of the issue, I found that the problem seems to lie in Spark's YARN Client.scala. In the Client's prepareLocalResources() method (around line 266 of Client.scala), the following line of code is called:

 YarnSparkHadoopUtil.get.obtainTokensForNamenodes(nns, hadoopConf, credentials)
    
YarnSparkHadoopUtil.get is in turn initialized via reflection:


object SparkHadoopUtil {

  private val hadoop = {
    val yarnMode = java.lang.Boolean.valueOf(
        System.getProperty("SPARK_YARN_MODE", System.getenv("SPARK_YARN_MODE")))
    if (yarnMode) {
      try {
        // Reflectively instantiates YarnSparkHadoopUtil; constructing it runs
        // the SparkHadoopUtil constructor shown below.
        Utils.classForName("org.apache.spark.deploy.yarn.YarnSparkHadoopUtil")
          .newInstance()
          .asInstanceOf[SparkHadoopUtil]
      } catch {
        case e: Exception => throw new SparkException("Unable to load YARN support", e)
      }
    } else {
      new SparkHadoopUtil
    }
  }

  def get: SparkHadoopUtil = {
    hadoop
  }
}

 

class SparkHadoopUtil extends Logging {
  // Builds a brand-new Hadoop Configuration from an empty SparkConf ...
  private val sparkConf = new SparkConf()
  val conf: Configuration = newConfiguration(sparkConf)
  // ... and installs it globally, replacing whatever UserGroupInformation already held.
  UserGroupInformation.setConfiguration(conf)

  // ... rest of the class omitted
}

Here SparkHadoopUtil creates an empty SparkConf, builds a Hadoop Configuration from it, and installs that Configuration on UserGroupInformation via

  UserGroupInformation.setConfiguration(conf)


  Since UserGroupInformation's authentication method is static state, the call above wipes out the security settings: UserGroupInformation.isSecurityEnabled() changes from true to false, and the subsequent token-related calls fail.
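
To make the flip concrete, here is a small illustration using the standard Hadoop UserGroupInformation API (the surrounding setup is only a sketch, not code taken from Spark):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

// A configuration that explicitly enables Kerberos, as the caller's hadoopConf would.
val secureConf = new Configuration(false)
secureConf.set("hadoop.security.authentication", "kerberos")
UserGroupInformation.setConfiguration(secureConf)
println(UserGroupInformation.isSecurityEnabled())  // true

// What SparkHadoopUtil's constructor effectively does: a freshly built
// Configuration that never saw the Kerberos setting defaults to "simple"
// authentication, and installing it overwrites the static state.
UserGroupInformation.setConfiguration(new Configuration(false))
println(UserGroupInformation.isSecurityEnabled())  // false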

 Since SparkHadoopUtil.hadoop is a static, immutable val, it is not created again on the next run; UserGroupInformation.setConfiguration(conf) is therefore not called again, and subsequent Spark jobs in the same JVM succeed.

The work around: 
        // First initialize SparkHadoopUtil, which creates the static instance
        // and sets UserGroupInformation to an empty Hadoop Configuration.
        // We then need to reset UserGroupInformation with the real configuration.
        val util = SparkHadoopUtil.get
        UserGroupInformation.setConfiguration(hadoopConf)

      Then call

          client.run()
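
Putting the workaround in the context of the direct submission sketched earlier, a minimal end-to-end version could look like this (again assuming access to the private[spark] Client and ClientArguments constructors, and a hadoopConf that already carries the Kerberos settings):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation
import org.apache.spark.SparkConf
import org.apache.spark.deploy.SparkHadoopUtil
import org.apache.spark.deploy.yarn.{Client, ClientArguments}

def submit(args: Array[String], hadoopConf: Configuration): Unit = {
  val sparkConf = new SparkConf()

  // Force the static SparkHadoopUtil instance to be created now, so that its
  // UserGroupInformation.setConfiguration(emptyConf) call happens here ...
  SparkHadoopUtil.get

  // ... and the secure configuration is restored before any token/credential work.
  UserGroupInformation.setConfiguration(hadoopConf)

  new Client(new ClientArguments(args, sparkConf), hadoopConf, sparkConf).run()
}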

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org