Posted to issues@spark.apache.org by "Monica Raj (JIRA)" <ji...@apache.org> on 2017/06/20 20:39:00 UTC

[jira] [Created] (SPARK-21156) Spark token renewal is not compatible with multiple Ranger KMS server configuration

Monica Raj created SPARK-21156:
----------------------------------

             Summary: Spark token renewal is not compatible with multiple Ranger KMS server configuration
                 Key: SPARK-21156
                 URL: https://issues.apache.org/jira/browse/SPARK-21156
             Project: Spark
          Issue Type: Bug
          Components: Spark Shell, Spark Submit, YARN
    Affects Versions: 2.1.1, 1.6.1
            Reporter: Monica Raj


The *dfs.encryption.key.provider.uri* config parameter present in the *hdfs-site.xml* file can have one or more key servers in the value. The syntax for this value is:
<name>dfs.encryption.key.provider.uri</name>
<value>kms://http@<internal host name1>;<internal host name2>;...:9292/kms</value>

as per documentation: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_security/content/ranger_kms_multi_kms.html
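
To illustrate the syntax above, here is a minimal sketch in Java of how such a value could be expanded into one URL per KMS host. The class name KmsUriSketch and the method expand are hypothetical and are not part of Hadoop or Spark. The host names are the ones that appear in the stack traces in this report, and the sketch assumes that all hosts share the single port and path given at the end of the value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not Hadoop or Spark code.
final class KmsUriSketch {

    // Expands kms://<scheme>@<host1>;<host2>;...:<port>/<path>
    // into one <scheme>://<host>:<port>/<path> URL per host.
    static List<String> expand(String providerUri) {
        String prefix = "kms://";
        if (!providerUri.startsWith(prefix)) {
            throw new IllegalArgumentException("not a kms:// URI: " + providerUri);
        }
        String rest = providerUri.substring(prefix.length());
        int at = rest.indexOf('@');
        if (at < 0) {
            throw new IllegalArgumentException("missing '@' in: " + providerUri);
        }
        String scheme = rest.substring(0, at);            // e.g. "http"
        String authorityAndPath = rest.substring(at + 1); // e.g. "h1;h2:9292/kms"

        // Split the authority (hosts and port) from the path.
        int slash = authorityAndPath.indexOf('/');
        String authority = slash < 0 ? authorityAndPath : authorityAndPath.substring(0, slash);
        String path = slash < 0 ? "" : authorityAndPath.substring(slash);

        // Assumption: one port, written once after the last host.
        int colon = authority.lastIndexOf(':');
        String hosts = colon < 0 ? authority : authority.substring(0, colon);
        String portSuffix = colon < 0 ? "" : authority.substring(colon);

        List<String> urls = new ArrayList<>();
        for (String host : hosts.split(";")) {
            if (!host.isEmpty()) {
                urls.add(scheme + "://" + host + portSuffix + path);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String value = "kms://http@mas1.multikms.com;mas2.multikms.com:9292/kms";
        for (String url : expand(value)) {
            System.out.println(url);
        }
    }
}
```

The point of the sketch is only that the semicolon separates hosts, so the host list has to be split before any single host name is resolved.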

If multiple KMS servers are configured in the above field AND the following Spark config values are specified:
*spark.yarn.principal*
*spark.yarn.keytab*

then it is not possible to create a Spark context. The multi-server syntax is not parsed correctly: the semicolon-separated host list is treated as a single host name, which fails with an UnknownHostException. Below is the stack trace for this error, seen both via Spark shell and via Zeppelin.

*Error via Spark Shell*
17/06/16 22:02:11 ERROR SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: java.net.UnknownHostException: mas1.multikms.com;mas2.multikms.com
	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationTokenService(KMSClientProvider.java:804)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:779)
	at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86)
	at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2046)
	at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:131)
	at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:128)
	at scala.collection.immutable.Set$Set1.foreach(Set.scala:74)
	at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.obtainTokensForNamenodes(YarnSparkHadoopUtil.scala:128)
	at org.apache.spark.deploy.yarn.Client.getTokenRenewalInterval(Client.scala:593)
	at org.apache.spark.deploy.yarn.Client.setupLaunchEnv(Client.scala:626)
	at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:726)
	at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
	at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:530)
	at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:1017)
	at $line3.$read$$iwC$$iwC.<init>(<console>:15)
	at $line3.$read$$iwC.<init>(<console>:24)
	at $line3.$read.<init>(<console>:26)
	at $line3.$read$.<init>(<console>:30)
	at $line3.$read$.<clinit>(<console>)
	at $line3.$eval$.<init>(<console>:7)
	at $line3.$eval$.<clinit>(<console>)
	at $line3.$eval.$print(<console>)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
	at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
	at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
	at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
	at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
	at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
	at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:125)
	at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
	at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
	at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
	at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
	at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
	at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
	at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
	at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
	at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
	at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
	at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
	at org.apache.spark.repl.Main$.main(Main.scala:31)
	at org.apache.spark.repl.Main.main(Main.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.UnknownHostException: mas1.multikms.com;mas2.multikms.com
	... 64 more


*Error via Zeppelin*
ERROR [2017-06-19 23:24:49,508] ({pool-2-thread-2} Logging.scala[logError]:95) - Error initializing SparkContext.
java.lang.IllegalArgumentException: java.net.UnknownHostException: mas1.multikms.com;mas2.multikms.com
	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationTokenService(KMSClientProvider.java:804)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:779)
	at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86)
	at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2046)
	at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:131)
	at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:128)
	at scala.collection.immutable.Set$Set1.foreach(Set.scala:74)
	at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.obtainTokensForNamenodes(YarnSparkHadoopUtil.scala:128)
	at org.apache.spark.deploy.yarn.Client.getTokenRenewalInterval(Client.scala:593)
	at org.apache.spark.deploy.yarn.Client.setupLaunchEnv(Client.scala:626)
	at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:726)
	at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
	at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:530)
	at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:338)
	at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:122)
	at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:513)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
	at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: mas1.multikms.com;mas2.multikms.com
	... 31 more
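
Both traces report the whole string "mas1.multikms.com;mas2.multikms.com" as the unknown host. The following sketch in Java shows one way such a string can arise: if the host list is placed into an ordinary URL without being split, java.net.URL hands back the entire list as a single host. This is an assumption about the cause, inferred from the exception message, and it is not the Hadoop code path itself. The class MultiHostUrlSketch and its methods are hypothetical names used for illustration.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical illustration only; not Hadoop or Spark code.
final class MultiHostUrlSketch {

    // Parses the string with java.net.URL, wrapping the checked exception.
    @SuppressWarnings("deprecation") // URL(String) is deprecated in recent JDKs
    private static URL parse(String url) {
        try {
            return new URL(url);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Host component as seen by java.net.URL.
    static String hostOf(String url) {
        return parse(url).getHost();
    }

    // Port component as seen by java.net.URL.
    static int portOf(String url) {
        return parse(url).getPort();
    }

    public static void main(String[] args) {
        // The unsplit host list from this report, placed into an http URL.
        String url = "http://mas1.multikms.com;mas2.multikms.com:9292/kms";
        System.out.println("host = " + hostOf(url));
        System.out.println("port = " + portOf(url));
    }
}
```

A host string that still contains the semicolon cannot be resolved as one name, which is consistent with the UnknownHostException in the traces above.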



