Posted to issues@spark.apache.org by "Monica Raj (JIRA)" <ji...@apache.org> on 2017/06/20 21:49:00 UTC

[jira] [Commented] (SPARK-21156) Spark cannot handle multiple KMS server configuration

    [ https://issues.apache.org/jira/browse/SPARK-21156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056524#comment-16056524 ] 

Monica Raj commented on SPARK-21156:
------------------------------------

This issue has been seen with Spark 1.6.1 and Ranger KMS from Hortonworks, and also with Spark 2.1.1 and Key Trustee KMS from Cloudera.

> Spark cannot handle multiple KMS server configuration
> -----------------------------------------------------
>
>                 Key: SPARK-21156
>                 URL: https://issues.apache.org/jira/browse/SPARK-21156
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell, Spark Submit, YARN
>    Affects Versions: 1.6.1, 2.1.1
>            Reporter: Monica Raj
>
> The *dfs.encryption.key.provider.uri* config parameter in *hdfs-site.xml* can list one or more key servers in its value. Per the documentation at https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_security/content/ranger_kms_multi_kms.html, the syntax is:
> {code}
> <property>
>   <name>dfs.encryption.key.provider.uri</name>
>   <value>kms://http@<internal host name1>;<internal host name2>;...:9292/kms</value>
> </property>
> {code}
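> Concretely, with the two KMS hosts that appear in the stack traces below, the value becomes a single semicolon-separated authority:
> {code}
> kms://http@mas1.multikms.com;mas2.multikms.com:9292/kms
> {code}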
> If multiple KMS servers are configured in the above field AND the following Spark config values are specified:
> *spark.yarn.principal*
> *spark.yarn.keytab*
> then it is not possible to create a SparkContext: the semicolon-separated KMS server list fails to parse while delegation tokens are being obtained. A reproduction sketch follows, then the stack traces for the same error as seen via the Spark shell and via Zeppelin; a sketch of the underlying parse failure appears after the traces.
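> Minimal reproduction sketch (assumed invocation; the principal and keytab path are illustrative, not from this report):
> {code}
> # On a cluster whose hdfs-site.xml lists multiple KMS hosts as above,
> # any YARN SparkContext creation triggers delegation token acquisition:
> spark-shell --master yarn \
>   --conf spark.yarn.principal=user@EXAMPLE.COM \
>   --conf spark.yarn.keytab=/etc/security/keytabs/user.keytab
> {code}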
> *Error via Spark Shell*
> 17/06/16 22:02:11 ERROR SparkContext: Error initializing SparkContext.
> java.lang.IllegalArgumentException: java.net.UnknownHostException: mas1.multikms.com;mas2.multikms.com
> 	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
> 	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationTokenService(KMSClientProvider.java:804)
> 	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:779)
> 	at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2046)
> 	at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:131)
> 	at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:128)
> 	at scala.collection.immutable.Set$Set1.foreach(Set.scala:74)
> 	at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.obtainTokensForNamenodes(YarnSparkHadoopUtil.scala:128)
> 	at org.apache.spark.deploy.yarn.Client.getTokenRenewalInterval(Client.scala:593)
> 	at org.apache.spark.deploy.yarn.Client.setupLaunchEnv(Client.scala:626)
> 	at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:726)
> 	at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
> 	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
> 	at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
> 	at org.apache.spark.SparkContext.<init>(SparkContext.scala:530)
> 	at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:1017)
> 	at $line3.$read$$iwC$$iwC.<init>(<console>:15)
> 	at $line3.$read$$iwC.<init>(<console>:24)
> 	at $line3.$read.<init>(<console>:26)
> 	at $line3.$read$.<init>(<console>:30)
> 	at $line3.$read$.<clinit>(<console>)
> 	at $line3.$eval$.<init>(<console>:7)
> 	at $line3.$eval$.<clinit>(<console>)
> 	at $line3.$eval.$print(<console>)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
> 	at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
> 	at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
> 	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
> 	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
> 	at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
> 	at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
> 	at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
> 	at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:125)
> 	at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:124)
> 	at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:324)
> 	at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:124)
> 	at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:64)
> 	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:974)
> 	at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:159)
> 	at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:64)
> 	at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:108)
> 	at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:64)
> 	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:991)
> 	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
> 	at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
> 	at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
> 	at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
> 	at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
> 	at org.apache.spark.repl.Main$.main(Main.scala:31)
> 	at org.apache.spark.repl.Main.main(Main.scala)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> 	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> 	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.net.UnknownHostException: mas1.multikms.com;mas2.multikms.com
> 	... 64 more
> *Error via Zeppelin*
> ERROR [2017-06-19 23:24:49,508] ({pool-2-thread-2} Logging.scala[logError]:95) - Error initializing SparkContext.
> java.lang.IllegalArgumentException: java.net.UnknownHostException: mas1.multikms.com;mas2.multikms.com
> 	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
> 	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationTokenService(KMSClientProvider.java:804)
> 	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:779)
> 	at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2046)
> 	at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:131)
> 	at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil$$anonfun$obtainTokensForNamenodes$1.apply(YarnSparkHadoopUtil.scala:128)
> 	at scala.collection.immutable.Set$Set1.foreach(Set.scala:74)
> 	at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.obtainTokensForNamenodes(YarnSparkHadoopUtil.scala:128)
> 	at org.apache.spark.deploy.yarn.Client.getTokenRenewalInterval(Client.scala:593)
> 	at org.apache.spark.deploy.yarn.Client.setupLaunchEnv(Client.scala:626)
> 	at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:726)
> 	at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
> 	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
> 	at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
> 	at org.apache.spark.SparkContext.<init>(SparkContext.scala:530)
> 	at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:338)
> 	at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:122)
> 	at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:513)
> 	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
> 	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
> 	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
> 	at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
> 	at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: java.net.UnknownHostException: mas1.multikms.com;mas2.multikms.com
> 	... 31 more
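> The failure mode visible in both traces: SecurityUtil.buildTokenService receives the semicolon-separated host list as if it were a single hostname. A minimal standalone Scala sketch of that behavior (illustrative only, not the Hadoop/Spark source):
> {code}
> import java.net.{InetAddress, URI, UnknownHostException}
>
> object MultiKmsParseDemo {
>   def main(args: Array[String]): Unit = {
>     // The multi-KMS provider URI from hdfs-site.xml.
>     val uri = new URI("kms://http@mas1.multikms.com;mas2.multikms.com:9292/kms")
>     // ';' is not legal in a server-based authority, so java.net.URI falls back
>     // to a registry-based authority: getHost is null and the host list stays fused.
>     println(s"host=${uri.getHost}, authority=${uri.getAuthority}")
>     try {
>       // Resolving the fused list as one hostname is what ultimately surfaces
>       // as the java.net.UnknownHostException in the stack traces above.
>       InetAddress.getByName("mas1.multikms.com;mas2.multikms.com")
>     } catch {
>       case e: UnknownHostException => println(s"UnknownHostException: ${e.getMessage}")
>     }
>   }
> }
> {code}
> Presumably a fix needs to split the authority on ';' into one endpoint per KMS host before any hostname resolution is attempted.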


