Posted to issues@spark.apache.org by "Christopher Meier (Jira)" <ji...@apache.org> on 2019/11/20 15:53:00 UTC
[jira] [Created] (SPARK-29974) Submitting with application jar on HA HDFS

Christopher Meier created SPARK-29974:
-----------------------------------------

             Summary: Submitting with application jar on HA HDFS 
                 Key: SPARK-29974
                 URL: https://issues.apache.org/jira/browse/SPARK-29974
             Project: Spark
          Issue Type: Bug
          Components: Kubernetes, Spark Submit
    Affects Versions: 2.4.4
            Reporter: Christopher Meier


When a job is submitted with its application jar stored on an HA HDFS, and the HDFS configuration is available to both the driver and the executors in $HADOOP_CONF_DIR, the executors cannot fetch the application jar.

 

For example with Kubernetes:
 # Create a Spark image with the HA HDFS configuration files available at $HADOOP_CONF_DIR.
 # Push the application jar to the HA HDFS.
 # Use spark-submit to create the Spark job in the cluster:
{code:sh}
spark-submit \
	--master k8s://https://kubernetes.example:6443 \
	--deploy-mode cluster \
	--name spark_hdfs_test \
	--class $CLASS \
	--conf spark.executor.instances=3 \
	--conf spark.kubernetes.container.image=$SPARK_IMAGE \
	--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
	hdfs:///jars/application.jar
{code}
The driver starts without problems, but the following error appears in the log of every executor:
{code:java}
...
19/11/20 12:45:43 INFO Executor: Fetching hdfs://hdfs-k8s/jars/application.jar with timestamp 1574253925510
19/11/20 12:45:43 ERROR Executor: Exception in task 0.1 in stage 0.0 (TID 1)
java.lang.IllegalArgumentException: java.net.UnknownHostException: hdfs-k8s
	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
	at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
	at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
	at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
	at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1866)
	at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:721)
	at org.apache.spark.util.Utils$.fetchFile(Utils.scala:496)
	at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:811)
	at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:803)
	at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
	at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
	at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
	at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
	at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
	at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
	at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:803)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:375)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: hdfs-k8s
	... 28 more
{code}
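The stack trace suggests that when the executor fetches the application jar, it does not recognize hdfs-k8s as an HA nameservice: the HDFS client takes the non-HA path (NameNodeProxies.createNonHAProxy) and tries to resolve hdfs-k8s as a host name. It should recognize the nameservice, since the HDFS HA configuration is available.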
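
One way to narrow this down is to check, from inside an executor container, whether the hdfs-site.xml under $HADOOP_CONF_DIR declares a failover proxy provider for the nameservice, since the Hadoop client only takes the HA path when dfs.client.failover.proxy.provider.<nameservice> is set. The following is a minimal sketch of such a check. The function name has_ha_config is hypothetical, and the check assumes the property name is written on one line without surrounding whitespace.

```sh
#!/bin/sh
# Hypothetical helper (not part of Spark or Hadoop): report whether the
# hdfs-site.xml in a configuration directory declares a failover proxy
# provider for the given nameservice.
#
# Returns 0 if the property is declared, 1 if it is not,
# and 2 if hdfs-site.xml cannot be read.
has_ha_config() {
    conf_dir=$1
    nameservice=$2
    site_file="$conf_dir/hdfs-site.xml"

    # Without a readable file there is nothing to inspect.
    [ -r "$site_file" ] || return 2

    # -F matches the property name literally, so the dots are not wildcards.
    grep -qF "<name>dfs.client.failover.proxy.provider.${nameservice}</name>" "$site_file"
}
```

For example, has_ha_config "$HADOOP_CONF_DIR" hdfs-k8s could be run in an executor pod. A successful check only shows that the file contains the setting; it does not show that the executor JVM actually loads that file, so a passing result would point toward the configuration not being picked up rather than being absent.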

However, when the application jar path uses the address of the active namenode, everything works, including the code in the jar that itself reads from the HA HDFS (hdfs:///some-file.txt).
{code:sh}
spark-submit \
	--master k8s://https://kubernetes.example:6443 \
	--deploy-mode cluster \
	--name spark_hdfs_test \
	--class $CLASS \
	--conf spark.executor.instances=3 \
	--conf spark.kubernetes.container.image=$SPARK_IMAGE \
	--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
	hdfs://hdfs-namenode-1.hdfs-namenode.default.svc.cluster.local:8020/jars/application.jar
{code}
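
Pinning the active namenode gives up failover. Another possible workaround, which has not been verified against this issue, would be to pass the HA client settings explicitly through Spark's spark.hadoop.* prefix, which copies them into the Hadoop configuration Spark builds. The sketch below only prints the corresponding --conf flags. The function name ha_conf_flags and the namenode IDs nn1 and nn2 are assumptions and must match the real cluster.

```sh
#!/bin/sh
# Hypothetical helper: print spark-submit --conf flags carrying the HDFS HA
# client settings for one nameservice with two namenodes.
# Arguments: nameservice, RPC address of namenode 1, RPC address of namenode 2.
# The namenode IDs nn1 and nn2 are placeholders chosen for this sketch.
ha_conf_flags() {
    ns=$1
    nn1_addr=$2
    nn2_addr=$3

    # One flag per line; the dfs.* keys are the standard HDFS HA client keys.
    printf '%s\n' \
        "--conf spark.hadoop.dfs.nameservices=${ns}" \
        "--conf spark.hadoop.dfs.ha.namenodes.${ns}=nn1,nn2" \
        "--conf spark.hadoop.dfs.namenode.rpc-address.${ns}.nn1=${nn1_addr}" \
        "--conf spark.hadoop.dfs.namenode.rpc-address.${ns}.nn2=${nn2_addr}" \
        "--conf spark.hadoop.dfs.client.failover.proxy.provider.${ns}=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
}
```

The output could be spliced into the first spark-submit command with an unquoted $(ha_conf_flags hdfs-k8s <namenode-1>:8020 <namenode-2>:8020), keeping hdfs:///jars/application.jar as the jar path. Only the first namenode address is given in this report; the second one would have to be taken from the cluster's hdfs-site.xml.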