Posted to user@spark.apache.org by Alberto Andreotti <al...@whiteprompt.com> on 2016/10/05 00:41:38 UTC

Problem creating SparkContext to connect to YARN cluster

Hello guys,

I'm new here. I'm using Spark 1.6.0, and I'm trying to access a YARN
cluster programmatically from my Scala app.
I create a SparkContext as usual, with the following code:

val sc = SparkContext.getOrCreate(new SparkConf().setMaster("yarn-client"))
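For context, a slightly fuller version of what I'm doing looks roughly like
this (the app name is a placeholder, and I'm relying on HADOOP_CONF_DIR /
YARN_CONF_DIR to point at my Hadoop config; this needs a live cluster, so it
won't run standalone):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// "my-app" is a placeholder name. The cluster address is not set here:
// with master "yarn-client", Spark resolves the ResourceManager from the
// yarn-site.xml found via HADOOP_CONF_DIR or YARN_CONF_DIR.
val conf = new SparkConf()
  .setMaster("yarn-client")
  .setAppName("my-app")

val sc = SparkContext.getOrCreate(conf)
```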

My yarn-site.xml is being read correctly, as far as I can tell.
My guess is that for some reason some files are not being shipped to the
cluster. I get the following exception:

[info] o.a.s.d.y.Client - Setting up container launch context for our AM
[info] o.a.s.d.y.Client - Setting up the launch environment for our AM
container
[info] o.a.s.d.y.Client - Preparing resources for our AM container
[info] o.a.s.d.y.Client - Source and destination file systems are the same.
Not copying
file:/home/jose/.ivy2/cache/org.apache.spark/spark-yarn_2.11/jars/spark-yarn_2.11-1.6.0.jar
[info] o.a.s.d.y.Client - Source and destination file systems are the same.
Not copying
file:/tmp/spark-df384d4c-2d8c-4101-b1c2-7caee897e227/__spark_conf__4685218164631909844.zip
[info] o.a.s.SecurityManager - Changing view acls to: jose
[info] o.a.s.SecurityManager - Changing modify acls to: jose
[info] o.a.s.SecurityManager - SecurityManager: authentication disabled; ui
acls disabled; users with view permissions: Set(jose); users with modify
permissions: Set(jose)
[info] o.a.s.d.y.Client - Submitting application 674 to ResourceManager
[info] o.a.h.y.c.a.i.YarnClientImpl - Submitted application
application_1462219356760_0674 to ResourceManager at /10.10.10.142:8032
[info] o.a.s.d.y.Client - Application report for
application_1462219356760_0674 (state: ACCEPTED)
[info] o.a.s.d.y.Client -
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1475613117174
     final status: UNDEFINED
     tracking URL:
http://server_address:20888/proxy/application_1462219356760_0674/
     user: jose
[info] o.a.s.d.y.Client - Application report for
application_1462219356760_0674 (state: ACCEPTED)
[info] o.a.s.d.y.Client - Application report for
application_1462219356760_0674 (state: FAILED)
[info] o.a.s.d.y.Client -
     client token: N/A
     diagnostics: Application application_1462219356760_0674 failed 2 times
due to AM Container for appattempt_1462219356760_0674_000002 exited with
exitCode: -1000
For more detailed output, check application tracking page:
http://server_address:8088/cluster/app/application_1462219356760_0674
Then, click on links to logs of each attempt.
Diagnostics: java.io.FileNotFoundException: File
file:/home/jose/.ivy2/cache/org.apache.spark/spark-yarn_2.11/jars/spark-yarn_2.11-1.6.0.jar
does not exist
Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1475613117174
     final status: FAILED
     tracking URL:
http://server_address:8088/cluster/app/application_1462219356760_0674
     user: jose
[error] o.a.s.SparkContext - Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It
might have been killed or unable to launch application master.
    at
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:124)
~[spark-yarn_2.11-1.6.0.jar:1.6.0]
    at
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:64)
~[spark-yarn_2.11-1.6.0.jar:1.6.0]
    at
org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
~[spark-core_2.11-1.6.0.jar:1.6.0]
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:530)
~[spark-core_2.11-1.6.0.jar:1.6.0]
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2281)
[spark-core_2.11-1.6.0.jar:1.6.0]

Thanks for any help!
Alberto.