Posted to user@spark.apache.org by David Robison <da...@psgglobal.net> on 2016/11/15 21:45:28 UTC

Problem submitting a spark job using yarn-client as master

I am trying to submit a Spark job using the yarn-client master setting. The job gets created and submitted to the cluster but then errors out almost immediately. Here is the relevant portion of the log:

15:39:37,385 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1) Requesting a new application from cluster with 1 NodeManagers
15:39:37,397 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1) Verifying our application has not requested more than the maximum memory capability of the cluster (4608 MB per container)
15:39:37,398 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1) Will allocate AM container, with 896 MB memory including 384 MB overhead
15:39:37,399 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1) Setting up container launch context for our AM
15:39:37,403 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1) Setting up the launch environment for our AM container
15:39:37,427 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1) Preparing resources for our AM container
15:39:37,845 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1) Source and destination file systems are the same. Not copying file:/opt/wildfly/modules/org/apache/hadoop/client/main/spark-yarn_2.10-1.6.2.jar
15:39:38,050 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1) Source and destination file systems are the same. Not copying file:/tmp/spark-fa954c4a-a6cd-4675-8610-67ce858b4842/__spark_conf__1435451360463636119.zip
15:39:38,102 INFO  [org.apache.spark.SecurityManager] (default task-1) Changing view acls to: wildfly,hdfs
15:39:38,105 INFO  [org.apache.spark.SecurityManager] (default task-1) Changing modify acls to: wildfly,hdfs
15:39:38,105 INFO  [org.apache.spark.SecurityManager] (default task-1) SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(wildfly, hdfs); users with modify permissions: Set(wildfly, hdfs)
15:39:38,138 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1) Submitting application 5 to ResourceManager
15:39:38,256 INFO  [org.apache.hadoop.yarn.client.api.impl.YarnClientImpl] (default task-1) Submitted application application_1479240217825_0005
15:39:39,269 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1) Application report for application_1479240217825_0005 (state: ACCEPTED)
15:39:39,279 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1)
                     client token: N/A
                    diagnostics: N/A
                    ApplicationMaster host: N/A
                    ApplicationMaster RPC port: -1
                    queue: default
                    start time: 1479242378159
                    final status: UNDEFINED
                    tracking URL: http://vb1.localdomain:8088/proxy/application_1479240217825_0005/
                    user: hdfs
15:39:40,285 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1) Application report for application_1479240217825_0005 (state: ACCEPTED)
15:39:41,290 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1) Application report for application_1479240217825_0005 (state: ACCEPTED)
15:39:42,295 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1) Application report for application_1479240217825_0005 (state: FAILED)
15:39:42,295 INFO  [org.apache.spark.deploy.yarn.Client] (default task-1)
                     client token: N/A
                    diagnostics: Application application_1479240217825_0005 failed 2 times due to AM Container for appattempt_1479240217825_0005_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:http://vb1.localdomain:8088/cluster/app/application_1479240217825_0005Then, click on links to logs of each attempt.
Diagnostics: File file:/tmp/spark-fa954c4a-a6cd-4675-8610-67ce858b4842/__spark_conf__1435451360463636119.zip does not exist
java.io.FileNotFoundException: File file:/tmp/spark-fa954c4a-a6cd-4675-8610-67ce858b4842/__spark_conf__1435451360463636119.zip does not exist
                    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:609)
                    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:822)
                    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:599)
                    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)


Notice that the file __spark_conf__1435451360463636119.zip is not copied because the client thinks the source and destination filesystems are the same; I believe it should be going to HDFS. However, when YARN goes to fetch the file, it reports that it does not exist, probably because it is looking under "file:/tmp" rather than on HDFS. Any idea how I can get this to work?
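One way to check that theory (a minimal sketch; it assumes hadoop-common and the cluster's conf directory are on the classpath of the JVM that submits the job) is to print what fs.defaultFS resolves to in that JVM. If it comes back as file:///, the YARN client treats file:/tmp as both source and destination, skips the upload, and the NodeManager later fails to localize the conf zip:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class CheckDefaultFs {
    public static void main(String[] args) throws Exception {
        // new Configuration() loads core-site.xml from the classpath; if none
        // is found, fs.defaultFS falls back to the local filesystem.
        Configuration hadoopConf = new Configuration();
        System.out.println("fs.defaultFS = " + hadoopConf.get("fs.defaultFS"));
        // This is the filesystem the YARN client will stage resources on.
        System.out.println("resolved fs  = " + FileSystem.get(hadoopConf).getUri());
    }
}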
Thanks, David

David R Robison
Senior Systems Engineer
O. +1 512 247 3700
M. +1 757 286 0022
david.robison@psgglobal.net
www.psgglobal.net

Prometheus Security Group Global, Inc.
3019 Alvin Devane Boulevard
Building 4, Suite 450
Austin, TX 78741



RE: Problem submitting a spark job using yarn-client as master

Posted by David Robison <da...@psgglobal.net>.
Unfortunately, my code never gets far enough to have a SparkContext on which to set the Hadoop config parameters. Here is my Java code:

SparkConf sparkConf = new SparkConf()
       .setJars(new String[] { "file:///opt/wildfly/mapreduce/mysparkjob-5.0.0.jar" })
       .setSparkHome("/usr/hdp/" + getHdpVersion() + "/spark")
       .set("fs.defaultFS", config.get("fs.defaultFS"));
sparkContext = new JavaSparkContext("yarn-client", "SumFramesPerTimeUnit", sparkConf);

The job dies in the constructor of the JavaSparkContext. I have a logging call right after creating the SparkContext, and it is never executed.
Any idea what I'm doing wrong? David
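One thing that might be worth trying (a sketch, untested; hdfs://namenode:8020 stands in for the real NameNode address): as far as I know, Spark only forwards SparkConf entries prefixed with spark.hadoop. into the Hadoop Configuration it builds for the YARN client, so a bare fs.defaultFS key on the SparkConf is never seen by Hadoop. With the prefix, the setting takes effect before the JavaSparkContext constructor stages any resources:

SparkConf sparkConf = new SparkConf()
       .setJars(new String[] { "file:///opt/wildfly/mapreduce/mysparkjob-5.0.0.jar" })
       .setSparkHome("/usr/hdp/" + getHdpVersion() + "/spark")
       // Copied into the Hadoop Configuration as fs.defaultFS before the
       // YARN client decides where to stage the job's resources.
       .set("spark.hadoop.fs.defaultFS", "hdfs://namenode:8020");
sparkContext = new JavaSparkContext("yarn-client", "SumFramesPerTimeUnit", sparkConf);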

Best Regards,

David R Robison
Senior Systems Engineer

From: Rohit Verma [mailto:rohit.verma@rokittech.com]
Sent: Tuesday, November 15, 2016 9:27 PM
To: David Robison <da...@psgglobal.net>
Cc: user@spark.apache.org
Subject: Re: Problem submitting a spark job using yarn-client as master

You can set HDFS as the default filesystem:

sparksession.sparkContext().hadoopConfiguration().set("fs.defaultFS", "hdfs://master_node:8020");

Regards
Rohit



Re: Problem submitting a spark job using yarn-client as master

Posted by Rohit Verma <ro...@rokittech.com>.
You can set HDFS as the default filesystem:

sparksession.sparkContext().hadoopConfiguration().set("fs.defaultFS", "hdfs://master_node:8020");
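Spelled out a bit more (a sketch assuming the Spark 2.x SparkSession API; master_node:8020 is a placeholder for the NameNode address), and note that this runs after the context exists, so it only helps once the session itself constructs successfully:

import org.apache.spark.sql.SparkSession;

// Build (or reuse) a session, then point its Hadoop layer at HDFS.
SparkSession sparkSession = SparkSession.builder()
        .master("yarn")
        .appName("SumFramesPerTimeUnit")
        .getOrCreate();
sparkSession.sparkContext().hadoopConfiguration()
        .set("fs.defaultFS", "hdfs://master_node:8020");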

Regards
Rohit
