Posted to issues@spark.apache.org by "Ángel Álvarez (JIRA)" <ji...@apache.org> on 2014/11/11 13:57:33 UTC

[jira] [Updated] (SPARK-1825) Windows Spark fails to work with Linux YARN

     [ https://issues.apache.org/jira/browse/SPARK-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ángel Álvarez updated SPARK-1825:
---------------------------------
    Attachment: SPARK-1825.patch

Is it really necessary to change the file "ExecutorRunnableUtil.scala"?

I changed only the file "ClientBase.scala", and that apparently works for Spark 1.1.

To make it work, you'll have to add the following configuration:

	- Program arguments: --master yarn-cluster
	- VM arguments: -Dspark.app-submission.cross-platform=true
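
If you launch via spark-submit instead of an IDE, a roughly equivalent invocation might look like this. This is only a sketch: the application class and jar name are placeholders, and it assumes the patched build reads the spark.app-submission.cross-platform system property on the client side (SPARK_SUBMIT_OPTS is the usual way to pass JVM options to the submitting process):

```shell
# Sketch only: assumes the patched build honors the
# spark.app-submission.cross-platform property when building the
# YARN container launch context on the client.
export SPARK_SUBMIT_OPTS="-Dspark.app-submission.cross-platform=true"

spark-submit \
  --master yarn-cluster \
  --class com.example.MyApp \
  myapp.jar
```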
	


> Windows Spark fails to work with Linux YARN
> -------------------------------------------
>
>                 Key: SPARK-1825
>                 URL: https://issues.apache.org/jira/browse/SPARK-1825
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.0.0
>            Reporter: Taeyun Kim
>             Fix For: 1.2.0
>
>         Attachments: SPARK-1825.patch
>
>
> Windows Spark fails to work with Linux YARN.
> This is a cross-platform problem.
> This error occurs when 'yarn-client' mode is used.
> (yarn-cluster/yarn-standalone modes were not tested.)
> On the YARN side, Hadoop 2.4.0 resolved the issue as follows:
> https://issues.apache.org/jira/browse/YARN-1824
> But the Spark YARN module does not yet incorporate the new YARN API, so the problem persists for Spark.
> First, the following source files should be changed:
> - /yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala
> - /yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnableUtil.scala
> The changes are as follows:
> - Replace .$() with .$$()
> - Replace File.pathSeparator for Environment.CLASSPATH.name with ApplicationConstants.CLASS_PATH_SEPARATOR (this requires importing org.apache.hadoop.yarn.api.ApplicationConstants)
> Unless the above changes are applied, launch_container.sh will contain invalid shell script statements (since they will contain Windows-specific separators), and the job will fail.
> Also, the following symptoms should be fixed (I could not find the relevant source code):
> - The SPARK_HOME environment variable is copied straight into launch_container.sh. It should be converted to the path format of the server OS, or, better, a separate environment variable or configuration variable should be created.
> - The '%HADOOP_MAPRED_HOME%' string still exists in launch_container.sh after the above change is applied. Maybe I missed a few lines.
> I'm not sure whether this is all, since I'm new to both Spark and YARN.
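
The two replacements described in the issue can be illustrated with a self-contained sketch. This is not the actual Spark or Hadoop source; the object and method names below are hypothetical stand-ins that mimic how Environment.$$() and ApplicationConstants.CLASS_PATH_SEPARATOR (the literal "<CPS>") defer separator expansion from the submitting client to the NodeManager on the target OS:

```scala
// Hypothetical sketch of deferred cross-platform classpath expansion.
// Not the Hadoop YARN API; it only mirrors the idea behind the fix.
object CrossPlatformClasspath {
  // Eager (client-side) expansion bakes in the *client* OS separator,
  // so a Windows client emits ';' into a launch_container.sh that a
  // Linux NodeManager cannot execute.
  def eagerClasspath(entries: Seq[String], clientSep: String): String =
    entries.mkString(clientSep)

  // Deferred expansion emits a neutral placeholder instead, the way
  // ApplicationConstants.CLASS_PATH_SEPARATOR ("<CPS>") works in Hadoop 2.4+.
  val CPS = "<CPS>"
  def deferredClasspath(entries: Seq[String]): String =
    entries.mkString(CPS)

  // The NodeManager on the target OS substitutes its own separator
  // when it materializes the launch script.
  def expandOnServer(template: String, serverSep: String): String =
    template.replace(CPS, serverSep)
}
```

With this scheme, a Windows client submitting to a Linux cluster produces a template like "{{PWD}}<CPS>*", which the Linux NodeManager expands with ':' rather than the Windows ';' that breaks the generated shell script.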



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org