Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2020/07/08 16:21:00 UTC
[jira] [Assigned] (SPARK-32227) Bug in load-spark-env.cmd with Spark 3.0.0
[ https://issues.apache.org/jira/browse/SPARK-32227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-32227:
------------------------------------
Assignee: (was: Apache Spark)
> Bug in load-spark-env.cmd with Spark 3.0.0
> -------------------------------------------
>
> Key: SPARK-32227
> URL: https://issues.apache.org/jira/browse/SPARK-32227
> Project: Spark
> Issue Type: Bug
> Components: Spark Shell
> Affects Versions: 3.0.0
> Environment: Windows 10
> Reporter: Ihor Bobak
> Priority: Major
> Fix For: 3.0.1
>
> Attachments: load-spark-env.cmd
>
>
> spark-env.cmd, which is located in the conf directory, is not loaded by load-spark-env.cmd.
>
> *How to reproduce:*
> 1) download Spark 3.0.0 without Hadoop and extract it
> 2) create a file conf/spark-env.cmd with the following contents (the paths reflect where my Hadoop is installed, C:\opt\hadoop\hadoop-3.2.1; you may need to change them):
>
> SET JAVA_HOME=C:\opt\Java\jdk1.8.0_241
> SET HADOOP_HOME=C:\opt\hadoop\hadoop-3.2.1
> SET HADOOP_CONF_DIR=C:\opt\hadoop\hadoop-3.2.1\conf
> SET SPARK_DIST_CLASSPATH=C:\opt\hadoop\hadoop-3.2.1\etc\hadoop;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\common\lib\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\common\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\hdfs;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\hdfs\lib\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\hdfs\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\yarn;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\yarn\lib\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\yarn\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\mapreduce\lib\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\mapreduce\*
>
> 3) go to the bin directory and run pyspark. You will get an error that log4j cannot be found, etc. (the reason: the environment was indeed not loaded, so Spark does not see where Hadoop and all of its jars are).
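>
> A quick way to confirm that the environment file is never sourced (a hypothetical check, not part of the original report; the Spark install path below is an assumption) is to call load-spark-env.cmd directly and echo a variable that spark-env.cmd should have set:
>
> rem The install path is an assumption; adjust to where you extracted Spark.
> cd C:\opt\spark-3.0.0-bin-without-hadoop\bin
> call load-spark-env.cmd
> rem With the bug present, HADOOP_HOME stays unset, so cmd prints the
> rem literal string %HADOOP_HOME% instead of C:\opt\hadoop\hadoop-3.2.1.
> echo %HADOOP_HOME%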
>
> *How to fix:*
> just take load-spark-env.cmd from Spark version 2.4.3, and everything will work.
> [UPDATE]: I attached a fixed version of load-spark-env.cmd that works fine.
>
> *What is the difference?*
> I am not a great specialist in Windows batch, but defining a subroutine
>
> :LoadSparkEnv
> if exist "%SPARK_CONF_DIR%\spark-env.cmd" (
>     call "%SPARK_CONF_DIR%\spark-env.cmd"
> )
>
> and then calling it (as was done in 2.4.3) helps.
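>
> For context (my reading of why the subroutine matters, not something stated in the issue): cmd.exe expands %VAR% references inside a parenthesized block once, when the whole block is parsed, so a variable set inside such a block is not visible to a plain %VAR% reference later in the same block unless delayed expansion is enabled. Jumping through call :LoadSparkEnv forces a fresh expansion of %SPARK_CONF_DIR% at call time. A minimal sketch of how the 2.4.3-style subroutine could be wired up (the surrounding structure is an assumption, not the exact 2.4.3 script):
>
> rem Sketch only; everything outside the :LoadSparkEnv subroutine is an
> rem assumed simplification of the script's structure.
> if not defined SPARK_ENV_LOADED (
>     set SPARK_ENV_LOADED=1
>     rem Default SPARK_CONF_DIR to the conf dir next to this script.
>     if not defined SPARK_CONF_DIR set SPARK_CONF_DIR=%~dp0..\conf
>     rem "call" re-parses the subroutine, so %SPARK_CONF_DIR% is expanded
>     rem with its fresh value there, even though we are still inside
>     rem this parenthesized block.
>     call :LoadSparkEnv
> )
> goto :eof
>
> :LoadSparkEnv
> if exist "%SPARK_CONF_DIR%\spark-env.cmd" (
>     call "%SPARK_CONF_DIR%\spark-env.cmd"
> )
> goto :eof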
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org