Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2020/07/08 16:21:00 UTC

[jira] [Assigned] (SPARK-32227) Bug in load-spark-env.cmd with Spark 3.0.0

     [ https://issues.apache.org/jira/browse/SPARK-32227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-32227:
------------------------------------

    Assignee:     (was: Apache Spark)

> Bug in load-spark-env.cmd with Spark 3.0.0
> ------------------------------------------
>
>                 Key: SPARK-32227
>                 URL: https://issues.apache.org/jira/browse/SPARK-32227
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell
>    Affects Versions: 3.0.0
>         Environment: Windows 10
>            Reporter: Ihor Bobak
>            Priority: Major
>             Fix For: 3.0.1
>
>         Attachments: load-spark-env.cmd
>
>
> spark-env.cmd, which is located in the conf directory, is not loaded by load-spark-env.cmd.
>  
> *How to reproduce:*
> 1) download spark 3.0.0 without hadoop and extract it
> 2) create a file conf/spark-env.cmd with the following contents (the paths assume Hadoop is installed in C:\opt\hadoop\hadoop-3.2.1; adjust them for your machine):
>  
> SET JAVA_HOME=C:\opt\Java\jdk1.8.0_241
> SET HADOOP_HOME=C:\opt\hadoop\hadoop-3.2.1
> SET HADOOP_CONF_DIR=C:\opt\hadoop\hadoop-3.2.1\conf
> SET SPARK_DIST_CLASSPATH=C:\opt\hadoop\hadoop-3.2.1\etc\hadoop;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\common;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\common\lib\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\common\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\hdfs;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\hdfs\lib\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\hdfs\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\yarn;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\yarn\lib\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\yarn\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\mapreduce\lib\*;C:\opt\hadoop\hadoop-3.2.1\share\hadoop\mapreduce\*
>  
> 3) go to the bin directory and run pyspark. You will get an error that log4j can't be found, among others (the reason: spark-env.cmd was never loaded, so Spark does not know where Hadoop and its jars are).
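>  
> For example, from a Windows cmd prompt (the folder C:\opt\spark-3.0.0-bin-without-hadoop is an assumption; use wherever you extracted Spark):
>  
> cd C:\opt\spark-3.0.0-bin-without-hadoop\bin
> pyspark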
>  
> *How to fix:*
> just take load-spark-env.cmd from Spark 2.4.3, and everything will work.
> [UPDATE]: I have attached a fixed version of load-spark-env.cmd that works fine.
>  
> *What is the difference?*
> I am not an expert in Windows batch, but defining a subroutine
> :LoadSparkEnv
> if exist "%SPARK_CONF_DIR%\spark-env.cmd" (
>   call "%SPARK_CONF_DIR%\spark-env.cmd"
> )
> and then invoking it with "call :LoadSparkEnv" (as 2.4.3 did) fixes it.
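>  
> For illustration, here is a minimal sketch of that 2.4.3-style layout (reconstructed, not copied verbatim from the 2.4.3 file; the SPARK_ENV_LOADED guard and the conf-dir default are assumptions):
>  
> @echo off
> rem Sketch of a load-spark-env.cmd that loads conf\spark-env.cmd once.
> if [%SPARK_ENV_LOADED%] == [] (
>   set SPARK_ENV_LOADED=1
>   if [%SPARK_CONF_DIR%] == [] (
>     set SPARK_CONF_DIR=%~dp0..\conf
>   )
>   rem A subroutine sidesteps delayed expansion: by the time :LoadSparkEnv
>   rem runs, the "set SPARK_CONF_DIR=..." above has taken effect, so
>   rem %SPARK_CONF_DIR% expands to the fresh value inside the subroutine.
>   call :LoadSparkEnv
> )
> rem Stop here so execution does not fall through into the subroutine.
> goto :eof
>  
> :LoadSparkEnv
> if exist "%SPARK_CONF_DIR%\spark-env.cmd" (
>   call "%SPARK_CONF_DIR%\spark-env.cmd"
> )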
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org