Posted to dev@zeppelin.apache.org by "Arnaud Nauwynck (Jira)" <ji...@apache.org> on 2022/09/13 20:24:00 UTC

[jira] [Created] (ZEPPELIN-5817) Failed to run spark job from Zeppelin on windows... can not execute "spark-submit" is not a valid Win32 application, need to call cmd.exe

Arnaud Nauwynck created ZEPPELIN-5817:
-----------------------------------------

             Summary: Failed to run spark job from Zeppelin on windows... can not execute "spark-submit" is not a valid Win32 application, need to call cmd.exe
                 Key: ZEPPELIN-5817
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-5817
             Project: Zeppelin
          Issue Type: Bug
          Components: interpreter-launcher
    Affects Versions: 0.10.1, 0.10.0, 0.9.0
            Reporter: Arnaud Nauwynck



{noformat}
Caused by: java.io.IOException: Fail to detect scala version, the reason is:Cannot run program "C:/apps/hadoop/spark-3.1.1/bin/spark-submit": CreateProcess error=193, %1 is not a valid Win32 application
	at org.apache.zeppelin.interpreter.launcher.SparkInterpreterLauncher.buildEnvFromProperties(SparkInterpreterLauncher.java:127)
	at org.apache.zeppelin.interpreter.launcher.StandardInterpreterLauncher.launchDirectly(StandardInterpreterLauncher.java:77)
	at org.apache.zeppelin.interpreter.launcher.InterpreterLauncher.launch(InterpreterLauncher.java:110)
{noformat}

Indeed, looking at the source code, we can see it can only work on Linux, where the shell script "spark-submit" is both executable ("chmod u+x") and starts with the shebang "#!/bin/bash".
On Windows, a text file containing a shell script is not directly executable: CreateProcess only runs native executables such as ".exe" files, hence "error=193, %1 is not a valid Win32 application".

Instead, on Windows it should invoke "cmd.exe" with arguments [ "/c", "spark-submit", "--version" ], so that cmd.exe resolves the "spark-submit.cmd" wrapper that Spark ships in its bin directory.
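As an illustration only (this is not the actual Zeppelin code, and the class/method names here are hypothetical), the command line could be built in an OS-aware way along these lines:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SparkSubmitCommand {

  // Hypothetical helper: builds a platform-appropriate spark-submit command line.
  // On Windows, shell scripts are not executable by CreateProcess, so we go
  // through "cmd.exe /c", which resolves the spark-submit.cmd batch wrapper.
  static List<String> buildCommand(String sparkHome, String... args) {
    boolean isWindows =
        System.getProperty("os.name").toLowerCase().contains("windows");
    List<String> cmd = new ArrayList<>();
    if (isWindows) {
      cmd.add("cmd.exe");
      cmd.add("/c");
      cmd.add(sparkHome + File.separator + "bin" + File.separator + "spark-submit.cmd");
    } else {
      cmd.add(sparkHome + "/bin/spark-submit");
    }
    cmd.addAll(Arrays.asList(args));
    return cmd;
  }

  public static void main(String[] args) throws Exception {
    // On Linux this prints [/opt/spark/bin/spark-submit, --version];
    // the list can be passed directly to new ProcessBuilder(cmd).start().
    System.out.println(buildCommand("/opt/spark", "--version"));
  }
}
```

The resulting list can be handed to ProcessBuilder's List constructor, which avoids any shell-quoting issues on either platform.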


Source code link:

https://github.com/apache/zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/launcher/SparkInterpreterLauncher.java#L270

{noformat}
  private String detectSparkScalaVersion(String sparkHome, Map<String, String> env) throws Exception {
...
    ProcessBuilder builder = new ProcessBuilder(sparkHome + "/bin/spark-submit", "--version");
...
    Process process = builder.start();

{noformat}


And there is no way to bypass this: detectSparkScalaVersion() is always called from buildEnvFromProperties():

https://github.com/apache/zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/launcher/SparkInterpreterLauncher.java#L134

{noformat}
  @Override
  public Map<String, String> buildEnvFromProperties(InterpreterLaunchContext context) throws IOException {

..

    String scalaVersion = null;
    try {
      String sparkHome = getEnv("SPARK_HOME");
      LOGGER.info("SPARK_HOME: {}", sparkHome);
      scalaVersion = detectSparkScalaVersion(sparkHome, env);
      LOGGER.info("Scala version for Spark: {}", scalaVersion);
      context.getProperties().put("zeppelin.spark.scala.version", scalaVersion);
    } catch (Exception e) {
      throw new IOException("Fail to detect scala version, the reason is:"+ e.getMessage());
    }
..
{noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)