Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2022/10/11 03:23:00 UTC

[jira] [Assigned] (SPARK-40738) spark-shell fails with "bad array subscript" in cygwin or msys bash session

     [ https://issues.apache.org/jira/browse/SPARK-40738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-40738:
------------------------------------

    Assignee:     (was: Apache Spark)

> spark-shell fails with "bad array subscript" in cygwin or msys bash session
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-40738
>                 URL: https://issues.apache.org/jira/browse/SPARK-40738
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell, Windows
>    Affects Versions: 3.3.0
>         Environment: The problem occurs in Windows if *_spark-shell_* is called from a bash session.
> NOTE: the fix also applies to _*spark-submit*_ and _*beeline*_, since they call spark-shell.
>            Reporter: Phil Walker
>            Priority: Major
>              Labels: bash, cygwin, mingw, msys2, windows
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> A pull request [spark PR|https://github.com/apache/spark/pull/38167] fixes this issue, along with a related build error affecting _*cygwin*_ and *msys/mingw* bash *sbt* sessions.
> If a Windows user tries to start a *_spark-shell_* session by calling the bash script (rather than the *_spark-shell.cmd_* script), it fails with a confusing error message. The _*spark-class*_ script calls _*launcher/src/main/java/org/apache/spark/launcher/Main.java*_ to generate command line arguments, but the launcher produces output in the format appropriate to the *_.cmd_* version of the script rather than the _*bash*_ version.
> The launcher Main method, when called on environments other than Windows, interleaves NUL characters between the command line arguments. It should also do so on Windows when called from the bash script, but it incorrectly assumes that if the OS is Windows it is being called by the .cmd version of the script.
> The resulting error message is unhelpful:
>  
> {code:java}
> [lots of ugly stuff omitted]
> /opt/spark/bin/spark-class: line 100: CMD: bad array subscript
> {code}
> The key to _*launcher/Main*_ knowing that a request comes from a _*bash*_ session is that the _*SHELL*_ environment variable is set. It is normally set in any of the various Windows bash environments ({_}*cygwin*{_}, {_}*mingw64*{_}, {_}*msys2*{_}, etc.) and is not normally set in native Windows environments. In the _*spark-class.cmd*_ script, _*SHELL*_ is intentionally unset to avoid problems, and to permit bash users to call the _*.cmd*_ scripts if they prefer (they will still work as before).
>  
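
The decision described above can be sketched in Java. This is a hypothetical illustration, not the actual launcher code: the class and method names (LauncherOutputSketch, useNulDelimiters, joinForBash) are invented for the example, which assumes the fix's detection rule (SHELL set means a bash session) and the NUL-delimited output format that the bash spark-class script reads back with `while IFS= read -d '' -r ARG`.

```java
import java.util.Arrays;
import java.util.List;

public class LauncherOutputSketch {

    // Hypothetical helper: decide whether to emit the NUL-delimited
    // (bash) format. Non-Windows always uses it; on Windows, a set
    // SHELL variable indicates a cygwin/msys/mingw bash session,
    // while plain cmd.exe sessions normally leave SHELL unset.
    static boolean useNulDelimiters(String osName, String shellEnv) {
        boolean isWindows = osName.toLowerCase().startsWith("windows");
        return !isWindows || shellEnv != null;
    }

    // NUL-delimited form: each argument is followed by a '\0' byte,
    // so arguments containing spaces survive the round trip to bash.
    static String joinForBash(List<String> cmd) {
        StringBuilder sb = new StringBuilder();
        for (String arg : cmd) {
            sb.append(arg).append('\0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> cmd = Arrays.asList(
            "java", "-cp", "spark.jar", "org.apache.spark.repl.Main");
        if (useNulDelimiters(System.getProperty("os.name"),
                             System.getenv("SHELL"))) {
            System.out.print(joinForBash(cmd));
        } else {
            // Single-line format consumed by the .cmd scripts.
            System.out.println(String.join(" ", cmd));
        }
    }
}
```

Without a check along these lines, a Windows launcher emits the single-line .cmd format, the bash script's NUL-delimited read loop collects zero arguments, and the empty CMD array produces the "bad array subscript" failure quoted above.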



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org