Posted to issues@spark.apache.org by "Stefaan Lippens (JIRA)" <ji...@apache.org> on 2019/02/06 09:36:00 UTC

[jira] [Commented] (SPARK-26831) bin/pyspark: avoid hardcoded `python` command and improve version checks

    [ https://issues.apache.org/jira/browse/SPARK-26831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761615#comment-16761615 ] 

Stefaan Lippens commented on SPARK-26831:
-----------------------------------------

As an initial improvement, I propose moving the definition of {{WORKS_WITH_IPYTHON}} inside the {{if [[ -z "$PYSPARK_PYTHON" ]]; then}} body and fixing the bash syntax of the version check:
{code}
diff --git a/bin/pyspark b/bin/pyspark
index 1dcddcc619..3174dd0049 100755
--- a/bin/pyspark
+++ b/bin/pyspark
@@ -42,11 +42,10 @@ if [[ -z "$PYSPARK_DRIVER_PYTHON" ]]; then
   PYSPARK_DRIVER_PYTHON="${PYSPARK_PYTHON:-"python"}"
 fi

-WORKS_WITH_IPYTHON=$(python -c 'import sys; print(sys.version_info >= (2, 7, 0))')
-
 # Determine the Python executable to use for the executors:
 if [[ -z "$PYSPARK_PYTHON" ]]; then
-  if [[ $PYSPARK_DRIVER_PYTHON == *ipython* && ! $WORKS_WITH_IPYTHON ]]; then
+  WORKS_WITH_IPYTHON=$(python -c 'import sys; print(sys.version_info >= (2, 7, 0))')
+  if [[ $PYSPARK_DRIVER_PYTHON == *ipython* && $WORKS_WITH_IPYTHON != "True" ]]; then
     echo "IPython requires Python 2.7+; please install python2.7 or set PYSPARK_PYTHON" 1>&2
     exit 1
   else
{code}

This is the current state of pull request https://github.com/apache/spark/pull/23736
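
To illustrate the syntax fix in the diff above, here is a minimal sketch that isolates the two conditions. The function names {{old_check}} and {{new_check}} are hypothetical and exist only for this demonstration; they are not part of {{bin/pyspark}}. Inside {{[[ ]]}}, a bare {{! $VAR}} only tests whether the string is empty, so the old condition accepts both "True" and "False", while the explicit comparison rejects everything except "True".

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not part of bin/pyspark.

# Old condition: [[ ! $works ]] is true only when $works is the empty string.
old_check() {
  local works="$1"
  if [[ ! $works ]]; then echo "reject"; else echo "accept"; fi
}

# New condition: compare against the literal string printed by Python.
new_check() {
  local works="$1"
  if [[ $works != "True" ]]; then echo "reject"; else echo "accept"; fi
}

# Compare both checks for the two possible Python outputs and for an
# empty value (what the command substitution yields if the command fails).
for value in "True" "False" ""; do
  printf 'value=%-7s old=%s new=%s\n' "'$value'" \
    "$(old_check "$value")" "$(new_check "$value")"
done
```

With the old condition, a too-old Python (output "False") is accepted anyway; with the new one it is rejected. An empty value is rejected by both.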

> bin/pyspark: avoid hardcoded `python` command and improve version checks
> ------------------------------------------------------------------------
>
>                 Key: SPARK-26831
>                 URL: https://issues.apache.org/jira/browse/SPARK-26831
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 2.4.0
>            Reporter: Stefaan Lippens
>            Priority: Major
>
> (this originally started at https://github.com/apache/spark/pull/23736)
> I was trying out pyspark on a system with only a {{python3}} command but no {{python}} command and got this error:
> {code}
> /opt/spark/bin/pyspark: line 45: python: command not found
> {code}
> While the pyspark script is full of variables that refer to a Python interpreter, there is still a hardcoded {{python}} used for
> {code}
> WORKS_WITH_IPYTHON=$(python -c 'import sys; print(sys.version_info >= (2, 7, 0))')
> {code}
> While looking into this, I also noticed the bash syntax for the IPython version check is wrong: 
> {code}
> if [[ ! $WORKS_WITH_IPYTHON ]]
> {code}
> always evaluates to false when {{$WORKS_WITH_IPYTHON}} is non-empty (so for both "True" and "False").


