Posted to commits@spark.apache.org by sr...@apache.org on 2016/04/30 11:15:13 UTC

spark git commit: [SPARK-13973][PYSPARK] Make pyspark fail noisily if IPYTHON or IPYTHON_OPTS are set

Repository: spark
Updated Branches:
  refs/heads/master 8dc3987d0 -> 0368ff30d


[SPARK-13973][PYSPARK] Make pyspark fail noisily if IPYTHON or IPYTHON_OPTS are set

## What changes were proposed in this pull request?

https://issues.apache.org/jira/browse/SPARK-13973

Following discussion with srowen, the IPYTHON and IPYTHON_OPTS variables are removed. If either is set in the user's environment, pyspark will not launch and instead prints an error message. Failing noisily forces users to remove these options and learn the new configuration scheme, which is much more sustainable and less confusing.
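
For illustration, a minimal sketch of the rejected old invocation next to the replacement scheme that the change below documents (assumes the ipython/jupyter executables are on the PATH):

    # Old scheme, now rejected at startup:
    IPYTHON=1 ./bin/pyspark       # exits with an error in Spark 2.0+

    # New scheme, configure the driver Python explicitly:
    PYSPARK_DRIVER_PYTHON=ipython ./bin/pyspark
    PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark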

## How was this patch tested?

Manual testing; set IPYTHON=1 and verified that the error message prints.
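
For reference, a sketch of that manual check; the message wording matches the echo lines added to bin/pyspark in the diff below:

    $ IPYTHON=1 ./bin/pyspark
    Error in pyspark startup:
    IPYTHON and IPYTHON_OPTS are removed in Spark 2.0+. Remove these from the environment and set PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS instead.
    $ echo $?
    1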

Author: pshearer <ps...@massmutual.com>
Author: shearerp <sh...@umich.edu>

Closes #12528 from shearerp/master.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0368ff30
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0368ff30
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0368ff30

Branch: refs/heads/master
Commit: 0368ff30dd55dd2127d4cb196898c7bd437e9d28
Parents: 8dc3987
Author: pshearer <ps...@massmutual.com>
Authored: Sat Apr 30 10:15:20 2016 +0100
Committer: Sean Owen <so...@cloudera.com>
Committed: Sat Apr 30 10:15:20 2016 +0100

----------------------------------------------------------------------
 bin/pyspark               | 32 ++++++++++++--------------------
 docs/programming-guide.md | 11 ++++++-----
 2 files changed, 18 insertions(+), 25 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/0368ff30/bin/pyspark
----------------------------------------------------------------------
diff --git a/bin/pyspark b/bin/pyspark
index a257499..d1fe75a 100755
--- a/bin/pyspark
+++ b/bin/pyspark
@@ -24,17 +24,11 @@ fi
 source "${SPARK_HOME}"/bin/load-spark-env.sh
 export _SPARK_CMD_USAGE="Usage: ./bin/pyspark [options]"
 
-# In Spark <= 1.1, setting IPYTHON=1 would cause the driver to be launched using the `ipython`
-# executable, while the worker would still be launched using PYSPARK_PYTHON.
-#
-# In Spark 1.2, we removed the documentation of the IPYTHON and IPYTHON_OPTS variables and added
-# PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS to allow IPython to be used for the driver.
-# Now, users can simply set PYSPARK_DRIVER_PYTHON=ipython to use IPython and set
-# PYSPARK_DRIVER_PYTHON_OPTS to pass options when starting the Python driver
+# In Spark 2.0, IPYTHON and IPYTHON_OPTS are removed and pyspark fails to launch if either option
+# is set in the user's environment. Instead, users should set PYSPARK_DRIVER_PYTHON=ipython 
+# to use IPython and set PYSPARK_DRIVER_PYTHON_OPTS to pass options when starting the Python driver
 # (e.g. PYSPARK_DRIVER_PYTHON_OPTS='notebook').  This supports full customization of the IPython
 # and executor Python executables.
-#
-# For backwards-compatibility, we retain the old IPYTHON and IPYTHON_OPTS variables.
 
 # Determine the Python executable to use if PYSPARK_PYTHON or PYSPARK_DRIVER_PYTHON isn't set:
 if hash python2.7 2>/dev/null; then
@@ -44,17 +38,15 @@ else
   DEFAULT_PYTHON="python"
 fi
 
-# Determine the Python executable to use for the driver:
-if [[ -n "$IPYTHON_OPTS" || "$IPYTHON" == "1" ]]; then
-  # If IPython options are specified, assume user wants to run IPython
-  # (for backwards-compatibility)
-  PYSPARK_DRIVER_PYTHON_OPTS="$PYSPARK_DRIVER_PYTHON_OPTS $IPYTHON_OPTS"
-  if [ -x "$(command -v jupyter)" ]; then
-    PYSPARK_DRIVER_PYTHON="jupyter"
-  else
-    PYSPARK_DRIVER_PYTHON="ipython"
-  fi
-elif [[ -z "$PYSPARK_DRIVER_PYTHON" ]]; then
+# Fail noisily if removed options are set
+if [[ -n "$IPYTHON" || -n "$IPYTHON_OPTS" ]]; then
+  echo "Error in pyspark startup:" 
+  echo "IPYTHON and IPYTHON_OPTS are removed in Spark 2.0+. Remove these from the environment and set PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS instead."
+  exit 1
+fi
+
+# Default to standard python interpreter unless told otherwise
+if [[ -z "$PYSPARK_DRIVER_PYTHON" ]]; then
   PYSPARK_DRIVER_PYTHON="${PYSPARK_PYTHON:-"$DEFAULT_PYTHON"}"
 fi
 

http://git-wip-us.apache.org/repos/asf/spark/blob/0368ff30/docs/programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 601dd57..cf6f1d8 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -240,16 +240,17 @@ use IPython, set the `PYSPARK_DRIVER_PYTHON` variable to `ipython` when running
 $ PYSPARK_DRIVER_PYTHON=ipython ./bin/pyspark
 {% endhighlight %}
 
-You can customize the `ipython` command by setting `PYSPARK_DRIVER_PYTHON_OPTS`. For example, to launch
-the [IPython Notebook](http://ipython.org/notebook.html) with PyLab plot support:
+To use the Jupyter notebook (previously known as the IPython notebook), 
 
 {% highlight bash %}
-$ PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark
+$ PYSPARK_DRIVER_PYTHON=jupyter ./bin/pyspark
 {% endhighlight %}
 
-After the IPython Notebook server is launched, you can create a new "Python 2" notebook from
+You can customize the `ipython` or `jupyter` commands by setting `PYSPARK_DRIVER_PYTHON_OPTS`. 
+
+After the Jupyter Notebook server is launched, you can create a new "Python 2" notebook from
 the "Files" tab. Inside the notebook, you can input the command `%pylab inline` as part of
-your notebook before you start to try Spark from the IPython notebook.
+your notebook before you start to try Spark from the Jupyter notebook.
 
 </div>
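
Usage note on the updated guide: PYSPARK_DRIVER_PYTHON_OPTS is appended to the chosen driver command when pyspark starts, so (with illustrative option values, assuming jupyter is installed) a notebook server on a non-default port could be launched along these lines:

    $ PYSPARK_DRIVER_PYTHON=jupyter \
      PYSPARK_DRIVER_PYTHON_OPTS="notebook --no-browser --port=8889" \
      ./bin/pyspark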
 


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org