Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/09/17 15:28:28 UTC

[GitHub] [spark] srowen commented on a change in pull request #29779: [SPARK-32180][PYTHON][DOCS][FOLLOW-UP] Rephrase and add some more information in installation guide

srowen commented on a change in pull request #29779:
URL: https://github.com/apache/spark/pull/29779#discussion_r490342386



##########
File path: python/docs/source/getting_started/install.rst
##########
@@ -19,71 +19,95 @@
 Installation
 ============
 
-Official releases are available from the `Apache Spark website <https://spark.apache.org/downloads.html>`_.
-Alternatively, you can install it via ``pip`` from PyPI.  PyPI installation is usually for standalone
-locally or as a client to connect to a cluster instead of setting a cluster up.  
+PySpark is included in the official releases of Spark available in the `Apache Spark website <https://spark.apache.org/downloads.html>`_.
+For Python users, PySpark also provides ``pip`` installation from PyPI. This is usually for standalone
+locally or as a client to connect to a cluster instead of setting a cluster itself up.

Review comment:
       standalone locally -> local usage?
   "instead of setting up a cluster itself" maybe

##########
File path: python/docs/source/getting_started/install.rst
##########
@@ -19,71 +19,95 @@
 Installation
 ============
 
-Official releases are available from the `Apache Spark website <https://spark.apache.org/downloads.html>`_.
-Alternatively, you can install it via ``pip`` from PyPI.  PyPI installation is usually for standalone
-locally or as a client to connect to a cluster instead of setting a cluster up.  
+PySpark is included in the official releases of Spark available in the `Apache Spark website <https://spark.apache.org/downloads.html>`_.
+For Python users, PySpark also provides ``pip`` installation from PyPI. This is usually for standalone
+locally or as a client to connect to a cluster instead of setting a cluster itself up.
  
-This page includes the instructions for installing PySpark by using pip, Conda, downloading manually, and building it from the source.
+This page includes the instructions for installing PySpark by using pip, Conda, downloading manually,
+and building it from the source.
+
 
 Python Version Supported
 ------------------------
 
 Python 3.6 and above.
 
+
 Using PyPI
 ----------
 
-PySpark installation using `PyPI <https://pypi.org/project/pyspark/>`_
+PySpark installation using `PyPI <https://pypi.org/project/pyspark/>`_ is as follows:
 
 .. code-block:: bash
 
     pip install pyspark
-	
-Using Conda  
+
+If you want to install extra dependencies for a specific component, you can install it as below:
+
+.. code-block:: bash
+
+    pip install pyspark[sql]
+
+
+
+Using Conda
 -----------
 
-Conda is an open-source package management and environment management system which is a part of the `Anaconda <https://docs.continuum.io/anaconda/>`_ distribution. It is both cross-platform and language agnostic.
-  
-Conda can be used to create a virtual environment from terminal as shown below:
+Conda is an open-source package management and environment management system which is a part of
+the `Anaconda <https://docs.continuum.io/anaconda/>`_ distribution. It is both cross-platform and
+language agnostic. In practice, Conda can replace both `pip <https://pip.pypa.io/en/latest/>`_ and
+`virtualenv <https://virtualenv.pypa.io/en/latest/>`_.
+
+Create a new virtual environment from your terminal as shown below:
 
 .. code-block:: bash
 
-    conda create -n pyspark_env 
+    conda create -n pyspark_env
 
-After the virtual environment is created, it should be visible under the list of Conda environments which can be seen using the following command:
+After the virtual environment is created, it should be visible under the list of Conda environments
+which can be seen using the following command:
 
 .. code-block:: bash
 
     conda env list
 
-The newly created environment can be accessed using the following command:
+Now activate the the newly created environment by the following command:

Review comment:
       "the the" -> the
   by -> with
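
The diff hunk above is cut off before the activation command itself; assuming the standard Conda workflow, it would be along these lines:

.. code-block:: bash

    # Activate the environment created above, then install PySpark into it.
    conda activate pyspark_env
    pip install pyspark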

##########
File path: python/docs/source/getting_started/install.rst
##########
@@ -19,71 +19,95 @@
 Installation
 ============
 
-Official releases are available from the `Apache Spark website <https://spark.apache.org/downloads.html>`_.
-Alternatively, you can install it via ``pip`` from PyPI.  PyPI installation is usually for standalone
-locally or as a client to connect to a cluster instead of setting a cluster up.  
+PySpark is included in the official releases of Spark available in the `Apache Spark website <https://spark.apache.org/downloads.html>`_.
+For Python users, PySpark also provides ``pip`` installation from PyPI. This is usually for standalone
+locally or as a client to connect to a cluster instead of setting a cluster itself up.
  
-This page includes the instructions for installing PySpark by using pip, Conda, downloading manually, and building it from the source.
+This page includes the instructions for installing PySpark by using pip, Conda, downloading manually,

Review comment:
       "includes instructions"
   "building from source"




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


