Posted to commits@spark.apache.org by gu...@apache.org on 2021/01/08 00:29:23 UTC

[spark] branch branch-3.1 updated: [SPARK-34041][PYTHON][DOCS] Miscellaneous cleanup for new PySpark documentation

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new bfb42d4  [SPARK-34041][PYTHON][DOCS] Miscellaneous cleanup for new PySpark documentation
bfb42d4 is described below

commit bfb42d4f14db66b13d6a3791bec09f0bd8b397bc
Author: HyukjinKwon <gu...@apache.org>
AuthorDate: Fri Jan 8 09:28:31 2021 +0900

    [SPARK-34041][PYTHON][DOCS] Miscellaneous cleanup for new PySpark documentation
    
    ### What changes were proposed in this pull request?
    
    This PR proposes to:
    - Add a link to the PySpark quick start under "Programming Guides" in the main Spark docs
    - Rename `ML` / `MLlib` to `MLlib (DataFrame-based)` / `MLlib (RDD-based)` in the API reference page
    - Mention other user guides as well, because guides such as [ML](http://spark.apache.org/docs/latest/ml-guide.html) and [SQL](http://spark.apache.org/docs/latest/sql-programming-guide.html) also apply to PySpark.
    - Mention other migration guides as well, because PySpark can be affected by them.
    
    ### Why are the changes needed?
    
    For better documentation.
    
    ### Does this PR introduce _any_ user-facing change?
    
    It fixes user-facing docs. However, they have not been released yet.
    
    ### How was this patch tested?
    
    Manually tested by running:
    
    ```bash
    cd docs
    SKIP_SCALADOC=1 SKIP_RDOC=1 SKIP_SQLDOC=1 jekyll serve --watch
    ```
    
    Closes #31082 from HyukjinKwon/SPARK-34041.
    
    Authored-by: HyukjinKwon <gu...@apache.org>
    Signed-off-by: HyukjinKwon <gu...@apache.org>
    (cherry picked from commit aa388cf3d0ff230eb0397876fe2db03bbe51658e)
    Signed-off-by: HyukjinKwon <gu...@apache.org>
---
 docs/_layouts/global.html                      |  1 +
 docs/index.md                                  |  2 ++
 python/docs/source/getting_started/index.rst   |  3 +++
 python/docs/source/migration_guide/index.rst   | 12 ++++++++++--
 python/docs/source/reference/pyspark.ml.rst    | 12 ++++++------
 python/docs/source/reference/pyspark.mllib.rst |  4 ++--
 python/docs/source/user_guide/index.rst        | 12 ++++++++++++
 7 files changed, 36 insertions(+), 10 deletions(-)

diff --git a/docs/_layouts/global.html b/docs/_layouts/global.html
index de98f29..f10d467 100755
--- a/docs/_layouts/global.html
+++ b/docs/_layouts/global.html
@@ -84,6 +84,7 @@
                                 <a class="dropdown-item" href="ml-guide.html">MLlib (Machine Learning)</a>
                                 <a class="dropdown-item" href="graphx-programming-guide.html">GraphX (Graph Processing)</a>
                                 <a class="dropdown-item" href="sparkr.html">SparkR (R on Spark)</a>
+                                <a class="dropdown-item" href="api/python/getting_started/index.html">PySpark (Python on Spark)</a>
                             </div>
                         </li>
 
diff --git a/docs/index.md b/docs/index.md
index 8fd169e..c4c2d72 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -113,6 +113,8 @@ options for deployment:
 * [Spark Streaming](streaming-programming-guide.html): processing data streams using DStreams (old API)
 * [MLlib](ml-guide.html): applying machine learning algorithms
 * [GraphX](graphx-programming-guide.html): processing graphs 
+* [SparkR](sparkr.html): processing data with Spark in R
+* [PySpark](api/python/getting_started/index.html): processing data with Spark in Python
 
 **API Docs:**
 
diff --git a/python/docs/source/getting_started/index.rst b/python/docs/source/getting_started/index.rst
index 9fa3352..38b9c93 100644
--- a/python/docs/source/getting_started/index.rst
+++ b/python/docs/source/getting_started/index.rst
@@ -21,6 +21,9 @@ Getting Started
 ===============
 
 This page summarizes the basic steps required to setup and get started with PySpark.
+There are more guides shared with other languages such as
+`Quick Start <http://spark.apache.org/docs/latest/quick-start.html>`_ in Programming Guides
+at `the Spark documentation <http://spark.apache.org/docs/latest/index.html#where-to-go-from-here>`_.
 
 .. toctree::
     :maxdepth: 2
diff --git a/python/docs/source/migration_guide/index.rst b/python/docs/source/migration_guide/index.rst
index 41e36b1..88e768d 100644
--- a/python/docs/source/migration_guide/index.rst
+++ b/python/docs/source/migration_guide/index.rst
@@ -21,8 +21,6 @@ Migration Guide
 ===============
 
 This page describes the migration guide specific to PySpark.
-Many items of other migration guides can also be applied when migrating PySpark to higher versions because PySpark internally shares other components.
-Please also refer other migration guides such as `Migration Guide: SQL, Datasets and DataFrame <http://spark.apache.org/docs/latest/sql-migration-guide.html>`_.
 
 .. toctree::
    :maxdepth: 2
@@ -33,3 +31,13 @@ Please also refer other migration guides such as `Migration Guide: SQL, Datasets
    pyspark_2.2_to_2.3
    pyspark_1.4_to_1.5
    pyspark_1.0_1.2_to_1.3
+
+
+Many items of other migration guides can also be applied when migrating PySpark to higher versions because PySpark internally shares other components.
+Please also refer other migration guides:
+
+- `Migration Guide: Spark Core <http://spark.apache.org/docs/latest/core-migration-guide.html>`_
+- `Migration Guide: SQL, Datasets and DataFrame <http://spark.apache.org/docs/latest/sql-migration-guide.html>`_
+- `Migration Guide: Structured Streaming <http://spark.apache.org/docs/latest/ss-migration-guide.html>`_
+- `Migration Guide: MLlib (Machine Learning) <http://spark.apache.org/docs/latest/ml-migration-guide.html>`_
+
diff --git a/python/docs/source/reference/pyspark.ml.rst b/python/docs/source/reference/pyspark.ml.rst
index 2de0ff6..cc90459 100644
--- a/python/docs/source/reference/pyspark.ml.rst
+++ b/python/docs/source/reference/pyspark.ml.rst
@@ -16,11 +16,11 @@
     under the License.
 
 
-ML
-==
+MLlib (DataFrame-based)
+=======================
 
-ML Pipeline APIs
-----------------
+Pipeline APIs
+-------------
 
 .. currentmodule:: pyspark.ml
 
@@ -188,8 +188,8 @@ Clustering
     PowerIterationClustering
 
 
-ML Functions
-----------------------------
+Functions
+---------
 
 .. currentmodule:: pyspark.ml.functions
 
diff --git a/python/docs/source/reference/pyspark.mllib.rst b/python/docs/source/reference/pyspark.mllib.rst
index df5ea01..12fc479 100644
--- a/python/docs/source/reference/pyspark.mllib.rst
+++ b/python/docs/source/reference/pyspark.mllib.rst
@@ -16,8 +16,8 @@
     under the License.
 
 
-MLlib
-=====
+MLlib (RDD-based)
+=================
 
 Classification
 --------------
diff --git a/python/docs/source/user_guide/index.rst b/python/docs/source/user_guide/index.rst
index 3e535ce..704156b 100644
--- a/python/docs/source/user_guide/index.rst
+++ b/python/docs/source/user_guide/index.rst
@@ -20,9 +20,21 @@
 User Guide
 ==========
 
+This page is the guide for PySpark users which contains PySpark specific topics.
+
 .. toctree::
     :maxdepth: 2
 
     arrow_pandas
     python_packaging
 
+
+There are more guides shared with other languages in Programming Guides
+at `the Spark documentation <http://spark.apache.org/docs/latest/index.html#where-to-go-from-here>`_.
+
+- `RDD Programming Guide <http://spark.apache.org/docs/latest/rdd-programming-guide.html>`_
+- `Spark SQL, DataFrames and Datasets Guide <http://spark.apache.org/docs/latest/sql-programming-guide.html>`_
+- `Structured Streaming Programming Guide <http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html>`_
+- `Spark Streaming Programming Guide <http://spark.apache.org/docs/latest/streaming-programming-guide.html>`_
+- `Machine Learning Library (MLlib) Guide <http://spark.apache.org/docs/latest/ml-guide.html>`_
+

