Posted to commits@spark.apache.org by gu...@apache.org on 2023/02/15 11:36:41 UTC

[spark] branch branch-3.4 updated: [SPARK-42446][DOCS][PYTHON] Updating PySpark documentation to enhance usability

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
     new 559dee3095e [SPARK-42446][DOCS][PYTHON] Updating PySpark documentation to enhance usability
559dee3095e is described below

commit 559dee3095eee412ba3e6561cb4d9a9094559b35
Author: Allan Folting <al...@databricks.com>
AuthorDate: Wed Feb 15 20:36:15 2023 +0900

    [SPARK-42446][DOCS][PYTHON] Updating PySpark documentation to enhance usability
    
    ### What changes were proposed in this pull request?
    Updates to the PySpark documentation website:
    - Fixing typo on the Getting Started page (Version => Versions)
    - Capitalizing "In/Out" in the DataFrame Quick Start notebook
    - Adding "(Legacy)" to the Spark Streaming heading on the Spark Streaming page
    - Reorganizing the User Guide page to list PySpark guides first, making minor language updates, and removing links to the legacy streaming and RDD programming guides so they are less prominent and the recommended APIs stay in focus
    
    ### Why are the changes needed?
    To improve usability of the PySpark doc website by adding guidance (calling out legacy APIs), fixing a few language issues, and making PySpark content more prominent.
    
    ### Does this PR introduce _any_ user-facing change?
    Yes, the user-facing PySpark documentation is updated.
    
    ### How was this patch tested?
    Built and manually reviewed/tested the PySpark documentation website locally.
    
    Closes #40032 from allanf-db/pyspark_docs.
    
    Authored-by: Allan Folting <al...@databricks.com>
    Signed-off-by: Hyukjin Kwon <gu...@apache.org>
    (cherry picked from commit 00dc3d8533e518cea1bdd3cf9439e4ef0a14d600)
    Signed-off-by: Hyukjin Kwon <gu...@apache.org>
---
 python/docs/source/getting_started/install.rst     |  4 ++--
 .../source/getting_started/quickstart_df.ipynb     |  4 ++--
 python/docs/source/reference/pyspark.streaming.rst |  6 +++---
 python/docs/source/user_guide/index.rst            | 25 ++++++++++------------
 4 files changed, 18 insertions(+), 21 deletions(-)

diff --git a/python/docs/source/getting_started/install.rst b/python/docs/source/getting_started/install.rst
index eddee8e30e1..be2a1eae66d 100644
--- a/python/docs/source/getting_started/install.rst
+++ b/python/docs/source/getting_started/install.rst
@@ -27,8 +27,8 @@ This page includes instructions for installing PySpark by using pip, Conda, down
 and building from the source.
 
 
-Python Version Supported
-------------------------
+Python Versions Supported
+-------------------------
 
 Python 3.7 and above.
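
The underline in the hunk above gains a dash because reStructuredText requires a section underline to be at least as long as its title text ("Python Versions Supported" is 25 characters, so 24 dashes no longer suffice). A minimal sketch of that length rule, with an illustrative helper name that is not part of Spark's build tooling:

```python
# Validate a reST section underline against its title.
# Characters drawn from docutils' recommended adornment set.
UNDERLINE_CHARS = set("=-`:'\"~^_*+#<>")

def underline_ok(title: str, underline: str) -> bool:
    """True if `underline` is a valid reST underline for `title`:
    one repeated adornment character, at least as long as the title."""
    return (
        len(underline) >= len(title)
        and len(set(underline)) == 1
        and underline[0] in UNDERLINE_CHARS
    )

# The fix in this commit, in miniature:
print(underline_ok("Python Versions Supported", "-" * 24))  # too short
print(underline_ok("Python Versions Supported", "-" * 25))  # long enough
```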
 
diff --git a/python/docs/source/getting_started/quickstart_df.ipynb b/python/docs/source/getting_started/quickstart_df.ipynb
index ae0ff37c452..af0e5ee5052 100644
--- a/python/docs/source/getting_started/quickstart_df.ipynb
+++ b/python/docs/source/getting_started/quickstart_df.ipynb
@@ -914,7 +914,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Getting Data in/out\n",
+    "## Getting Data In/Out\n",
     "\n",
     "CSV is straightforward and easy to use. Parquet and ORC are efficient and compact file formats to read and write faster.\n",
     "\n",
@@ -1174,4 +1174,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 1
-}
\ No newline at end of file
+}
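
The `\ No newline at end of file` marker disappears above because the commit appends a final newline to the notebook JSON. A minimal sketch of how such a cleanup could be automated (the helper is hypothetical, not part of Spark's tooling):

```python
def ensure_trailing_newline(text: str) -> str:
    """Append a final newline if non-empty file content lacks one
    (the POSIX convention that git's diff marker flags)."""
    if text and not text.endswith("\n"):
        return text + "\n"
    return text

# A file ending in "}" with no newline gets one appended;
# already-terminated content is returned unchanged.
fixed = ensure_trailing_newline('{\n "nbformat": 4\n}')
print(fixed.endswith("}\n"))
```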
diff --git a/python/docs/source/reference/pyspark.streaming.rst b/python/docs/source/reference/pyspark.streaming.rst
index 57cbd00b67e..b8bf7e3c016 100644
--- a/python/docs/source/reference/pyspark.streaming.rst
+++ b/python/docs/source/reference/pyspark.streaming.rst
@@ -16,9 +16,9 @@
     under the License.
 
 
-===============
-Spark Streaming
-===============
+========================
+Spark Streaming (Legacy)
+========================
 
 Core Classes
 ------------
diff --git a/python/docs/source/user_guide/index.rst b/python/docs/source/user_guide/index.rst
index 5cc8bc3d38e..67f8c8d4d0f 100644
--- a/python/docs/source/user_guide/index.rst
+++ b/python/docs/source/user_guide/index.rst
@@ -16,21 +16,12 @@
     under the License.
 
 
-==========
-User Guide
-==========
-
-There are basic guides shared with other languages in Programming Guides
-at `the Spark documentation <https://spark.apache.org/docs/latest/index.html#where-to-go-from-here>`_ as below:
-
-- `RDD Programming Guide <https://spark.apache.org/docs/latest/rdd-programming-guide.html>`_
-- `Spark SQL, DataFrames and Datasets Guide <https://spark.apache.org/docs/latest/sql-programming-guide.html>`_
-- `Structured Streaming Programming Guide <https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html>`_
-- `Spark Streaming Programming Guide <https://spark.apache.org/docs/latest/streaming-programming-guide.html>`_
-- `Machine Learning Library (MLlib) Guide <https://spark.apache.org/docs/latest/ml-guide.html>`_
-
-PySpark specific user guide is as follows:
+===========
+User Guides
+===========
 
+PySpark specific user guides are available here:
+ 
 .. toctree::
    :maxdepth: 2
 
@@ -38,3 +29,9 @@ PySpark specific user guide is as follows:
    sql/index
    pandas_on_spark/index
 
+There are also basic programming guides covering multiple languages available in
+`the Spark documentation <https://spark.apache.org/docs/latest/index.html#where-to-go-from-here>`_, including these:
+
+- `Spark SQL, DataFrames and Datasets Guide <https://spark.apache.org/docs/latest/sql-programming-guide.html>`_
+- `Structured Streaming Programming Guide <https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html>`_
+- `Machine Learning Library (MLlib) Guide <https://spark.apache.org/docs/latest/ml-guide.html>`_

