Posted to commits@spark.apache.org by gu...@apache.org on 2021/01/08 00:29:23 UTC
[spark] branch branch-3.1 updated: [SPARK-34041][PYTHON][DOCS] Miscellaneous cleanup for new PySpark documentation
This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.1 by this push:
new bfb42d4 [SPARK-34041][PYTHON][DOCS] Miscellaneous cleanup for new PySpark documentation
bfb42d4 is described below
commit bfb42d4f14db66b13d6a3791bec09f0bd8b397bc
Author: HyukjinKwon <gu...@apache.org>
AuthorDate: Fri Jan 8 09:28:31 2021 +0900
[SPARK-34041][PYTHON][DOCS] Miscellaneous cleanup for new PySpark documentation
### What changes were proposed in this pull request?
This PR proposes to:
- Add a link to the PySpark quick start under "Programming Guides" in the main Spark docs
- `ML` / `MLlib` -> `MLlib (DataFrame-based)` / `MLlib (RDD-based)` in API reference page
- Mention other user guides as well, because guides such as [ML](http://spark.apache.org/docs/latest/ml-guide.html) and [SQL](http://spark.apache.org/docs/latest/sql-programming-guide.html) are shared with other languages.
- Mention other migration guides as well, because PySpark can be affected by them.
### Why are the changes needed?
For better documentation.
### Does this PR introduce _any_ user-facing change?
It fixes user-facing docs. However, they have not been released yet.
### How was this patch tested?
Manually tested by running:
```bash
cd docs
SKIP_SCALADOC=1 SKIP_RDOC=1 SKIP_SQLDOC=1 jekyll serve --watch
```
Closes #31082 from HyukjinKwon/SPARK-34041.
Authored-by: HyukjinKwon <gu...@apache.org>
Signed-off-by: HyukjinKwon <gu...@apache.org>
(cherry picked from commit aa388cf3d0ff230eb0397876fe2db03bbe51658e)
Signed-off-by: HyukjinKwon <gu...@apache.org>
---
docs/_layouts/global.html | 1 +
docs/index.md | 2 ++
python/docs/source/getting_started/index.rst | 3 +++
python/docs/source/migration_guide/index.rst | 12 ++++++++++--
python/docs/source/reference/pyspark.ml.rst | 12 ++++++------
python/docs/source/reference/pyspark.mllib.rst | 4 ++--
python/docs/source/user_guide/index.rst | 12 ++++++++++++
7 files changed, 36 insertions(+), 10 deletions(-)
diff --git a/docs/_layouts/global.html b/docs/_layouts/global.html
index de98f29..f10d467 100755
--- a/docs/_layouts/global.html
+++ b/docs/_layouts/global.html
@@ -84,6 +84,7 @@
<a class="dropdown-item" href="ml-guide.html">MLlib (Machine Learning)</a>
<a class="dropdown-item" href="graphx-programming-guide.html">GraphX (Graph Processing)</a>
<a class="dropdown-item" href="sparkr.html">SparkR (R on Spark)</a>
+ <a class="dropdown-item" href="api/python/getting_started/index.html">PySpark (Python on Spark)</a>
</div>
</li>
diff --git a/docs/index.md b/docs/index.md
index 8fd169e..c4c2d72 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -113,6 +113,8 @@ options for deployment:
* [Spark Streaming](streaming-programming-guide.html): processing data streams using DStreams (old API)
* [MLlib](ml-guide.html): applying machine learning algorithms
* [GraphX](graphx-programming-guide.html): processing graphs
+* [SparkR](sparkr.html): processing data with Spark in R
+* [PySpark](api/python/getting_started/index.html): processing data with Spark in Python
**API Docs:**
diff --git a/python/docs/source/getting_started/index.rst b/python/docs/source/getting_started/index.rst
index 9fa3352..38b9c93 100644
--- a/python/docs/source/getting_started/index.rst
+++ b/python/docs/source/getting_started/index.rst
@@ -21,6 +21,9 @@ Getting Started
===============
This page summarizes the basic steps required to setup and get started with PySpark.
+There are more guides shared with other languages such as
+`Quick Start <http://spark.apache.org/docs/latest/quick-start.html>`_ in Programming Guides
+at `the Spark documentation <http://spark.apache.org/docs/latest/index.html#where-to-go-from-here>`_.
.. toctree::
:maxdepth: 2
diff --git a/python/docs/source/migration_guide/index.rst b/python/docs/source/migration_guide/index.rst
index 41e36b1..88e768d 100644
--- a/python/docs/source/migration_guide/index.rst
+++ b/python/docs/source/migration_guide/index.rst
@@ -21,8 +21,6 @@ Migration Guide
===============
This page describes the migration guide specific to PySpark.
-Many items of other migration guides can also be applied when migrating PySpark to higher versions because PySpark internally shares other components.
-Please also refer other migration guides such as `Migration Guide: SQL, Datasets and DataFrame <http://spark.apache.org/docs/latest/sql-migration-guide.html>`_.
.. toctree::
:maxdepth: 2
@@ -33,3 +31,13 @@ Please also refer other migration guides such as `Migration Guide: SQL, Datasets
pyspark_2.2_to_2.3
pyspark_1.4_to_1.5
pyspark_1.0_1.2_to_1.3
+
+
+Many items of other migration guides can also be applied when migrating PySpark to higher versions because PySpark internally shares other components.
+Please also refer other migration guides:
+
+- `Migration Guide: Spark Core <http://spark.apache.org/docs/latest/core-migration-guide.html>`_
+- `Migration Guide: SQL, Datasets and DataFrame <http://spark.apache.org/docs/latest/sql-migration-guide.html>`_
+- `Migration Guide: Structured Streaming <http://spark.apache.org/docs/latest/ss-migration-guide.html>`_
+- `Migration Guide: MLlib (Machine Learning) <http://spark.apache.org/docs/latest/ml-migration-guide.html>`_
+
diff --git a/python/docs/source/reference/pyspark.ml.rst b/python/docs/source/reference/pyspark.ml.rst
index 2de0ff6..cc90459 100644
--- a/python/docs/source/reference/pyspark.ml.rst
+++ b/python/docs/source/reference/pyspark.ml.rst
@@ -16,11 +16,11 @@
under the License.
-ML
-==
+MLlib (DataFrame-based)
+=======================
-ML Pipeline APIs
-----------------
+Pipeline APIs
+-------------
.. currentmodule:: pyspark.ml
@@ -188,8 +188,8 @@ Clustering
PowerIterationClustering
-ML Functions
-----------------------------
+Functions
+---------
.. currentmodule:: pyspark.ml.functions
diff --git a/python/docs/source/reference/pyspark.mllib.rst b/python/docs/source/reference/pyspark.mllib.rst
index df5ea01..12fc479 100644
--- a/python/docs/source/reference/pyspark.mllib.rst
+++ b/python/docs/source/reference/pyspark.mllib.rst
@@ -16,8 +16,8 @@
under the License.
-MLlib
-=====
+MLlib (RDD-based)
+=================
Classification
--------------
diff --git a/python/docs/source/user_guide/index.rst b/python/docs/source/user_guide/index.rst
index 3e535ce..704156b 100644
--- a/python/docs/source/user_guide/index.rst
+++ b/python/docs/source/user_guide/index.rst
@@ -20,9 +20,21 @@
User Guide
==========
+This page is the guide for PySpark users which contains PySpark specific topics.
+
.. toctree::
:maxdepth: 2
arrow_pandas
python_packaging
+
+There are more guides shared with other languages in Programming Guides
+at `the Spark documentation <http://spark.apache.org/docs/latest/index.html#where-to-go-from-here>`_.
+
+- `RDD Programming Guide <http://spark.apache.org/docs/latest/rdd-programming-guide.html>`_
+- `Spark SQL, DataFrames and Datasets Guide <http://spark.apache.org/docs/latest/sql-programming-guide.html>`_
+- `Structured Streaming Programming Guide <http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html>`_
+- `Spark Streaming Programming Guide <http://spark.apache.org/docs/latest/streaming-programming-guide.html>`_
+- `Machine Learning Library (MLlib) Guide <http://spark.apache.org/docs/latest/ml-guide.html>`_
+