Posted to commits@spark.apache.org by ru...@apache.org on 2023/09/20 07:28:11 UTC
[spark] branch master updated: [SPARK-45226][PYTHON][DOCS] Refine docstring of `rand/randn`
This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new ad18393d2f1e [SPARK-45226][PYTHON][DOCS] Refine docstring of `rand/randn`
ad18393d2f1e is described below
commit ad18393d2f1e411e7898cea14af550360dfc8670
Author: panbingkun <pb...@gmail.com>
AuthorDate: Wed Sep 20 15:27:46 2023 +0800
[SPARK-45226][PYTHON][DOCS] Refine docstring of `rand/randn`
### What changes were proposed in this pull request?
This PR refines the docstrings of `rand` and `randn`.
### Why are the changes needed?
- The examples lacked a call without a seed. This PR adds one and skips it in the `doctest`, since its output differs on every run.
- To improve the PySpark documentation.
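
The skip-when-unseeded pattern described above can be illustrated without a Spark session. The following is only a minimal sketch using Python's standard `doctest` and `random` modules. The `sample` function and the `run_docstring_examples` helper are hypothetical stand-ins and not part of PySpark. The unseeded example is marked `# doctest: +SKIP` because its output changes on every run, while the seeded example stays checkable.

```python
import doctest
import random
from typing import Optional, Tuple


def sample(seed: Optional[int] = None) -> float:
    """Return one pseudo-random float in [0, 1). Hypothetical stand-in for `rand`.

    Example 1: Generate a value without a seed (output varies, so it is skipped)

    >>> sample()  # doctest: +SKIP
    0.14879325244215424

    Example 2: Generate a value with a specific seed (reproducible)

    >>> sample(seed=42) == sample(seed=42)
    True
    """
    # A fresh generator per call keeps seeded results independent of global state.
    return random.Random(seed).random()


def run_docstring_examples() -> Tuple[int, int]:
    """Run the doctests in `sample`; return (failures, examples attempted)."""
    runner = doctest.DocTestRunner(verbose=False)
    finder = doctest.DocTestFinder()
    for test in finder.find(sample, name="sample", globs={"sample": sample}):
        runner.run(test)
    # Skipped examples are not counted as attempted.
    return runner.failures, runner.tries


if __name__ == "__main__":
    failures, tries = run_docstring_examples()
    print(f"failures={failures}, attempted={tries}")
```

Only the seeded example is executed here. The unseeded one remains in the docstring as readable documentation, which is the same trade-off this PR makes for `rand` and `randn`.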
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
- Manual testing.
- Passes GitHub Actions (GA).
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #43003 from panbingkun/SPARK-45226.
Authored-by: panbingkun <pb...@gmail.com>
Signed-off-by: Ruifeng Zheng <ru...@apache.org>
---
python/pyspark/sql/functions.py | 36 ++++++++++++++++++++++++++++++------
1 file changed, 30 insertions(+), 6 deletions(-)
diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py
index 5474873df7b2..83049124bdb2 100644
--- a/python/pyspark/sql/functions.py
+++ b/python/pyspark/sql/functions.py
@@ -5243,16 +5243,28 @@ def rand(seed: Optional[int] = None) -> Column:
Parameters
----------
seed : int (default: None)
- seed value for random generator.
+ Seed value for the random generator.
Returns
-------
:class:`~pyspark.sql.Column`
- random values.
+ A column of random values.
Examples
--------
+ Example 1: Generate a random column without a seed
+
>>> from pyspark.sql import functions as sf
+ >>> spark.range(0, 2, 1, 1).withColumn('rand', sf.rand()).show() # doctest: +SKIP
+ +---+-------------------+
+ | id| rand|
+ +---+-------------------+
+ | 0|0.14879325244215424|
+ | 1| 0.4640631044275454|
+ +---+-------------------+
+
+ Example 2: Generate a random column with a specific seed
+
>>> spark.range(0, 2, 1, 1).withColumn('rand', sf.rand(seed=42) * 3).show()
+---+------------------+
| id| rand|
@@ -5269,8 +5281,8 @@ def rand(seed: Optional[int] = None) -> Column:
@_try_remote_functions
def randn(seed: Optional[int] = None) -> Column:
- """Generates a column with independent and identically distributed (i.i.d.) samples from
- the standard normal distribution.
+ """Generates a random column with independent and identically distributed (i.i.d.) samples
+ from the standard normal distribution.
.. versionadded:: 1.4.0
@@ -5284,16 +5296,28 @@ def randn(seed: Optional[int] = None) -> Column:
Parameters
----------
seed : int (default: None)
- seed value for random generator.
+ Seed value for the random generator.
Returns
-------
:class:`~pyspark.sql.Column`
- random values.
+ A column of random values.
Examples
--------
+ Example 1: Generate a random column without a seed
+
>>> from pyspark.sql import functions as sf
+ >>> spark.range(0, 2, 1, 1).withColumn('randn', sf.randn()).show() # doctest: +SKIP
+ +---+--------------------+
+ | id| randn|
+ +---+--------------------+
+ | 0|-0.45011372342934214|
+ | 1| 0.6567304165329736|
+ +---+--------------------+
+
+ Example 2: Generate a random column with a specific seed
+
>>> spark.range(0, 2, 1, 1).withColumn('randn', sf.randn(seed=42)).show()
+---+------------------+
| id| randn|