Posted to commits@spark.apache.org by gu...@apache.org on 2020/03/31 06:18:05 UTC
[spark] branch branch-3.0 updated: [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 1caca7d  [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound
1caca7d is described below

commit 1caca7d97a03ab9ac99597e1ef9fa3890da90743
Author: Ben Ryves <be...@getyourguide.com>
AuthorDate: Tue Mar 31 15:16:17 2020 +0900

    [SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound
    
    ### What changes were proposed in this pull request?
    A small documentation change to clarify that the `rand()` function produces values in `[0.0, 1.0)`.
    
    ### Why are the changes needed?
    `rand()` uses `Rand()`, which generates values in `[0, 1)` ([documented here](https://github.com/apache/spark/blob/a1dbcd13a3eeaee50cc1a46e909f9478d6d55177/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala#L71)). The existing documentation suggests that 1.0 is a possible return value of `rand()`: for a distribution written as `X ~ U(a, b)`, x can be a or b, so `U[0.0, 1.0]` implies that the returned value could be 1.0.
    
    ### Does this PR introduce any user-facing change?
    Only documentation changes.
    
    ### How was this patch tested?
    Documentation changes only.
    
    Closes #28071 from Smeb/master.
    
    Authored-by: Ben Ryves <be...@getyourguide.com>
    Signed-off-by: HyukjinKwon <gu...@apache.org>
    (cherry picked from commit fa378567105ec9d9bbe30edf4b74b09c3df27658)
    Signed-off-by: HyukjinKwon <gu...@apache.org>
---
 R/pkg/R/functions.R                                          | 2 +-
 python/pyspark/sql/functions.py                              | 2 +-
 sql/core/src/main/scala/org/apache/spark/sql/functions.scala | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/R/pkg/R/functions.R b/R/pkg/R/functions.R
index d8b0450..173dbc4 100644
--- a/R/pkg/R/functions.R
+++ b/R/pkg/R/functions.R
@@ -2888,7 +2888,7 @@ setMethod("lpad", signature(x = "Column", len = "numeric", pad = "character"),
 
 #' @details
 #' \code{rand}: Generates a random column with independent and identically distributed (i.i.d.)
-#' samples from U[0.0, 1.0].
+#' samples uniformly distributed in [0.0, 1.0).
 #' Note: the function is non-deterministic in general case.
 #'
 #' @rdname column_nonaggregate_functions
diff --git a/python/pyspark/sql/functions.py b/python/pyspark/sql/functions.py
index 1ade21c..476aab4 100644
--- a/python/pyspark/sql/functions.py
+++ b/python/pyspark/sql/functions.py
@@ -599,7 +599,7 @@ def nanvl(col1, col2):
 @since(1.4)
 def rand(seed=None):
     """Generates a random column with independent and identically distributed (i.i.d.) samples
-    from U[0.0, 1.0].
+    uniformly distributed in [0.0, 1.0).
 
     .. note:: The function is non-deterministic in general case.
 
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
index 8a89a3b..fd4e77f 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
@@ -1204,7 +1204,7 @@ object functions {
 
   /**
    * Generate a random column with independent and identically distributed (i.i.d.) samples
-   * from U[0.0, 1.0].
+   * uniformly distributed in [0.0, 1.0).
    *
    * @note The function is non-deterministic in general case.
    *
@@ -1215,7 +1215,7 @@ object functions {
 
   /**
    * Generate a random column with independent and identically distributed (i.i.d.) samples
-   * from U[0.0, 1.0].
+   * uniformly distributed in [0.0, 1.0).
    *
    * @note The function is non-deterministic in general case.
    *
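
The difference between `[0.0, 1.0]` and `[0.0, 1.0)` matters whenever a sample is scaled into an index. The following is a minimal sketch, not part of the commit: it uses Python's standard-library `random.random()`, which is documented with the same half-open range, as a stand-in for Spark's `rand()`. The helper `bucket_index` and the bucket count of 4 are hypothetical names and values chosen for illustration.

```python
import random


def bucket_index(u: float, num_buckets: int) -> int:
    """Map a sample u in [0.0, 1.0) to a bucket index in 0..num_buckets-1.

    Hypothetical helper for illustration only; not part of Spark.
    """
    if not 0.0 <= u < 1.0:
        # With an inclusive upper bound, u == 1.0 would map to index
        # num_buckets, which is one past the last valid bucket.
        raise ValueError(f"sample {u!r} is outside [0.0, 1.0)")
    return int(u * num_buckets)


# Stand-in for rand(): random.random() also returns values in [0.0, 1.0).
rng = random.Random(42)
samples = [rng.random() for _ in range(10_000)]

# Every sample lands in a valid bucket because 1.0 is never produced.
indices = [bucket_index(u, 4) for u in samples]
```

If 1.0 were a possible sample, `bucket_index(1.0, 4)` would have to return 4, which is out of range for four buckets; the half-open interval rules that case out.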

