Posted to commits@spark.apache.org by we...@apache.org on 2022/01/05 05:01:30 UTC

[spark] branch branch-3.2 updated: [SPARK-30789][SQL][DOCS][FOLLOWUP] Add document for syntax `(IGNORE | RESPECT) NULLS`

This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
     new 9b6be1a  [SPARK-30789][SQL][DOCS][FOLLOWUP] Add document for syntax `(IGNORE | RESPECT) NULLS`
9b6be1a is described below

commit 9b6be1a6c004e50ffdf59f7fa1986adeb03e45cd
Author: Jiaan Geng <be...@163.com>
AuthorDate: Wed Jan 5 12:57:21 2022 +0800

    [SPARK-30789][SQL][DOCS][FOLLOWUP] Add document for syntax `(IGNORE | RESPECT) NULLS`
    
    ### What changes were proposed in this pull request?
    https://github.com/apache/spark/pull/30943 added support for the `(IGNORE | RESPECT) NULLS` syntax for `LEAD`/`LAG`/`NTH_VALUE`/`FIRST_VALUE`/`LAST_VALUE`, but did not update the documentation.
    Screenshot before this PR:
    ![screenshot-20211231-174803](https://user-images.githubusercontent.com/8486025/147816336-debca074-0b84-48e8-9ed2-cb13f562cf12.png)
    
    This PR adds documentation for the `(IGNORE | RESPECT) NULLS` syntax.
    
    Screenshots after this PR:
    ![image](https://user-images.githubusercontent.com/8486025/148141568-506e9232-a3c4-4a25-a5c6-65a5d5a2e066.png)
    
    ![image](https://user-images.githubusercontent.com/8486025/148061495-b7198417-9d4c-4c03-9060-385271ea9a46.png)
    
    ### Why are the changes needed?
    To document the `(IGNORE | RESPECT) NULLS` syntax.
    
    ### Does this PR introduce _any_ user-facing change?
    No. This only updates the docs.
    
    ### How was this patch tested?
    Manual check.
    
    Closes #35079 from beliefer/SPARK-30789-docs.
    
    Authored-by: Jiaan Geng <be...@163.com>
    Signed-off-by: Wenchen Fan <we...@databricks.com>
    (cherry picked from commit 93c614bf1e6aba092d82bcd8616b5ea31eb191a2)
    Signed-off-by: Wenchen Fan <we...@databricks.com>
---
 docs/sql-ref-syntax-qry-select-window.md | 37 ++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/docs/sql-ref-syntax-qry-select-window.md b/docs/sql-ref-syntax-qry-select-window.md
index a1c2b18..6e65778 100644
--- a/docs/sql-ref-syntax-qry-select-window.md
+++ b/docs/sql-ref-syntax-qry-select-window.md
@@ -26,7 +26,7 @@ Window functions operate on a group of rows, referred to as a window, and calcul
 ### Syntax
 
 ```sql
-window_function OVER
+window_function [ nulls_option ] OVER
 ( [  { PARTITION | DISTRIBUTE } BY partition_col_name = partition_col_val ( [ , ... ] ) ]
   { ORDER | SORT } BY expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ... ]
   [ window_frame ] )
@@ -42,7 +42,7 @@ window_function OVER
 
     * Analytic Functions
 
-      **Syntax:** `CUME_DIST | LAG | LEAD`
+      **Syntax:** `CUME_DIST | LAG | LEAD | NTH_VALUE | FIRST_VALUE | LAST_VALUE`
 
     * Aggregate Functions
 
@@ -50,6 +50,16 @@ window_function OVER
 
       Please refer to the [Built-in Aggregation Functions](sql-ref-functions-builtin.html#aggregate-functions) document for a complete list of Spark aggregate functions.
 
+* **nulls_option**
+
+    Specifies whether or not to skip null values when evaluating the window function. `RESPECT NULLS` means null values are not skipped, while `IGNORE NULLS` means they are skipped. If not specified, the default is `RESPECT NULLS`.
+
+    **Syntax:**
+
+    `{ IGNORE | RESPECT } NULLS`
+
+    **Note:** Only `LAG | LEAD | NTH_VALUE | FIRST_VALUE | LAST_VALUE` can be used with `IGNORE NULLS`.
+
 * **window_frame**
 
     Specifies which row to start the window on and where to end it.
@@ -184,6 +194,29 @@ SELECT name, salary,
 | Jane|  Marketing| 29000|29000|35000|
 | Jeff|  Marketing| 35000|29000|    0|
 +-----+-----------+------+-----+-----+
+
+SELECT id, v,
+    LEAD(v, 0) IGNORE NULLS OVER w lead,
+    LAG(v, 0) IGNORE NULLS OVER w lag,
+    NTH_VALUE(v, 2) IGNORE NULLS OVER w nth_value,
+    FIRST_VALUE(v) IGNORE NULLS OVER w first_value,
+    LAST_VALUE(v) IGNORE NULLS OVER w last_value
+    FROM test_ignore_null
+    WINDOW w AS (ORDER BY id)
+    ORDER BY id;
++--+----+----+----+---------+-----------+----------+
+|id|   v|lead| lag|nth_value|first_value|last_value|
++--+----+----+----+---------+-----------+----------+
+| 0|NULL|NULL|NULL|     NULL|       NULL|      NULL|
+| 1|   x|   x|   x|     NULL|          x|         x|
+| 2|NULL|NULL|NULL|     NULL|          x|         x|
+| 3|NULL|NULL|NULL|     NULL|          x|         x|
+| 4|   y|   y|   y|        y|          x|         y|
+| 5|NULL|NULL|NULL|        y|          x|         y|
+| 6|   z|   z|   z|        y|          x|         z|
+| 7|   v|   v|   v|        y|          x|         v|
+| 8|NULL|NULL|NULL|        y|          x|         v|
++--+----+----+----+---------+-----------+----------+
 ```
 
 ### Related Statements
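
The `nulls_option` semantics documented in this diff can be illustrated with a small model. The following is a minimal Python sketch, not Spark's implementation: the helper names (`first_value`, `last_value`, `running`) and the list `V` are hypothetical, and it assumes the window `w AS (ORDER BY id)` uses a frame running from the first row up to the current row. Under that assumption, the `IGNORE NULLS` results line up with the `first_value` and `last_value` columns in the example output of the diff.

```python
from typing import Callable, List, Optional, Sequence

Value = Optional[str]  # None stands in for SQL NULL


def first_value(frame: Sequence[Value], ignore_nulls: bool = False) -> Value:
    """Return the first value in the frame, optionally skipping nulls."""
    for v in frame:
        if v is not None or not ignore_nulls:
            return v
    return None  # empty frame, or only nulls with ignore_nulls=True


def last_value(frame: Sequence[Value], ignore_nulls: bool = False) -> Value:
    """Return the last value in the frame, optionally skipping nulls."""
    return first_value(list(reversed(frame)), ignore_nulls)


def running(fn: Callable[[Sequence[Value], bool], Value],
            values: Sequence[Value],
            ignore_nulls: bool) -> List[Value]:
    """Evaluate fn for each row over the frame [first row .. current row]."""
    return [fn(values[: i + 1], ignore_nulls) for i in range(len(values))]


# Column v from the test_ignore_null example, ordered by id (assumed data).
V: List[Value] = [None, "x", None, None, "y", None, "z", "v", None]

print(running(first_value, V, True))   # IGNORE NULLS
print(running(last_value, V, True))    # IGNORE NULLS
print(running(first_value, V, False))  # RESPECT NULLS (the default)
print(running(last_value, V, False))   # RESPECT NULLS (the default)
```

With `ignore_nulls=False` the model returns whatever sits at the frame boundary, including nulls, which mirrors the `RESPECT NULLS` default described above.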
