Posted to commits@spark.apache.org by sr...@apache.org on 2020/04/10 00:43:33 UTC

[spark] branch master updated: [SPARK-31355][SQL][DOCS] Document TABLESAMPLE in SQL Reference

This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new f69b0ef  [SPARK-31355][SQL][DOCS] Document TABLESAMPLE in SQL Reference
f69b0ef is described below

commit f69b0ef25df744f83c57c50735e8db9f6802e98c
Author: Huaxin Gao <hu...@us.ibm.com>
AuthorDate: Thu Apr 9 19:39:34 2020 -0500

    [SPARK-31355][SQL][DOCS] Document TABLESAMPLE in SQL Reference
    
    ### What changes were proposed in this pull request?
    Document TABLESAMPLE in SQL Reference
    
    ### Why are the changes needed?
    To make SQL Reference complete
    
    ### Does this PR introduce any user-facing change?
    Yes
    
    <img width="1049" alt="Screen Shot 2020-04-06 at 10 23 52 PM" src="https://user-images.githubusercontent.com/13592258/78633123-96749f00-7855-11ea-9509-b7ee21da7fbd.png">
    
    <img width="1050" alt="Screen Shot 2020-04-06 at 10 24 26 PM" src="https://user-images.githubusercontent.com/13592258/78633130-98d6f900-7855-11ea-8675-fd4b6163dfb6.png">
    
    ### How was this patch tested?
    Manually built and checked.
    
    Closes #28130 from huaxingao/sampling.
    
    Authored-by: Huaxin Gao <hu...@us.ibm.com>
    Signed-off-by: Sean Owen <sr...@gmail.com>
---
 docs/_data/menu-sql.yaml            |  2 +
 docs/sql-ref-syntax-qry-sampling.md | 74 ++++++++++++++++++++++++++++++++++++-
 docs/sql-ref-syntax-qry-select.md   |  1 +
 3 files changed, 76 insertions(+), 1 deletion(-)

diff --git a/docs/_data/menu-sql.yaml b/docs/_data/menu-sql.yaml
index 6f300e2..225450b 100644
--- a/docs/_data/menu-sql.yaml
+++ b/docs/_data/menu-sql.yaml
@@ -158,6 +158,8 @@
                   url: sql-ref-syntax-qry-select-hints.html
                 - text: Set Operators
                   url: sql-ref-syntax-qry-select-setops.html
+                - text: TABLESAMPLE
+                  url: sql-ref-syntax-qry-sampling.html
             - text: EXPLAIN
               url: sql-ref-syntax-qry-explain.html
         - text: Auxiliary Statements
diff --git a/docs/sql-ref-syntax-qry-sampling.md b/docs/sql-ref-syntax-qry-sampling.md
index a5efb36..061f21c 100644
--- a/docs/sql-ref-syntax-qry-sampling.md
+++ b/docs/sql-ref-syntax-qry-sampling.md
@@ -19,4 +19,76 @@ license: |
   limitations under the License.
 ---
 
-**This page is under construction**
+### Description
+
+The `TABLESAMPLE` statement is used to sample the table. It supports the following sampling methods:
+  * `TABLESAMPLE`(x `ROWS`): Sample the table down to the given number of rows.
+  * `TABLESAMPLE`(x `PERCENT`): Sample the table down to the given percentage. Note that percentages are defined as a number between 0 and 100.
+  * `TABLESAMPLE`(`BUCKET` x `OUT OF` y): Sample the table down to an `x` out of `y` fraction.
+
+Note: `TABLESAMPLE` returns the approximate number of rows or fraction requested.
+
+### Syntax
+
+{% highlight sql %}
+    TABLESAMPLE ((integer_expression | decimal_expression) PERCENT)
+        | TABLESAMPLE (integer_expression ROWS)
+        | TABLESAMPLE (BUCKET integer_expression OUT OF integer_expression)
+{% endhighlight %}
+
+### Examples
+
+{% highlight sql %}
+SELECT * FROM test;
+  +--+----+
+  |id|name|
+  +--+----+
+  | 5|Alex|
+  | 8|Lucy|
+  | 2|Mary|
+  | 4|Fred|
+  | 1|Lisa|
+  | 9|Eric|
+  |10|Adam|
+  | 6|Mark|
+  | 7|Lily|
+  | 3|Evan|
+  +--+----+
+
+SELECT * FROM test TABLESAMPLE (50 PERCENT);
+  +--+----+
+  |id|name|
+  +--+----+
+  | 5|Alex|
+  | 2|Mary|
+  | 4|Fred|
+  | 9|Eric|
+  |10|Adam|
+  | 3|Evan|
+  +--+----+
+
+SELECT * FROM test TABLESAMPLE (5 ROWS);
+  +--+----+
+  |id|name|
+  +--+----+
+  | 5|Alex|
+  | 8|Lucy|
+  | 2|Mary|
+  | 4|Fred|
+  | 1|Lisa|
+  +--+----+
+
+SELECT * FROM test TABLESAMPLE (BUCKET 4 OUT OF 10);
+  +--+----+
+  |id|name|
+  +--+----+
+  | 8|Lucy|
+  | 2|Mary|
+  | 9|Eric|
+  | 6|Mark|
+  +--+----+
+{% endhighlight %}
+
+### Related Statement
+
+  * [SELECT](sql-ref-syntax-qry-select.html)
\ No newline at end of file
diff --git a/docs/sql-ref-syntax-qry-select.md b/docs/sql-ref-syntax-qry-select.md
index 420cf1f..17c1411 100644
--- a/docs/sql-ref-syntax-qry-select.md
+++ b/docs/sql-ref-syntax-qry-select.md
@@ -150,4 +150,5 @@ SELECT [ hints , ... ] [ ALL | DISTINCT ] { named_expression [ , ... ] }
 - [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html)
 - [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html)
 - [LIMIT Clause](sql-ref-syntax-qry-select-limit.html)
+- [TABLESAMPLE](sql-ref-syntax-qry-sampling.html)
 - [SET Operators](sql-ref-syntax-qry-select-setops.html)
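
The documentation added in this patch notes that `TABLESAMPLE` returns only approximately the requested number of rows or fraction. The following is a minimal plain-Python sketch of why that can happen, not Spark's implementation: it assumes the percentage form keeps each row independently with probability `percent / 100`, that `BUCKET x OUT OF y` behaves like sampling an `x / y` fraction, and that the `ROWS` form takes the first `n` rows (which matches the example output in the patch). The function names `sample_percent`, `sample_bucket`, and `sample_rows` are hypothetical and exist only for illustration.

```python
import random


def sample_percent(rows, percent, seed=None):
    """Model of TABLESAMPLE (percent PERCENT).

    Assumption: each row is kept independently with probability
    percent / 100, so the result size is only approximately
    len(rows) * percent / 100.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    rng = random.Random(seed)
    fraction = percent / 100.0
    return [row for row in rows if rng.random() < fraction]


def sample_bucket(rows, x, y, seed=None):
    """Model of TABLESAMPLE (BUCKET x OUT OF y).

    Assumption: treated as sampling an x / y fraction of the rows.
    """
    if y <= 0 or not 0 <= x <= y:
        raise ValueError("require 0 <= x <= y and y > 0")
    return sample_percent(rows, 100.0 * x / y, seed)


def sample_rows(rows, n):
    """Model of TABLESAMPLE (n ROWS).

    Assumption: takes at most the first n rows, as in the patch's example.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    return list(rows[:n])


if __name__ == "__main__":
    # Same ids, in the same order, as the `test` table in the patch.
    ids = [5, 8, 2, 4, 1, 9, 10, 6, 7, 3]
    print(sample_rows(ids, 5))
    # The number of rows returned varies from run to run.
    print(sample_percent(ids, 50))
    print(sample_bucket(ids, 4, 10))
```

Under these assumptions, a 50 percent sample of the ten-row `test` table can return six rows, as in the patch's example, because the fraction applies to each row independently rather than to the exact total.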

