Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/09/27 14:03:07 UTC
[GitHub] [spark] GuoPhilipse opened a new pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
GuoPhilipse opened a new pull request #29883:
URL: https://github.com/apache/spark/pull/29883
### What changes were proposed in this pull request?
Update the sql-ref docs. This PR adds the following keywords:
CLUSTERED BY
SORTED BY
INTO num_buckets BUCKETS
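For illustration only, the three clauses might combine in a single Hive-format statement roughly as sketched below. The table and column names are hypothetical and not taken from the PR, and the sketch assumes a Spark version whose Hive-format `CREATE TABLE` accepts these clauses.

```sql
-- Hypothetical example: bucket rows by `id` into 4 buckets and
-- keep each bucket sorted by `id`.
CREATE TABLE bucketed_example (id INT, name STRING)
    CLUSTERED BY (id)
    SORTED BY (id)
    INTO 4 BUCKETS
    STORED AS ORC;
```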
### Why are the changes needed?
To let more users know how these SQL keywords are used.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
No
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699749269
[GitHub] [spark] maropu commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699769725
Could you add the screenshots of the updated docs in the PR description? LGTM otherwise.
[GitHub] [spark] GuoPhilipse commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
GuoPhilipse commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699956479
> Could you add the screenshots of the updated docs in the PR description? LGTM otherwise.
@maropu I have updated the HTML screenshots :)
[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699709265
**[Test build #129159 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129159/testReport)** for PR 29883 at commit [`5a8cba7`](https://github.com/apache/spark/commit/5a8cba720010759369e1d6c82a0711d067cb54c7).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700427295
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33823/
[GitHub] [spark] maropu commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699706787
ok to test
[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699749263
Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33776/
[GitHub] [spark] huaxingao commented on a change in pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
huaxingao commented on a change in pull request #29883:
URL: https://github.com/apache/spark/pull/29883#discussion_r496071408
##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -31,6 +31,9 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier
[ COMMENT table_comment ]
[ PARTITIONED BY ( col_name2[:] col_type2 [ COMMENT col_comment2 ], ... )
| ( col_name1, col_name2, ... ) ]
+ [ CLUSTERED BY ( col_name1, col_name2, ...)
Review comment:
super nit: add a space between `...` and `)`
##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -203,6 +221,20 @@ CREATE EXTERNAL TABLE family (id INT, name STRING)
STORED AS INPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleInputFormat'
OUTPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleOutputFormat'
LOCATION '/tmp/family/';
+
+--Use `CLUSTERED BY` clause to create bucket table without `SORTED BY`
+CREATE TABLE clustered_by_test1 (ID INT, AGE STRING)
+ CLUSTERED BY (ID)
+ INTO 4 BUCKETS
+ STORED AS ORC
+
+--Use `CLUSTERED BY` clause to create bucket table with `SORTED BY`
+CREATE TABLE clustered_by_test2 (ID INT, NAME STRING)
+ PARTITIONED BY (YEAR STRING)
+ CLUSTERED BY (ID,NAME)
Review comment:
super nit: add a space after `,`
[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699739668
**[Test build #129161 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129161/testReport)** for PR 29883 at commit [`ceba14b`](https://github.com/apache/spark/commit/ceba14ba348be2bc0a1fce1fcbac0d9cc2500982).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
GuoPhilipse commented on a change in pull request #29883:
URL: https://github.com/apache/spark/pull/29883#discussion_r495656172
##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -65,6 +68,18 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
Partitions are created on the table, based on the columns specified.
+* **CLUSTERED BY**
+
+ Specifies bucket columns for bucketing table.
Review comment:
Hmm, sorry, I missed that. Let me keep it consistent with the one in `sql-ref-syntax-ddl-create-table-datasource`.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699639817
Can one of the admins verify this patch?
[GitHub] [spark] SparkQA removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700411801
**[Test build #129208 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129208/testReport)** for PR 29883 at commit [`0f76d27`](https://github.com/apache/spark/commit/0f76d278a79d30003c2513406b532460d195d07b).
[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699749269
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699720057
Merged build finished. Test FAILed.
[GitHub] [spark] maropu closed pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
maropu closed pull request #29883:
URL: https://github.com/apache/spark/pull/29883
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700414097
[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699737550
**[Test build #129161 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129161/testReport)** for PR 29883 at commit [`ceba14b`](https://github.com/apache/spark/commit/ceba14ba348be2bc0a1fce1fcbac0d9cc2500982).
[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699709306
[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700411801
**[Test build #129208 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129208/testReport)** for PR 29883 at commit [`0f76d27`](https://github.com/apache/spark/commit/0f76d278a79d30003c2513406b532460d195d07b).
[GitHub] [spark] GuoPhilipse commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
GuoPhilipse commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700407439
Thanks @maropu @huaxingao, I have updated it :)
[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699715383
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33774/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699639662
Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699739730
[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699639817
Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699739730
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699720061
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/33774/
Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699720057
[GitHub] [spark] maropu commented on a change in pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29883:
URL: https://github.com/apache/spark/pull/29883#discussion_r495635757
##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -65,6 +68,18 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
Partitions are created on the table, based on the columns specified.
+* **CLUSTERED BY**
+
+ Specifies bucket columns for bucketing table.
+
+* **SORTED BY**
+
+ Used to sort bucket column, we can combine with `ASC` for ascending order, with `DESC` for descending order.
+
+* **INTO num_buckets BUCKETS**
+
+ Specifies buckets numbers, which is used in `CLUSTERED BY` clause.
Review comment:
nit: redundant spaces found between `in` and `CLUSTERED BY`. Also, how about `in the CLUSTERED BY clause`?
##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -65,6 +68,18 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
Partitions are created on the table, based on the columns specified.
+* **CLUSTERED BY**
+
+ Specifies bucket columns for bucketing table.
Review comment:
`Specifies bucket column names for bucketing a table`?
##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -203,6 +218,17 @@ CREATE EXTERNAL TABLE family (id INT, name STRING)
STORED AS INPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleInputFormat'
OUTPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleOutputFormat'
LOCATION '/tmp/family/';
+
+--Use `CLUSTERED BY` clause to create bucket table without `SORTED BY`
+CREATE TABLE TEST1(ID INT, AGE STRING)
+ CLUSTERED BY (ID)
+ INTO 4 BUCKETS
+
+--Use `CLUSTERED BY` clause to create bucket table with `SORTED BY`
+CREATE TABLE TEST2(ID INT, NAME STRING)
Review comment:
ditto
##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -203,6 +218,17 @@ CREATE EXTERNAL TABLE family (id INT, name STRING)
STORED AS INPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleInputFormat'
OUTPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleOutputFormat'
LOCATION '/tmp/family/';
+
+--Use `CLUSTERED BY` clause to create bucket table without `SORTED BY`
+CREATE TABLE TEST1(ID INT, AGE STRING)
Review comment:
nit: To follow the format of the other examples, `CREATE TABLE clustered_by_test1 (ID ...`?
##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -65,6 +68,18 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
Partitions are created on the table, based on the columns specified.
+* **CLUSTERED BY**
+
+ Specifies bucket columns for bucketing table.
+
+* **SORTED BY**
+
+ Used to sort bucket column, we can combine with `ASC` for ascending order, with `DESC` for descending order.
Review comment:
How about rephrasing it like this? `Specifies an ordering of bucket columns. Optionally, one can use ASC for an ascending order or DESC for a descending order after any column names in the SORTED BY clause. If not specified, ASC is assumed by default.`
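A minimal sketch of the behavior described in the suggested wording, assuming a Spark version whose Hive-format `CREATE TABLE` accepts these clauses. The table and column names are hypothetical. Here `id` carries no explicit direction, so it would be sorted ascending by default, while `name` is explicitly descending.

```sql
-- Hypothetical example: `id` falls back to the default ASC ordering,
-- while `name` is explicitly sorted in descending order.
CREATE TABLE sorted_by_example (id INT, name STRING)
    CLUSTERED BY (id, name)
    SORTED BY (id, name DESC)
    INTO 8 BUCKETS
    STORED AS ORC;
```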
[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699707745
**[Test build #129159 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129159/testReport)** for PR 29883 at commit [`5a8cba7`](https://github.com/apache/spark/commit/5a8cba720010759369e1d6c82a0711d067cb54c7).
[GitHub] [spark] SparkQA removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699707745
**[Test build #129159 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129159/testReport)** for PR 29883 at commit [`5a8cba7`](https://github.com/apache/spark/commit/5a8cba720010759369e1d6c82a0711d067cb54c7).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699709306
[GitHub] [spark] SparkQA removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699737550
**[Test build #129161 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129161/testReport)** for PR 29883 at commit [`ceba14b`](https://github.com/apache/spark/commit/ceba14ba348be2bc0a1fce1fcbac0d9cc2500982).
[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699639662
Can one of the admins verify this patch?
[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699720055
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33774/
[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700422983
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33823/
[GitHub] [spark] maropu commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-701695585
Thanks! Merged to master/3.0.
[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699745171
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33776/
[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700427613
[GitHub] [spark] huaxingao commented on a change in pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
huaxingao commented on a change in pull request #29883:
URL: https://github.com/apache/spark/pull/29883#discussion_r495643293
##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -65,6 +68,18 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
Partitions are created on the table, based on the columns specified.
+* **CLUSTERED BY**
+
+ Specifies bucket columns for bucketing table.
Review comment:
Should we use the same description as the one in ```sql-ref-syntax-ddl-create-table-datasource```?
[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700413988
**[Test build #129208 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129208/testReport)** for PR 29883 at commit [`0f76d27`](https://github.com/apache/spark/commit/0f76d278a79d30003c2513406b532460d195d07b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700427613
[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700414097