Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/09/27 14:03:07 UTC

[GitHub] [spark] GuoPhilipse opened a new pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

GuoPhilipse opened a new pull request #29883:
URL: https://github.com/apache/spark/pull/29883


   
   ### What changes were proposed in this pull request?
   Update the sql-ref docs. The following keywords will be added in this PR:
   
   CLUSTERED BY 
   SORTED BY
   INTO num_buckets BUCKETS
   
   
   ### Why are the changes needed?
   Let more users know how these SQL keywords are used.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   No
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699749269








[GitHub] [spark] maropu commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699769725


   Could you add the screenshots of the updated docs in the PR description? LGTM otherwise.




[GitHub] [spark] GuoPhilipse commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
GuoPhilipse commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699956479


   > Could you add the screenshots of the updated docs in the PR description? LGTM otherwise.
   @maropu Have updated the html screenshots :)




[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699709265


   **[Test build #129159 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129159/testReport)** for PR 29883 at commit [`5a8cba7`](https://github.com/apache/spark/commit/5a8cba720010759369e1d6c82a0711d067cb54c7).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700427295


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33823/
   




[GitHub] [spark] maropu commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699706787


   ok to test




[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699749263


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33776/
   




[GitHub] [spark] huaxingao commented on a change in pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
huaxingao commented on a change in pull request #29883:
URL: https://github.com/apache/spark/pull/29883#discussion_r496071408



##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -31,6 +31,9 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier
     [ COMMENT table_comment ]
     [ PARTITIONED BY ( col_name2[:] col_type2 [ COMMENT col_comment2 ], ... ) 
         | ( col_name1, col_name2, ... ) ]
+    [ CLUSTERED BY ( col_name1, col_name2, ...) 

Review comment:
       super nit: add a space between `...` and `)`

##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -203,6 +221,20 @@ CREATE EXTERNAL TABLE family (id INT, name STRING)
     STORED AS INPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleInputFormat'
         OUTPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleOutputFormat'
     LOCATION '/tmp/family/';
+
+--Use `CLUSTERED BY` clause to create bucket table without `SORTED BY`
+CREATE TABLE clustered_by_test1 (ID INT, AGE STRING)
+    CLUSTERED BY (ID)
+    INTO 4 BUCKETS
+    STORED AS ORC
+
+--Use `CLUSTERED BY` clause to create bucket table with `SORTED BY`
+CREATE TABLE clustered_by_test2 (ID INT, NAME STRING)
+    PARTITIONED BY (YEAR STRING)
+    CLUSTERED BY (ID,NAME)

Review comment:
       super nit: add a space after `,`






[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699739668


   **[Test build #129161 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129161/testReport)** for PR 29883 at commit [`ceba14b`](https://github.com/apache/spark/commit/ceba14ba348be2bc0a1fce1fcbac0d9cc2500982).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] GuoPhilipse commented on a change in pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
GuoPhilipse commented on a change in pull request #29883:
URL: https://github.com/apache/spark/pull/29883#discussion_r495656172



##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -65,6 +68,18 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
 
     Partitions are created on the table, based on the columns specified.
     
+* **CLUSTERED BY**
+
+    Specifies bucket columns for bucketing table.

Review comment:
       Sorry, I missed that. Let me keep it consistent with the one in `sql-ref-syntax-ddl-create-table-datasource`.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699639817


   Can one of the admins verify this patch?




[GitHub] [spark] SparkQA removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700411801


   **[Test build #129208 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129208/testReport)** for PR 29883 at commit [`0f76d27`](https://github.com/apache/spark/commit/0f76d278a79d30003c2513406b532460d195d07b).




[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699749269








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699720057


   Merged build finished. Test FAILed.




[GitHub] [spark] maropu closed pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
maropu closed pull request #29883:
URL: https://github.com/apache/spark/pull/29883


   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700414097








[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699737550


   **[Test build #129161 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129161/testReport)** for PR 29883 at commit [`ceba14b`](https://github.com/apache/spark/commit/ceba14ba348be2bc0a1fce1fcbac0d9cc2500982).




[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699709306








[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700411801


   **[Test build #129208 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129208/testReport)** for PR 29883 at commit [`0f76d27`](https://github.com/apache/spark/commit/0f76d278a79d30003c2513406b532460d195d07b).




[GitHub] [spark] GuoPhilipse commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
GuoPhilipse commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700407439


   Thanks @maropu @huaxingao, have updated :)




[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699715383


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33774/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699639662


   Can one of the admins verify this patch?




[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699739730








[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699639817


   Can one of the admins verify this patch?




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699739730








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699720061


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/33774/
   Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699720057








[GitHub] [spark] maropu commented on a change in pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29883:
URL: https://github.com/apache/spark/pull/29883#discussion_r495635757



##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -65,6 +68,18 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
 
     Partitions are created on the table, based on the columns specified.
     
+* **CLUSTERED BY**
+
+    Specifies bucket columns for bucketing table.
+    
+* **SORTED BY**
+
+    Used to sort bucket column, we can combine with `ASC` for ascending order, with `DESC` for descending order.
+    
+* **INTO num_buckets BUCKETS**
+
+    Specifies buckets numbers, which is used in  `CLUSTERED BY` clause.

Review comment:
       nit: redundant spaces found between `in` and `CLUSTERED BY`. And, `in the CLUSTERED BY clause`?

##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -65,6 +68,18 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
 
     Partitions are created on the table, based on the columns specified.
     
+* **CLUSTERED BY**
+
+    Specifies bucket columns for bucketing table.

Review comment:
       `Specifies bucket column names for bucketing a table`?

##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -203,6 +218,17 @@ CREATE EXTERNAL TABLE family (id INT, name STRING)
     STORED AS INPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleInputFormat'
         OUTPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleOutputFormat'
     LOCATION '/tmp/family/';
+
+--Use `CLUSTERED BY` clause to create bucket table without `SORTED BY`
+CREATE TABLE TEST1(ID INT, AGE STRING)
+    CLUSTERED BY (ID)
+    INTO 4 BUCKETS
+
+--Use `CLUSTERED BY` clause to create bucket table with `SORTED BY`
+CREATE TABLE TEST2(ID INT, NAME STRING)

Review comment:
       ditto

##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -203,6 +218,17 @@ CREATE EXTERNAL TABLE family (id INT, name STRING)
     STORED AS INPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleInputFormat'
         OUTPUTFORMAT 'com.ly.spark.example.serde.io.SerDeExampleOutputFormat'
     LOCATION '/tmp/family/';
+
+--Use `CLUSTERED BY` clause to create bucket table without `SORTED BY`
+CREATE TABLE TEST1(ID INT, AGE STRING)

Review comment:
       nit: To follow the format of the other examples, `CREATE TABLE clustered_by_test1 (ID ...`?

##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -65,6 +68,18 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
 
     Partitions are created on the table, based on the columns specified.
     
+* **CLUSTERED BY**
+
+    Specifies bucket columns for bucketing table.
+    
+* **SORTED BY**
+
+    Used to sort bucket column, we can combine with `ASC` for ascending order, with `DESC` for descending order.

Review comment:
       How about rephrasing it like this? `Specifies an ordering of bucket columns. Optionally, one can use ASC for an ascending order or DESC for a descending order after any column names in the SORTED BY clause. If not specified, ASC is assumed by default.`
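
As a rough illustration of the three clauses discussed in these review comments, the sketch below combines `CLUSTERED BY`, `SORTED BY`, and `INTO num_buckets BUCKETS` in a single Hive-format statement. The table name `clustered_by_example`, the column names, the bucket count, and the `STORED AS PARQUET` choice are assumptions made for illustration and are not taken from the PR. Only `ASC` is shown, spelled out explicitly even though the wording suggested above treats it as the default.

```sql
-- Hypothetical table; all names and the bucket count are illustrative only.
-- CLUSTERED BY names the bucket columns, SORTED BY gives the ordering
-- within each bucket, and INTO ... BUCKETS sets the number of buckets.
CREATE TABLE clustered_by_example (id INT, name STRING)
    PARTITIONED BY (year STRING)
    CLUSTERED BY (id, name)
    SORTED BY (id ASC)
    INTO 4 BUCKETS
    STORED AS PARQUET;
```

Dropping the `SORTED BY` line should still give a bucketed table, just without a declared ordering inside each bucket.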






[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699707745


   **[Test build #129159 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129159/testReport)** for PR 29883 at commit [`5a8cba7`](https://github.com/apache/spark/commit/5a8cba720010759369e1d6c82a0711d067cb54c7).




[GitHub] [spark] SparkQA removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699707745


   **[Test build #129159 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129159/testReport)** for PR 29883 at commit [`5a8cba7`](https://github.com/apache/spark/commit/5a8cba720010759369e1d6c82a0711d067cb54c7).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699709306








[GitHub] [spark] SparkQA removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699737550


   **[Test build #129161 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129161/testReport)** for PR 29883 at commit [`ceba14b`](https://github.com/apache/spark/commit/ceba14ba348be2bc0a1fce1fcbac0d9cc2500982).




[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699639662


   Can one of the admins verify this patch?




[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699720055


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33774/
   




[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700422983


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33823/
   




[GitHub] [spark] maropu commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-701695585


   Thanks! Merged to master/3.0.




[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-699745171


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33776/
   




[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700427613








[GitHub] [spark] huaxingao commented on a change in pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
huaxingao commented on a change in pull request #29883:
URL: https://github.com/apache/spark/pull/29883#discussion_r495643293



##########
File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md
##########
@@ -65,6 +68,18 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
 
     Partitions are created on the table, based on the columns specified.
     
+* **CLUSTERED BY**
+
+    Specifies bucket columns for bucketing table.

Review comment:
       Should we use the same description as the one in `sql-ref-syntax-ddl-create-table-datasource`?






[GitHub] [spark] SparkQA commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700413988


   **[Test build #129208 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129208/testReport)** for PR 29883 at commit [`0f76d27`](https://github.com/apache/spark/commit/0f76d278a79d30003c2513406b532460d195d07b).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700427613








[GitHub] [spark] AmplabJenkins commented on pull request #29883: [SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29883:
URL: https://github.com/apache/spark/pull/29883#issuecomment-700414097





