Posted to commits@spark.apache.org by we...@apache.org on 2021/08/19 03:05:30 UTC
[spark] branch branch-3.2 updated:
[SPARK-33687][SQL][DOC][FOLLOWUP] Merge the doc pages of ANALYZE TABLE and
ANALYZE TABLES
This is an automated email from the ASF dual-hosted git repository.
wenchen pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.2 by this push:
new 8f3b4c4 [SPARK-33687][SQL][DOC][FOLLOWUP] Merge the doc pages of ANALYZE TABLE and ANALYZE TABLES
8f3b4c4 is described below
commit 8f3b4c4b7d717c5cfc922ce160a1da42303d5304
Author: Wenchen Fan <we...@databricks.com>
AuthorDate: Thu Aug 19 11:04:05 2021 +0800
[SPARK-33687][SQL][DOC][FOLLOWUP] Merge the doc pages of ANALYZE TABLE and ANALYZE TABLES
### What changes were proposed in this pull request?
This is a followup of https://github.com/apache/spark/pull/30648
ANALYZE TABLE and ANALYZE TABLES are essentially the same command, so it's weird to put them in two different doc pages. This PR proposes to merge them into one doc page.
### Why are the changes needed?
To simplify the docs.
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
N/A
Closes #33781 from cloud-fan/doc.
Authored-by: Wenchen Fan <we...@databricks.com>
Signed-off-by: Wenchen Fan <we...@databricks.com>
(cherry picked from commit 07d173a8b0a19a2912905387bcda10e9db3c43c6)
Signed-off-by: Wenchen Fan <we...@databricks.com>
---
docs/sql-ref-syntax-aux-analyze-table.md | 85 +++++++++++++++++++----
docs/sql-ref-syntax-aux-analyze-tables.md | 110 ------------------------------
docs/sql-ref-syntax-aux-analyze.md | 23 -------
docs/sql-ref-syntax.md | 1 -
4 files changed, 70 insertions(+), 149 deletions(-)
diff --git a/docs/sql-ref-syntax-aux-analyze-table.md b/docs/sql-ref-syntax-aux-analyze-table.md
index da53385..0e65de1 100644
--- a/docs/sql-ref-syntax-aux-analyze-table.md
+++ b/docs/sql-ref-syntax-aux-analyze-table.md
@@ -21,7 +21,8 @@ license: |
### Description
-The `ANALYZE TABLE` statement collects statistics about the table to be used by the query optimizer to find a better query execution plan.
+The `ANALYZE TABLE` statement collects statistics about one specific table or all the tables in one specified database,
+that are to be used by the query optimizer to find a better query execution plan.
### Syntax
@@ -30,6 +31,10 @@ ANALYZE TABLE table_identifier [ partition_spec ]
COMPUTE STATISTICS [ NOSCAN | FOR COLUMNS col [ , ... ] | FOR ALL COLUMNS ]
```
+```sql
+ANALYZE TABLES [ { FROM | IN } database_name ] COMPUTE STATISTICS [ NOSCAN ]
+```
+
### Parameters
* **table_identifier**
@@ -45,22 +50,31 @@ ANALYZE TABLE table_identifier [ partition_spec ]
**Syntax:** `PARTITION ( partition_col_name [ = partition_col_val ] [ , ... ] )`
-* **[ NOSCAN `|` FOR COLUMNS col [ , ... ] `|` FOR ALL COLUMNS ]**
+* **{ FROM `|` IN } database_name**
+
+ Specifies the name of the database to be analyzed. Without a database name, `ANALYZE` collects all tables in the current database that the current user has permission to analyze.
+
+* **NOSCAN**
+
+ Collects only the table's size in bytes (which does not require scanning the entire table).
- * If no analyze option is specified, `ANALYZE TABLE` collects the table's number of rows and size in bytes.
- * **NOSCAN**
+* **FOR COLUMNS col [ , ... ] `|` FOR ALL COLUMNS**
- Collects only the table's size in bytes (which does not require scanning the entire table).
- * **FOR COLUMNS col [ , ... ] `|` FOR ALL COLUMNS**
+ Collects column statistics for each column specified, or alternatively for every column, as well as table statistics.
- Collects column statistics for each column specified, or alternatively for every column, as well as table statistics.
+If no analyze option is specified, both number of rows and size in bytes are collected.
### Examples
```sql
+CREATE DATABASE school_db;
+USE school_db;
+
+CREATE TABLE teachers (name STRING, teacher_id INT);
+INSERT INTO teachers VALUES ('Tom', 1), ('Jerry', 2);
+
CREATE TABLE students (name STRING, student_id INT) PARTITIONED BY (student_id);
-INSERT INTO students PARTITION (student_id = 111111) VALUES ('Mark');
-INSERT INTO students PARTITION (student_id = 222222) VALUES ('John');
+INSERT INTO students VALUES ('Mark', 111111), ('John', 222222);
ANALYZE TABLE students COMPUTE STATISTICS NOSCAN;
@@ -73,7 +87,6 @@ DESC EXTENDED students;
| ...| ...| ...|
| Statistics| 864 bytes| |
| ...| ...| ...|
-| Partition Provider| Catalog| |
+--------------------+--------------------+-------+
ANALYZE TABLE students COMPUTE STATISTICS;
@@ -87,7 +100,6 @@ DESC EXTENDED students;
| ...| ...| ...|
| Statistics| 864 bytes, 2 rows| |
| ...| ...| ...|
-| Partition Provider| Catalog| |
+--------------------+--------------------+-------+
ANALYZE TABLE students PARTITION (student_id = 111111) COMPUTE STATISTICS;
@@ -101,7 +113,6 @@ DESC EXTENDED students PARTITION (student_id = 111111);
| ...| ...| ...|
|Partition Statistics| 432 bytes, 1 rows| |
| ...| ...| ...|
-| OutputFormat|org.apache.hadoop...| |
+--------------------+--------------------+-------+
ANALYZE TABLE students COMPUTE STATISTICS FOR COLUMNS name;
@@ -121,8 +132,52 @@ DESC EXTENDED students name;
| max_col_len| 4|
| histogram| NULL|
+--------------+----------+
-```
-### Related Statements
+ANALYZE TABLES IN school_db COMPUTE STATISTICS NOSCAN;
+
+DESC EXTENDED teachers;
++--------------------+--------------------+-------+
+| col_name| data_type|comment|
++--------------------+--------------------+-------+
+| name| string| null|
+| teacher_id| int| null|
+| ...| ...| ...|
+| Statistics| 1382 bytes| |
+| ...| ...| ...|
++--------------------+--------------------+-------+
+
+DESC EXTENDED students;
++--------------------+--------------------+-------+
+| col_name| data_type|comment|
++--------------------+--------------------+-------+
+| name| string| null|
+| student_id| int| null|
+| ...| ...| ...|
+| Statistics| 864 bytes| |
+| ...| ...| ...|
++--------------------+--------------------+-------+
+
+ANALYZE TABLES COMPUTE STATISTICS;
+
+DESC EXTENDED teachers;
++--------------------+--------------------+-------+
+| col_name| data_type|comment|
++--------------------+--------------------+-------+
+| name| string| null|
+| teacher_id| int| null|
+| ...| ...| ...|
+| Statistics| 1382 bytes, 2 rows| |
+| ...| ...| ...|
++--------------------+--------------------+-------+
-* [ANALYZE TABLES](sql-ref-syntax-aux-analyze-tables.html)
+DESC EXTENDED students;
++--------------------+--------------------+-------+
+| col_name| data_type|comment|
++--------------------+--------------------+-------+
+| name| string| null|
+| student_id| int| null|
+| ...| ...| ...|
+| Statistics| 864 bytes, 2 rows| |
+| ...| ...| ...|
++--------------------+--------------------+-------+
+```
diff --git a/docs/sql-ref-syntax-aux-analyze-tables.md b/docs/sql-ref-syntax-aux-analyze-tables.md
deleted file mode 100644
index f70cfa4..0000000
--- a/docs/sql-ref-syntax-aux-analyze-tables.md
+++ /dev/null
@@ -1,110 +0,0 @@
----
-layout: global
-title: ANALYZE TABLES
-displayTitle: ANALYZE TABLES
-license: |
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
----
-
-### Description
-
-The `ANALYZE TABLES` statement collects statistics about all the tables in a specified database to be used by the query optimizer to find a better query execution plan.
-
-### Syntax
-
-```sql
-ANALYZE TABLES [ { FROM | IN } database_name ] COMPUTE STATISTICS [ NOSCAN ]
-```
-
-### Parameters
-
-* **{ FROM `|` IN } database_name**
-
- Specifies the name of the database to be analyzed. Without a database name, `ANALYZE` collects all tables in the current database that the current user has permission to analyze.
-
-* **[ NOSCAN ]**
-
- Collects only the table's size in bytes (which does not require scanning the entire table).
-
-### Examples
-
-```sql
-CREATE DATABASE school_db;
-USE school_db;
-
-CREATE TABLE teachers (name STRING, teacher_id INT);
-INSERT INTO teachers VALUES ('Tom', 1), ('Jerry', 2);
-
-CREATE TABLE students (name STRING, student_id INT, age SHORT);
-INSERT INTO students VALUES ('Mark', 111111, 10), ('John', 222222, 11);
-
-ANALYZE TABLES IN school_db COMPUTE STATISTICS NOSCAN;
-
-DESC EXTENDED teachers;
-+--------------------+--------------------+-------+
-| col_name| data_type|comment|
-+--------------------+--------------------+-------+
-| name| string| null|
-| teacher_id| int| null|
-| ...| ...| ...|
-| Provider| parquet| |
-| Statistics| 1382 bytes| |
-| ...| ...| ...|
-+--------------------+--------------------+-------+
-
-DESC EXTENDED students;
-+--------------------+--------------------+-------+
-| col_name| data_type|comment|
-+--------------------+--------------------+-------+
-| name| string| null|
-| student_id| int| null|
-| age| smallint| null|
-| ...| ...| ...|
-| Statistics| 1828 bytes| |
-| ...| ...| ...|
-+--------------------+--------------------+-------+
-
-ANALYZE TABLES COMPUTE STATISTICS;
-
-DESC EXTENDED teachers;
-+--------------------+--------------------+-------+
-| col_name| data_type|comment|
-+--------------------+--------------------+-------+
-| name| string| null|
-| teacher_id| int| null|
-| ...| ...| ...|
-| Provider| parquet| |
-| Statistics| 1382 bytes, 2 rows| |
-| ...| ...| ...|
-+--------------------+--------------------+-------+
-
-DESC EXTENDED students;
-+--------------------+--------------------+-------+
-| col_name| data_type|comment|
-+--------------------+--------------------+-------+
-| name| string| null|
-| student_id| int| null|
-| age| smallint| null|
-| ...| ...| ...|
-| Provider| parquet| |
-| Statistics| 1828 bytes, 2 rows| |
-| ...| ...| ...|
-+--------------------+--------------------+-------+
-```
-
-### Related Statements
-
-* [ANALYZE TABLE](sql-ref-syntax-aux-analyze-table.html)
diff --git a/docs/sql-ref-syntax-aux-analyze.md b/docs/sql-ref-syntax-aux-analyze.md
deleted file mode 100644
index 7808966..0000000
--- a/docs/sql-ref-syntax-aux-analyze.md
+++ /dev/null
@@ -1,23 +0,0 @@
----
-layout: global
-title: Analyze Statement
-displayTitle: Analyze Statement
-license: |
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
----
-
- * [ANALYZE TABLE statement](sql-ref-syntax-aux-analyze-table.html)
- * [ANALYZE TABLES statement](sql-ref-syntax-aux-analyze-tables.html)
diff --git a/docs/sql-ref-syntax.md b/docs/sql-ref-syntax.md
index cb7a04d..2165ea3 100644
--- a/docs/sql-ref-syntax.md
+++ b/docs/sql-ref-syntax.md
@@ -90,7 +90,6 @@ ability to generate logical and physical plan for a given query using
* [ADD FILE](sql-ref-syntax-aux-resource-mgmt-add-file.html)
* [ADD JAR](sql-ref-syntax-aux-resource-mgmt-add-jar.html)
* [ANALYZE TABLE](sql-ref-syntax-aux-analyze-table.html)
- * [ANALYZE TABLES](sql-ref-syntax-aux-analyze-tables.html)
* [CACHE TABLE](sql-ref-syntax-aux-cache-cache-table.html)
* [CLEAR CACHE](sql-ref-syntax-aux-cache-clear-cache.html)
* [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html)