Posted to commits@spark.apache.org by sr...@apache.org on 2019/02/02 00:34:30 UTC
[spark] branch master updated: [MINOR][DOC] Writing to partitioned Hive metastore Parquet tables is not supported for Spark SQL
This is an automated email from the ASF dual-hosted git repository.
srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 421ff6f [MINOR][DOC] Writing to partitioned Hive metastore Parquet tables is not supported for Spark SQL
421ff6f is described below
commit 421ff6f60e2f3da123fd941d9fa91d7228b21ebc
Author: liuxian <li...@zte.com.cn>
AuthorDate: Fri Feb 1 18:34:13 2019 -0600
[MINOR][DOC] Writing to partitioned Hive metastore Parquet tables is not supported for Spark SQL
## What changes were proposed in this pull request?
Even if `spark.sql.hive.convertMetastoreParquet` is true, Spark SQL still cannot use its own
Parquet support instead of Hive SerDe when writing to partitioned Hive metastore Parquet tables.
Related code:
https://github.com/apache/spark/blob/d53e11ffce3f721886918c1cb4525478971f02bc/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala#L198
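For illustration, here is a minimal Scala sketch of the behavior described above. It assumes a Hive-enabled Spark build, and the table names `flat_parquet` and `part_parquet` are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

object ConvertMetastoreParquetDemo {
  def main(args: Array[String]): Unit = {
    // Hive metastore tables require a Hive-enabled session.
    val spark = SparkSession.builder()
      .appName("ConvertMetastoreParquetDemo")
      .enableHiveSupport()
      .getOrCreate()

    // On by default; set explicitly here only for illustration.
    spark.conf.set("spark.sql.hive.convertMetastoreParquet", "true")

    // Hypothetical tables: one non-partitioned, one partitioned.
    spark.sql(
      "CREATE TABLE IF NOT EXISTS flat_parquet (id INT, name STRING) STORED AS PARQUET")
    spark.sql(
      """CREATE TABLE IF NOT EXISTS part_parquet (id INT, name STRING)
        |PARTITIONED BY (dt STRING) STORED AS PARQUET""".stripMargin)

    // Reads, and writes to the non-partitioned table, can be converted to
    // Spark's built-in Parquet support when the flag above is true.
    spark.sql("INSERT INTO flat_parquet VALUES (1, 'a')")
    spark.table("flat_parquet").show()

    // Writes to the partitioned table still go through Hive SerDe
    // regardless of the flag, which is the limitation this doc change records.
    spark.sql("INSERT INTO part_parquet PARTITION (dt='2019-02-01') VALUES (1, 'a')")

    spark.stop()
  }
}
```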
## How was this patch tested?
N/A
Closes #23671 from 10110346/parquetdoc.
Authored-by: liuxian <li...@zte.com.cn>
Signed-off-by: Sean Owen <se...@databricks.com>
---
docs/sql-data-sources-parquet.md | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/docs/sql-data-sources-parquet.md b/docs/sql-data-sources-parquet.md
index 5532bf9..f6e03fba 100644
--- a/docs/sql-data-sources-parquet.md
+++ b/docs/sql-data-sources-parquet.md
@@ -157,9 +157,10 @@ turned it off by default starting from 1.5.0. You may enable it by
### Hive metastore Parquet table conversion
-When reading from and writing to Hive metastore Parquet tables, Spark SQL will try to use its own
-Parquet support instead of Hive SerDe for better performance. This behavior is controlled by the
-`spark.sql.hive.convertMetastoreParquet` configuration, and is turned on by default.
+When reading from Hive metastore Parquet tables and writing to non-partitioned Hive metastore
+Parquet tables, Spark SQL will try to use its own Parquet support instead of Hive SerDe for
+better performance. This behavior is controlled by the `spark.sql.hive.convertMetastoreParquet`
+configuration, and is turned on by default.
#### Hive/Parquet Schema Reconciliation