Posted to commits@spark.apache.org by sr...@apache.org on 2019/02/02 00:34:30 UTC

[spark] branch master updated: [MINOR][DOC] Writing to partitioned Hive metastore Parquet tables is not supported for Spark SQL

This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 421ff6f  [MINOR][DOC] Writing to partitioned Hive metastore Parquet tables is not supported for Spark SQL
421ff6f is described below

commit 421ff6f60e2f3da123fd941d9fa91d7228b21ebc
Author: liuxian <li...@zte.com.cn>
AuthorDate: Fri Feb 1 18:34:13 2019 -0600

    [MINOR][DOC] Writing to partitioned Hive metastore Parquet tables is not supported for Spark SQL
    
    ## What changes were proposed in this pull request?
    
    Even if `spark.sql.hive.convertMetastoreParquet` is true, when writing to partitioned Hive metastore
    Parquet tables, Spark SQL still cannot use its own Parquet support instead of Hive SerDe.
    
    Related code:
     https://github.com/apache/spark/blob/d53e11ffce3f721886918c1cb4525478971f02bc/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala#L198
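
    As a minimal, illustrative sketch (not part of this commit; the table name, columns,
    and partition values are hypothetical), the behavior can be observed by writing into a
    partitioned Hive metastore Parquet table while the conversion flag is enabled:

    ```scala
    import org.apache.spark.sql.SparkSession

    // Hive support is required so the table is registered in the Hive metastore.
    val spark = SparkSession.builder()
      .appName("ConvertMetastoreParquetSketch")
      .enableHiveSupport()
      .getOrCreate()

    // The conversion is on by default; set explicitly here for clarity.
    spark.sql("SET spark.sql.hive.convertMetastoreParquet=true")

    // A partitioned Hive metastore Parquet table (hypothetical name and schema).
    spark.sql(
      """CREATE TABLE IF NOT EXISTS events (id INT, name STRING)
        |PARTITIONED BY (dt STRING)
        |STORED AS PARQUET""".stripMargin)

    // Writes into the partitioned table still go through Hive SerDe, regardless of
    // the flag above; only reads (and writes to non-partitioned tables) can use
    // Spark's built-in Parquet support.
    spark.sql("INSERT INTO events PARTITION (dt='2019-02-01') VALUES (1, 'a')")

    // Reading back can still benefit from the native Parquet support.
    spark.sql("SELECT * FROM events WHERE dt = '2019-02-01'").show()

    spark.stop()
    ```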
    ## How was this patch tested?
    N/A
    
    Closes #23671 from 10110346/parquetdoc.
    
    Authored-by: liuxian <li...@zte.com.cn>
    Signed-off-by: Sean Owen <se...@databricks.com>
---
 docs/sql-data-sources-parquet.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/sql-data-sources-parquet.md b/docs/sql-data-sources-parquet.md
index 5532bf9..f6e03fba 100644
--- a/docs/sql-data-sources-parquet.md
+++ b/docs/sql-data-sources-parquet.md
@@ -157,9 +157,10 @@ turned it off by default starting from 1.5.0. You may enable it by
 
 ### Hive metastore Parquet table conversion
 
-When reading from and writing to Hive metastore Parquet tables, Spark SQL will try to use its own
-Parquet support instead of Hive SerDe for better performance. This behavior is controlled by the
-`spark.sql.hive.convertMetastoreParquet` configuration, and is turned on by default.
+When reading from Hive metastore Parquet tables and writing to non-partitioned Hive metastore
+Parquet tables, Spark SQL will try to use its own Parquet support instead of Hive SerDe for
+better performance. This behavior is controlled by the `spark.sql.hive.convertMetastoreParquet`
+configuration, and is turned on by default.
 
 #### Hive/Parquet Schema Reconciliation
 

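As the updated documentation notes, the conversion is controlled by `spark.sql.hive.convertMetastoreParquet` and is on by default. As an illustrative aside (assuming an existing `spark` session), it can also be disabled at runtime so Hive SerDe is used for reads as well:

```scala
// Illustrative only: turn off the native Parquet conversion so Hive SerDe
// is used for reads of metastore Parquet tables too.
spark.conf.set("spark.sql.hive.convertMetastoreParquet", "false")
```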

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org