Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2017/08/20 03:53:00 UTC

[jira] [Commented] (SPARK-21769) Add a table option for Hive-serde tables to make Spark always respect schemas inferred by Spark SQL

    [ https://issues.apache.org/jira/browse/SPARK-21769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16134296#comment-16134296 ] 

Xiao Li commented on SPARK-21769:
---------------------------------

https://github.com/apache/spark/pull/19003

> Add a table option for Hive-serde tables to make Spark always respect schemas inferred by Spark SQL
> ---------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-21769
>                 URL: https://issues.apache.org/jira/browse/SPARK-21769
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Xiao Li
>            Assignee: Xiao Li
>
> For Hive-serde tables, Spark always respects the schema stored in the Hive metastore, because the schema could be altered by other engines that share the same metastore. Thus, when the schemas differ (ignoring nullability and case), we always trust the metastore-controlled schema for Hive-serde tables. However, in some scenarios the Hive metastore can also INCORRECTLY overwrite the schema, for example when the table's serde differs from the Hive metastore's built-in serde.
> The proposed solution is to introduce a table property for such scenarios. For a specific Hive-serde table, users can manually set this property to ask Spark to always respect the Spark-inferred schema instead of trusting the metastore-controlled schema. By default, the option is off. A rough usage sketch follows below.
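As a rough illustration of the behavior described above, here is a minimal spark-shell (Scala) sketch of how such a per-table switch might be set. The property name 'spark.sql.schema.respectSparkInferred' is hypothetical and chosen only for illustration; the actual property name is defined in the pull request linked above, not in this message.

    import org.apache.spark.sql.SparkSession

    // Hive support is required so Spark talks to the shared Hive metastore.
    val spark = SparkSession.builder()
      .appName("respect-spark-inferred-schema")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical table property: ask Spark to keep its own inferred schema for
    // this Hive-serde table instead of the metastore-controlled schema.
    spark.sql(
      """ALTER TABLE mydb.hive_serde_table
        |SET TBLPROPERTIES ('spark.sql.schema.respectSparkInferred' = 'true')""".stripMargin)

Since the option is off by default, tables without the property keep today's behavior of trusting the metastore-controlled schema.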



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org