You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2017/08/17 22:57:00 UTC

[jira] [Created] (SPARK-21769) Add a table property for Hive-serde tables to controlling Spark always respecting schemas inferred by Spark SQL

Xiao Li created SPARK-21769:
-------------------------------

             Summary: Add a table property for Hive-serde tables to controlling Spark always respecting schemas inferred by Spark SQL
                 Key: SPARK-21769
                 URL: https://issues.apache.org/jira/browse/SPARK-21769
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.3.0
            Reporter: Xiao Li
            Assignee: Xiao Li


For Hive-serde tables, we always respect the schema stored in Hive metastore, because the schema could be altered by the other engines that share the same metastore. Thus, we always trust the metastore-controlled schema for Hive-serde tables when the schemas are different (without considering the nullability and cases). However, in some scenarios, Hive metastore also could INCORRECTLY overwrite the schemas when the serde and Hive metastore built-in serde are different. 

The proposed solution is to introduce a table property for such scenarios. For a specific Hive-serde table, users can manually setting such table property for asking Spark for always respect Spark-inferred schema instead of trusting metastore-controlled schema. By default, it is off. 




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org