Posted to issues@spark.apache.org by "Xiao Li (Jira)" <ji...@apache.org> on 2020/02/01 04:06:00 UTC

[jira] [Updated] (SPARK-27946) Hive DDL to Spark DDL conversion USING "show create table"

     [ https://issues.apache.org/jira/browse/SPARK-27946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-27946:
----------------------------
    Description: 
This patch adds a DDL command SHOW CREATE TABLE AS SERDE. It is used to generate Hive DDL for a Hive table.

The original SHOW CREATE TABLE now always shows Spark DDL. When given a Hive table, it tries to generate the equivalent Spark DDL.

For the Hive serde to data source conversion, this uses the existing mapping inside HiveSerDe. If no mapping is found there, it throws an analysis exception reporting the unsupported serde configuration.
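
As a rough illustration of the lookup described above, here is a minimal Python model (not Spark's actual Scala code). The names HiveSerDeInfo, SOURCE_TO_SERDE, UnsupportedSerdeError and serde_to_data_source are hypothetical, the table holds only an illustrative subset of formats, and a plain Python exception stands in for the analysis exception.

```python
from typing import NamedTuple


class HiveSerDeInfo(NamedTuple):
    """Hypothetical stand-in for a Hive storage description."""
    input_format: str
    output_format: str
    serde: str


# Illustrative subset only; the real mapping lives inside Spark's HiveSerDe.
SOURCE_TO_SERDE = {
    "orc": HiveSerDeInfo(
        "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat",
        "org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat",
        "org.apache.hadoop.hive.ql.io.orc.OrcSerde",
    ),
    "parquet": HiveSerDeInfo(
        "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
        "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
        "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
    ),
}


class UnsupportedSerdeError(ValueError):
    """Stand-in for the analysis exception mentioned in the description."""


def serde_to_data_source(serde: HiveSerDeInfo) -> str:
    """Return the data source name whose known serde matches, or raise."""
    for source, known in SOURCE_TO_SERDE.items():
        if known == serde:
            return source
    # No mapping found: report the unsupported configuration instead of guessing.
    raise UnsupportedSerdeError(f"unsupported serde configuration: {serde}")
```

In this sketch, a table stored with the Parquet formats and serde resolves to "parquet", while any combination missing from the table is rejected rather than mapped to a best guess.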

Arguably, some Hive file format + row serde combinations could be mapped to a Spark data source, e.g., CSV. That mapping is not included in this PR; to be conservative, it may not be supported.

For now, Hive serde properties are not saved to the Spark DDL, because it may not be useful to keep Hive serde properties in a Spark table.

  was:Many users migrate tables created with Hive DDL to Spark. Defining the table with Spark DDL brings performance benefits. We need to add a feature to Show Create Table that allows you to generate Spark DDL for a table. For example: `SHOW CREATE TABLE customers AS SPARK`.


> Hive DDL to Spark DDL conversion USING "show create table"
> ----------------------------------------------------------
>
>                 Key: SPARK-27946
>                 URL: https://issues.apache.org/jira/browse/SPARK-27946
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Xiao Li
>            Priority: Major
>
> This patch adds a DDL command SHOW CREATE TABLE AS SERDE. It is used to generate Hive DDL for a Hive table.
> The original SHOW CREATE TABLE now always shows Spark DDL. When given a Hive table, it tries to generate the equivalent Spark DDL.
> For the Hive serde to data source conversion, this uses the existing mapping inside HiveSerDe. If no mapping is found there, it throws an analysis exception reporting the unsupported serde configuration.
> Arguably, some Hive file format + row serde combinations could be mapped to a Spark data source, e.g., CSV. That mapping is not included in this PR; to be conservative, it may not be supported.
> For now, Hive serde properties are not saved to the Spark DDL, because it may not be useful to keep Hive serde properties in a Spark table.


