Posted to issues@spark.apache.org by "Lantao Jin (Jira)" <ji...@apache.org> on 2019/10/10 03:13:00 UTC

[jira] [Created] (SPARK-29421) Add an opportunity to change the file format of command CREATE TABLE LIKE

Lantao Jin created SPARK-29421:
----------------------------------

             Summary: Add an opportunity to change the file format of command CREATE TABLE LIKE
                 Key: SPARK-29421
                 URL: https://issues.apache.org/jira/browse/SPARK-29421
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Lantao Jin


The CREATE TABLE tb1 LIKE tb2 command creates an empty table tb1 based on the definition of table tb2. The most common use case is to create tb1 with the same schema as tb2. One inconvenience is that the command also copies the FileFormat from tb2, so the input/output format and serde cannot be changed. The ability to change the file format is useful in scenarios such as upgrading a table from a low-performance file format to a high-performance one (parquet, orc).
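
A minimal sketch of the current behaviour, assuming a Hive-enabled session. The columns of tb2 and its TEXTFILE storage are made up for illustration only:
{code}
-- Hypothetical source table stored in a low-performance format.
CREATE TABLE tb2 (id INT, name STRING) STORED AS TEXTFILE;

-- tb1 gets the schema of tb2 and, today, its file format as well.
CREATE TABLE tb1 LIKE tb2;

-- Inspect the storage information (input/output format, serde) of tb1.
DESCRIBE FORMATTED tb1;
{code}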

Here are two options to enhance it.
Option1: Add a configuration {{spark.sql.createTableLike.fileformat}} whose default value is "none", which keeps the current behaviour: copying the file format from the source table. After running SET spark.sql.createTableLike.fileformat=parquet, or setting any other valid file format defined in {{HiveSerDe}}, {{CREATE TABLE ... LIKE}} will use the new file format.
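For example, a session using Option1 might look like the sketch below. The configuration is only proposed in this issue and does not exist in any released Spark version:
{code}
-- Proposed configuration, not an existing Spark setting.
SET spark.sql.createTableLike.fileformat=parquet;

-- Unmodified statement; tb1 would be created as parquet instead of copying tb2's format.
CREATE TABLE tb1 LIKE tb2;

-- Resetting to the proposed default restores the current copy-from-source behaviour.
SET spark.sql.createTableLike.fileformat=none;
{code}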

Option2: Add syntax {{USING fileformat}} after {{CREATE TABLE ... LIKE}}. For example,
{code}
CREATE TABLE tb1 LIKE tb2 USING parquet;
{code}
If the USING clause is omitted, the current behaviour is also kept: the file format is copied from the source table.

We use Option1 with the parquet file format as an enhancement in our production thriftserver, because we need to change the behaviour of many existing SQL scripts without modifying them. For the community, Option2 could be treated as a new feature, since it requires users to write the additional USING clause.

cc [~dongjoon] [~hyukjin.kwon] [~joshrosen] [~cloud_fan] [~yumwang]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
