Posted to issues@spark.apache.org by "Cheng Lian (JIRA)" <ji...@apache.org> on 2016/04/08 18:18:25 UTC
[jira] [Comment Edited] (SPARK-14488) "CREATE TEMPORARY TABLE ... USING ... AS SELECT ..." creates persisted table
[ https://issues.apache.org/jira/browse/SPARK-14488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232414#comment-15232414 ]
Cheng Lian edited comment on SPARK-14488 at 4/8/16 4:17 PM:
------------------------------------------------------------
Discussed with [~yhuai] offline, and here's the summary:
{{CreateTempTableUsingAsSelect}} has existed since 1.3 (I'm surprised that I never noticed it!). Its semantics are:
# Execute the {{SELECT}} query.
# Store the query result at a user-specified location in the filesystem. Note that this means the {{PATH}} data source option should always be set when using this DDL command.
# Create a temporary table backed by the written files.
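As a rough illustration of the three steps above, a shell session might look like the following. This is only a sketch: it assumes a Spark 1.x spark-shell where {{sqlContext}} is predefined, and the output path {{/tmp/spark-14488-demo}} is a made-up example location that does not come from the ticket.

```scala
// Sketch for a Spark 1.x spark-shell, where `sqlContext` is predefined.
// The path below is a hypothetical example location, not taken from the ticket.
sqlContext.range(10).registerTempTable("x")

// Steps 1-3 in one statement: run the SELECT, write the result as Parquet
// files under the given path, then register temporary table `y` over them.
sqlContext.sql(
  """CREATE TEMPORARY TABLE y
    |USING PARQUET
    |OPTIONS (path '/tmp/spark-14488-demo')
    |AS SELECT * FROM x""".stripMargin)

// With the intended semantics, `y` should be listed as a temporary table.
sqlContext.tables().show()
```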
Basically, it can be used to dump query results to the filesystem without creating persisted tables. It's indeed a confusing command, and is roughly equivalent to the following DDL sequence:
- {{INSERT OVERWRITE DIRECTORY ... STORED AS ... SELECT ...}}
- {{CREATE TEMPORARY TABLE ... USING ... OPTIONS (PATH ...)}}
However, Spark hasn't implemented {{INSERT OVERWRITE DIRECTORY}} yet. In the long run, we should implement it and deprecate this confusing DDL command.
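Until {{INSERT OVERWRITE DIRECTORY}} is available, roughly the same two-step effect can be approximated with the DataFrame API. The following is a sketch of that idea, not a description of what the command does internally. It assumes a spark-shell where {{sqlContext}} is predefined and {{sqlContext.range}} is available (as in the ticket's reproduction). The {{outputPath}} value is a hypothetical example.

```scala
// Sketch for a spark-shell session, where `sqlContext` is predefined.
val outputPath = "/tmp/spark-14488-demo" // hypothetical example location

sqlContext.range(10).registerTempTable("x")

// Counterpart of the INSERT OVERWRITE DIRECTORY step: run the query and
// overwrite the target directory with the result, stored as Parquet.
sqlContext.sql("SELECT * FROM x").write.mode("overwrite").parquet(outputPath)

// Counterpart of the CREATE TEMPORARY TABLE ... USING ... step: register a
// temporary table backed by the files that were just written.
sqlContext.read.parquet(outputPath).registerTempTable("y")
```

Since {{y}} is registered through {{registerTempTable}}, nothing is written to the metastore in this variant.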
Ticket title and description were updated accordingly.
> "CREATE TEMPORARY TABLE ... USING ... AS SELECT ..." creates persisted table
> ----------------------------------------------------------------------------
>
> Key: SPARK-14488
> URL: https://issues.apache.org/jira/browse/SPARK-14488
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.0.0
> Reporter: Cheng Lian
> Assignee: Cheng Lian
>
> The following Spark shell snippet reproduces this bug:
> {code}
> sqlContext.range(10).registerTempTable("x")
> // The problematic DDL statement:
> sqlContext.sql("CREATE TEMPORARY TABLE y USING PARQUET AS SELECT * FROM x")
> sqlContext.tables().show()
> {code}
> It shows the following result:
> {noformat}
> +---------+-----------+
> |tableName|isTemporary|
> +---------+-----------+
> |        y|      false|
> |        x|       true|
> +---------+-----------+
> {noformat}
> Note that {{y}} is NOT temporary although it's created using {{CREATE TEMPORARY TABLE ...}}.
> Explain shows that the logical plan node is {{CreateTableUsingAsSelect}} rather than {{CreateTempTableUsingAsSelect}}.
> {noformat}
> == Parsed Logical Plan ==
> 'CreateTableUsingAsSelect `y`, PARQUET, true, [Ljava.lang.String;@4d001a14, None, Overwrite, Map()
> +- 'Project [*]
>    +- 'UnresolvedRelation `x`, None
> == Analyzed Logical Plan ==
> CreateTableUsingAsSelect `y`, PARQUET, true, [Ljava.lang.String;@4d001a14, None, Overwrite, Map()
> +- Project [id#0L]
>    +- SubqueryAlias x
>       +- Range 0, 10, 1, 1, [id#0L]
> == Optimized Logical Plan ==
> CreateTableUsingAsSelect `y`, PARQUET, true, [Ljava.lang.String;@4d001a14, None, Overwrite, Map()
> +- Range 0, 10, 1, 1, [id#0L]
> == Physical Plan ==
> ExecutedCommand CreateMetastoreDataSourceAsSelect `y`, PARQUET, [Ljava.lang.String;@4d001a14, None, Overwrite, Map(), Range 0, 10, 1, 1, [id#0L]|
> {noformat}