Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2019/10/23 07:26:00 UTC

[jira] [Resolved] (SPARK-29324) saveAsTable with overwrite mode results in metadata loss

     [ https://issues.apache.org/jira/browse/SPARK-29324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-29324.
----------------------------------
    Resolution: Not A Problem

> saveAsTable with overwrite mode results in metadata loss
> --------------------------------------------------------
>
>                 Key: SPARK-29324
>                 URL: https://issues.apache.org/jira/browse/SPARK-29324
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Karuppayya
>            Priority: Major
>
> {code:java}
> scala> spark.range(1).write.option("path", "file:///tmp/tbl").format("orc").saveAsTable("tbl")
> scala> spark.sql("desc extended tbl").collect.foreach(println)
> [id,bigint,null]
> [,,]
> [# Detailed Table Information,,]
> [Database,default,]
> [Table,tbl,]
> [Owner,karuppayyar,]
> [Created Time,Wed Oct 02 09:29:06 IST 2019,]
> [Last Access,UNKNOWN,]
> [Created By,Spark 3.0.0-SNAPSHOT,]
> [Type,EXTERNAL,]
> [Provider,orc,]
> [Location,file:/tmp/tbl_loc,]
> [Serde Library,org.apache.hadoop.hive.ql.io.orc.OrcSerde,]
> [InputFormat,org.apache.hadoop.hive.ql.io.orc.OrcInputFormat,]
> [OutputFormat,org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat,]
> {code}
> {code:java}
> // Overwriting table
> scala> spark.range(100).write.mode("overwrite").saveAsTable("tbl")
> scala> spark.sql("desc extended tbl").collect.foreach(println)
> [id,bigint,null]
> [,,]
> [# Detailed Table Information,,]
> [Database,default,]
> [Table,tbl,]
> [Owner,karuppayyar,]
> [Created Time,Wed Oct 02 09:30:36 IST 2019,]
> [Last Access,UNKNOWN,]
> [Created By,Spark 3.0.0-SNAPSHOT,]
> [Type,MANAGED,]
> [Provider,parquet,]
> [Location,file:/tmp/tbl,]
> [Serde Library,org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe,]
> [InputFormat,org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat,]
> [OutputFormat,org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat,]
> {code}
>  
>  
> The first code block creates an EXTERNAL table in ORC format.
> The second code block overwrites it with more data.
> After the overwrite:
> 1. The external table has become a managed table.
> 2. The file format has changed from ORC to Parquet (the default file format).
> Other information (such as the owner and TBLPROPERTIES) is also overwritten.
>  
>  
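
This message does not explain the "Not A Problem" resolution. One plausible reading is that saveAsTable in overwrite mode replaces the table definition as well as the data, so any setting not repeated on the writer falls back to its default. The sketch below is an illustration under that assumption, not a fix taken from the ticket: it repeats the format and path from the first code block when overwriting. The object name and app name are hypothetical, and the local session is built only to keep the example self-contained.

{code:scala}
import org.apache.spark.sql.{SaveMode, SparkSession}

// Hypothetical object name, used only for this sketch.
object OverwriteKeepingTableSettings {
  def main(args: Array[String]): Unit = {
    // In spark-shell a session named `spark` already exists; it is built
    // here only so the sketch runs on its own.
    val spark = SparkSession.builder()
      .appName("spark-29324-sketch")
      .master("local[*]")
      .getOrCreate()

    // Repeat the settings that should survive the overwrite:
    // the provider (orc) and the external location (path).
    spark.range(100).write
      .mode(SaveMode.Overwrite)
      .format("orc")
      .option("path", "file:///tmp/tbl")
      .saveAsTable("tbl")

    // Inspect the resulting table metadata.
    spark.sql("desc extended tbl").collect.foreach(println)

    spark.stop()
  }
}
{code}

If only the data should be replaced and the existing table definition left alone, write.mode("overwrite").insertInto("tbl") may be a closer match, since it writes into the existing table rather than recreating it.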



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
