Posted to issues@spark.apache.org by "chen xiao (JIRA)" <ji...@apache.org> on 2018/08/22 06:48:00 UTC

[jira] [Updated] (SPARK-25193) insert overwrite doesn't throw exception when drop old data fails

     [ https://issues.apache.org/jira/browse/SPARK-25193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chen xiao updated SPARK-25193:
------------------------------
    Description: 
dataframe.write.mode(SaveMode.Overwrite).insertInto(s"$databaseName.$tableName")

In overwrite mode, the insert drops any old data already present in the Hive table.

If deleting the old data fails, however, no exception is thrown, and the data folder ends up looking like this:

hdfs://uxs_nbp/nba_score/dt=2018-08-15/seq_num=2/part-00000

hdfs://uxs_nbp/nba_score/dt=2018-08-15/seq_num=2/part-000001534916642513.

As a result, two copies of the data are kept.

  was:
dataframe.write.mode(SaveMode.Overwrite).insertInto(s"$databaseName.$tableName")

Insert overwrite mode will drop old data in hive table if there's old data.

But if data deleting fails, no exception will be thrown and the data folder will be like:

hdfs://uxs_nbp/nba_score/dt=2018-08-15/seq_num=2/part-00000

hdfs://uxs_nbp/nba_score/dt=2018-08-15/seq_num=2/part-000001534916642513.


> insert overwrite doesn't throw exception when drop old data fails
> -----------------------------------------------------------------
>
>                 Key: SPARK-25193
>                 URL: https://issues.apache.org/jira/browse/SPARK-25193
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.2
>            Reporter: chen xiao
>            Priority: Major
>
> dataframe.write.mode(SaveMode.Overwrite).insertInto(s"$databaseName.$tableName")
> In overwrite mode, the insert drops any old data already present in the Hive table.
> If deleting the old data fails, however, no exception is thrown, and the data folder ends up looking like this:
> hdfs://uxs_nbp/nba_score/dt=2018-08-15/seq_num=2/part-00000
> hdfs://uxs_nbp/nba_score/dt=2018-08-15/seq_num=2/part-000001534916642513.
> As a result, two copies of the data are kept.
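
Until the failed delete is surfaced as an exception, a caller could check after the write whether old files survived. The following is only a minimal sketch of that idea, not Spark's or Hive's behavior: it uses the local filesystem through java.io.File as a stand-in for HDFS, it assumes that files last modified before the job started are leftovers, and the names OverwriteCheck, staleFiles and assertOverwritten are hypothetical helpers, not Spark APIs.

```scala
import java.io.{File, IOException}

object OverwriteCheck {
  // Hypothetical helper: regular files directly under `dir` whose last
  // modification time is earlier than `jobStartMillis` (epoch milliseconds).
  // Assumption: anything older than the job start is leftover old data.
  def staleFiles(dir: File, jobStartMillis: Long): Seq[File] = {
    // listFiles() returns null when `dir` is not a readable directory.
    val entries = Option(dir.listFiles()).getOrElse(
      throw new IOException(s"Cannot list $dir"))
    entries
      .filter(f => f.isFile && f.lastModified() < jobStartMillis)
      .sortBy(_.getName)
      .toSeq
  }

  // Hypothetical helper: fail loudly instead of silently keeping two copies.
  def assertOverwritten(dir: File, jobStartMillis: Long): Unit = {
    val stale = staleFiles(dir, jobStartMillis)
    if (stale.nonEmpty) {
      throw new IOException(
        s"Overwrite left ${stale.size} old file(s) in $dir: " +
          stale.map(_.getName).mkString(", "))
    }
  }
}
```

In a real job the same check would presumably be run against the partition directory through the Hadoop FileSystem API (for example listStatus and FileStatus.getModificationTime) rather than java.io.File, with the start time recorded just before the insertInto call.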



