Posted to issues@spark.apache.org by "pengfei zhao (Jira)" <ji...@apache.org> on 2022/11/10 06:03:00 UTC

[jira] [Created] (SPARK-41094) The saveAsTable method fails to be executed, resulting in data file loss

pengfei zhao created SPARK-41094:
------------------------------------

             Summary: The saveAsTable method fails to be executed, resulting in data file loss
                 Key: SPARK-41094
                 URL: https://issues.apache.org/jira/browse/SPARK-41094
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.4.4
            Reporter: pengfei zhao


We have a problem in the production environment. 
The code is: df.write.mode(SaveMode.Overwrite).saveAsTable("xxx").
While saveAsTable was executing, an executor exited due to OOM, so only part of the data files had been written to HDFS, and the subsequent Spark retries failed as well. 
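
For context, a minimal sketch of the write path involved (runnable in spark-shell; the DataFrame contents and the table name "xxx" are placeholders, not the real production job):

    import org.apache.spark.sql.SaveMode

    // Toy DataFrame standing in for the real production data.
    val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

    // With SaveMode.Overwrite, the existing table is dropped before the new
    // plan runs, so an executor OOM mid-write plus failed retries leaves only
    // partial data files on HDFS and no table to fall back to.
    df.write.mode(SaveMode.Overwrite).saveAsTable("xxx")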

This is very similar to the scenario described in SPARK-22504, but in our case it actually happened.

I read the source code. Why does Spark need to drop the table first and only then execute the plan? What happens if the execution fails after the table has been dropped?

I understand the community's position, but dropping the table first is too risky. Could we instead adopt a Hive-like approach (a rough sketch follows the list):
1. WRITE: create a temporary table and write the new data into it
2. SWAP: swap the temporary table with the target table via a rename operation
3. CLEAN: clean up the old data
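
For illustration only, here is a hand-rolled sketch of that WRITE/SWAP/CLEAN sequence using plain Spark SQL from application code; the helper name safeOverwrite and the tmp_/old_ table-name prefixes are hypothetical, and this is not how saveAsTable is implemented today:

    import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}

    // Hypothetical helper; the tmp_/old_ naming scheme is a placeholder.
    def safeOverwrite(spark: SparkSession, df: DataFrame, target: String): Unit = {
      val tmp = s"tmp_$target"
      val old = s"old_$target"

      // WRITE: materialize the new data in a temporary table first
      df.write.mode(SaveMode.Overwrite).saveAsTable(tmp)

      // SWAP: move the current table aside, then rename the temp table into place
      if (spark.catalog.tableExists(target)) {
        spark.sql(s"ALTER TABLE $target RENAME TO $old")
      }
      spark.sql(s"ALTER TABLE $tmp RENAME TO $target")

      // CLEAN: drop the old data only after the swap has succeeded
      spark.sql(s"DROP TABLE IF EXISTS $old")
    }

With this ordering, a failure during the write leaves the target table untouched, and a failure during the swap still leaves the old data recoverable under the renamed table.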



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
