Posted to issues@spark.apache.org by "Sean R. Owen (Jira)" <ji...@apache.org> on 2022/12/05 14:47:00 UTC
[jira] [Resolved] (SPARK-40286) Load Data from S3 deletes data source file
[ https://issues.apache.org/jira/browse/SPARK-40286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean R. Owen resolved SPARK-40286.
----------------------------------
Resolution: Not A Problem
> Load Data from S3 deletes data source file
> ------------------------------------------
>
> Key: SPARK-40286
> URL: https://issues.apache.org/jira/browse/SPARK-40286
> Project: Spark
> Issue Type: Question
> Components: Documentation
> Affects Versions: 3.2.1
> Reporter: Drew
> Priority: Major
>
> Hello,
> I'm using Spark to [load data|https://spark.apache.org/docs/latest/sql-ref-syntax-dml-load.html] into a Hive table through PySpark, and when I load data from a path in Amazon S3, the original file is removed from the source directory. The file is found, and the table is populated with data. I also tried adding the LOCAL clause, but that throws an error when looking for the file. The documentation doesn't explicitly state that this is the intended behavior.
> Thanks in advance!
> {code:java}
> spark.sql("CREATE TABLE src (key INT, value STRING) STORED AS textfile")
> spark.sql("LOAD DATA INPATH 's3://bucket/kv1.txt' OVERWRITE INTO TABLE src"){code}
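This behavior matches Hive's LOAD DATA semantics, which is why the issue was resolved "Not A Problem": LOAD DATA INPATH *moves* the file into the table's storage location rather than copying it, so the source file disappearing is expected. A minimal sketch of the move-vs-copy distinction in plain Python (a filesystem analogy, not Spark itself; the paths are illustrative):

```python
import os
import shutil
import tempfile

# Illustrative paths, standing in for the S3 source and the table's
# warehouse location in the report above.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "kv1.txt")
warehouse = os.path.join(workdir, "warehouse")
os.makedirs(warehouse)

with open(source, "w") as f:
    f.write("1\tvalue1\n")

# Move semantics (analogous to LOAD DATA INPATH): the source file is
# relocated into the warehouse, so it no longer exists at the old path.
shutil.move(source, os.path.join(warehouse, "kv1.txt"))

print(os.path.exists(source))                                  # False
print(os.path.exists(os.path.join(warehouse, "kv1.txt")))      # True
```

To keep the original file in place, one option (an alternative approach, not part of the original report) is to read the file with `spark.read` and insert the resulting DataFrame into the table, which copies the data instead of moving the file.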
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org