Posted to issues@spark.apache.org by "Evgenii Samusenko (Jira)" <ji...@apache.org> on 2020/09/29 15:37:00 UTC

[jira] [Created] (SPARK-33025) Empty file for the first partition

Evgenii Samusenko created SPARK-33025:
-----------------------------------------

             Summary: Empty file for the first partition
                 Key: SPARK-33025
                 URL: https://issues.apache.org/jira/browse/SPARK-33025
             Project: Spark
          Issue Type: Bug
          Components: Spark Core, SQL
    Affects Versions: 3.0.1
            Reporter: Evgenii Samusenko


If I create a DataFrame with one row and repartition it, Spark writes an empty file for the first partition.

 

Example:

val df = Seq(1).toDF("col1").repartition(8)

df.write.csv("/csv")

 

I got 2 files. The first is the empty file for the first partition, and the second contains the single row from another partition. The same happens with parquet, text, etc.
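
To make the observation easier to check, here is a minimal sketch that lists the part files in an output directory and reports which ones are zero bytes. It uses only the JDK, not any Spark API. The helper names partFileSizes and emptyPartFiles are hypothetical, and it assumes the output was written to the local filesystem and that an empty CSV part file has zero length. A size check like this would not identify an empty Parquet file, since a Parquet file without rows still carries metadata.

```scala
import java.io.File

// Hypothetical helper (not part of Spark): returns (file name, size in bytes)
// for every regular file named part-* directly under `dir`, sorted by name.
// Marker and checksum files such as _SUCCESS or .part-*.crc are skipped
// because their names do not start with "part-".
def partFileSizes(dir: File): Seq[(String, Long)] =
  Option(dir.listFiles()).getOrElse(Array.empty[File]).toSeq
    .filter(f => f.isFile && f.getName.startsWith("part-"))
    .map(f => (f.getName, f.length()))
    .sortBy(_._1)

// Names of the part files that contain no bytes at all.
def emptyPartFiles(dir: File): Seq[String] =
  partFileSizes(dir).collect { case (name, 0L) => name }
```

For the example above, emptyPartFiles(new File("/csv")) would be the call to try after the write has finished.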


