Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2018/11/29 16:32:00 UTC

[jira] [Resolved] (SPARK-26081) Do not write empty files by text datasources

     [ https://issues.apache.org/jira/browse/SPARK-26081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-26081.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 3.0.0

Issue resolved by pull request 23052
[https://github.com/apache/spark/pull/23052]

> Do not write empty files by text datasources
> --------------------------------------------
>
>                 Key: SPARK-26081
>                 URL: https://issues.apache.org/jira/browse/SPARK-26081
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Maxim Gekk
>            Assignee: Maxim Gekk
>            Priority: Minor
>             Fix For: 3.0.0
>
>
> Text-based datasources such as CSV, JSON and Text produce empty files for empty partitions. This introduces additional overhead when such files are opened and read back. In the current implementation of OutputWriter, output streams are created eagerly, even if no records are written to them. Stream creation can therefore be postponed until the first write.
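
The idea of postponing stream creation until the first write can be illustrated with a minimal sketch. This is not Spark's OutputWriter code and not the change made in the pull request; the names LazyTextWriter and write_partition are hypothetical and exist only for this illustration. It assumes one output file per partition and shows that an empty partition leaves no file behind.

```python
import os
import tempfile


class LazyTextWriter:
    """Hypothetical writer that opens its file only on the first write."""

    def __init__(self, path):
        self.path = path
        self._stream = None  # nothing is created on disk yet

    def write(self, record):
        if self._stream is None:
            # First record for this partition: create the output file now.
            self._stream = open(self.path, "w", encoding="utf-8")
        self._stream.write(record + "\n")

    def close(self):
        # An empty partition never opened a stream, so there is nothing to close.
        if self._stream is not None:
            self._stream.close()
            self._stream = None


def write_partition(path, records):
    """Write all records of one partition; return True if a file was created."""
    writer = LazyTextWriter(path)
    try:
        for record in records:
            writer.write(record)
    finally:
        writer.close()
    return os.path.exists(path)


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    # Hypothetical partition contents: one empty, one with two records.
    print(write_partition(os.path.join(out_dir, "part-00000.txt"), []))
    print(write_partition(os.path.join(out_dir, "part-00001.txt"), ["a", "b"]))
```

An eager variant would call open() in the constructor and so would create a zero-byte file for every empty partition, which is the overhead the issue describes.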



