Posted to issues@spark.apache.org by "Maxim Gekk (JIRA)" <ji...@apache.org> on 2018/11/24 17:45:00 UTC
[jira] [Created] (SPARK-26161) Ignore empty files in load
Maxim Gekk created SPARK-26161:
----------------------------------
Summary: Ignore empty files in load
Key: SPARK-26161
URL: https://issues.apache.org/jira/browse/SPARK-26161
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 2.4.0
Reporter: Maxim Gekk
Currently, empty files are opened during load, and Spark tries to read data from them. In some cases, empty partitions are produced from such empty files, for example in *wholetext* mode of the Text datasource and in *multiLine* mode of the CSV/JSON datasources. This behaviour is unnecessary, and empty files can be skipped on read. Skipping them can reduce the number of tasks submitted for loading empty files.
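The idea can be illustrated with a minimal sketch in plain Python. This is not Spark code and does not reflect how Spark plans file scans; the helper name plan_whole_file_reads and its ignore_empty flag are hypothetical, chosen only to model a whole-file read mode in which every input file gets its own read task. Filtering zero-length files before planning removes the tasks that would otherwise produce empty partitions.

```python
import os
import tempfile


def plan_whole_file_reads(paths, ignore_empty=True):
    """Return the paths that would each be assigned one read task.

    Hypothetical helper, not part of Spark: it models a whole-file read
    mode (one task per file) and optionally drops zero-length files
    before any task is planned.
    """
    if not ignore_empty:
        return list(paths)
    # A zero-length file cannot contribute any rows, so skip it up front.
    return [p for p in paths if os.path.getsize(p) > 0]


def demo():
    """Create three files (one empty) and count planned tasks both ways."""
    with tempfile.TemporaryDirectory() as d:
        contents = {"a.txt": "hello\n", "b.txt": "", "c.txt": "world\n"}
        paths = []
        for name, text in sorted(contents.items()):
            path = os.path.join(d, name)
            with open(path, "w") as f:
                f.write(text)
            paths.append(path)
        tasks_without_filter = plan_whole_file_reads(paths, ignore_empty=False)
        tasks_with_filter = plan_whole_file_reads(paths, ignore_empty=True)
        return len(tasks_without_filter), len(tasks_with_filter)


if __name__ == "__main__":
    without_filter, with_filter = demo()
    print("tasks without filtering:", without_filter)
    print("tasks with filtering:", with_filter)
```

In this toy model the saving is one task per empty file. The size check relies only on file metadata, so no file has to be opened to decide whether it can be skipped.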