Posted to issues@spark.apache.org by "Yevgen Galchenko (JIRA)" <ji...@apache.org> on 2017/07/03 16:04:00 UTC

[jira] [Created] (SPARK-21289) Text and CSV formats do not support custom end-of-line delimiters

Yevgen Galchenko created SPARK-21289:
----------------------------------------

             Summary: Text and CSV formats do not support custom end-of-line delimiters
                 Key: SPARK-21289
                 URL: https://issues.apache.org/jira/browse/SPARK-21289
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.1.1
            Reporter: Yevgen Galchenko
            Priority: Minor


Spark's CSV and text readers always split records on the default line terminators (CR, LF, or CRLF); there is no option to configure a custom delimiter.
The Hadoop option "textinputformat.record.delimiter" is not used by HadoopFileLinesReader to set the delimiter. It can only be set for the Hadoop RDD when textFile() is used to read the file.
A possible solution would be to change HadoopFileLinesReader so that it creates its LineRecordReader with the delimiter specified in the configuration. LineRecordReader already supports passing recordDelimiter in its constructor.
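
To illustrate the requested behaviour, here is a minimal pure-Python sketch of a record reader that splits a byte stream on a caller-supplied delimiter instead of CR, LF or CRLF. It is not Spark or Hadoop code: the function name read_records and its parameters are hypothetical and chosen only for this example. It merely mirrors the idea of constructing a reader with a configured record delimiter.

```python
import io
from typing import BinaryIO, Iterator


def read_records(stream: BinaryIO, delimiter: bytes,
                 chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield records from `stream`, split on `delimiter` (hypothetical helper).

    The delimiter itself is not included in the yielded records.
    """
    if not delimiter:
        raise ValueError("delimiter must be non-empty")
    buffer = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Emit every complete record currently held in the buffer. Because
        # unread bytes stay in the buffer, a delimiter that straddles two
        # chunks is still found on the next pass.
        while True:
            index = buffer.find(delimiter)
            if index < 0:
                break
            yield buffer[:index]
            buffer = buffer[index + len(delimiter):]
    # Whatever remains after the last delimiter is the final record.
    if buffer:
        yield buffer


if __name__ == "__main__":
    data = io.BytesIO(b"a,1|b,2|c,3")
    print(list(read_records(data, b"|")))
```

With "|" as the record delimiter, the sample input is split into three records rather than being treated as a single line, which is the behaviour the text and CSV readers would need to expose through an option.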



