Posted to user@spark.apache.org by Anas Sherwani <an...@gmail.com> on 2015/07/06 11:15:04 UTC

Spark-CSV: Multiple delimiters and Null fields support

Hi all,

Apparently, Spark-CSV only lets us specify a single-character delimiter for
tokenizing data. But what if we have a log file with multiple delimiters, or
even a multi-character delimiter? For example, (field1,field2:field3) with
delimiters [,:], or (field1::field2::field3) with a single multi-character
delimiter [::].
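
To make the two cases concrete, here is a minimal sketch of the tokenizing
behaviour being asked about, written in plain Python with the standard `re`
module. `split_fields` is a hypothetical helper, not part of Spark-CSV, and it
assumes the data has no quoting or escaping.

```python
import re

def split_fields(line, delimiters):
    """Split a line on any of the given delimiter strings (hypothetical helper)."""
    # Try longer delimiters first so "::" is matched before ":" if both are given.
    ordered = sorted(delimiters, key=len, reverse=True)
    # Escape each delimiter so characters like "|" or "." are treated literally.
    pattern = "|".join(re.escape(d) for d in ordered)
    return re.split(pattern, line)

# Multiple single-character delimiters
print(split_fields("field1,field2:field3", [",", ":"]))
# A single multi-character delimiter
print(split_fields("field1::field2::field3", ["::"]))
```

Both calls yield the three fields field1, field2 and field3. In Spark, a
function like this could presumably be applied to each line of a text RDD
before the rows are turned into a DataFrame, at the cost of bypassing
Spark-CSV's own parsing.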

Further, is there a way to specify which values represent null fields? For
example, if any field contains "\n", a null should be stored for that field
in the DataFrame.

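The null handling described above could be sketched as follows, again in
plain Python. `nullify` is a hypothetical helper, and the sketch assumes the
marker is the literal two-character text backslash + n rather than an actual
newline character.

```python
def nullify(fields, null_marker="\\n"):
    """Replace every field equal to the marker with None (hypothetical helper)."""
    # Only exact matches are treated as null; the marker inside a longer
    # value is left untouched.
    return [None if f == null_marker else f for f in fields]

print(nullify(["a", "\\n", "c"]))
```

The middle field comes back as None, which would become a null once the row
is loaded into a DataFrame.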


--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-CSV-Multiple-delimiters-and-Null-fields-support-tp23644.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
