Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2016/10/12 00:50:20 UTC

[jira] [Comment Edited] (SPARK-17878) Support for multiple null values when reading CSV data

    [ https://issues.apache.org/jira/browse/SPARK-17878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15567124#comment-15567124 ] 

Hyukjin Kwon edited comment on SPARK-17878 at 10/12/16 12:50 AM:
-----------------------------------------------------------------

Oh, I didn't mean I am against this. I am just wondering whether it is possible to deal with this in a more general way. If that is not easy for now, I'd rather support this idea, given that we should deal with this problem. (Actually, one of the votes is from me :))


was (Author: hyukjin.kwon):
Oh, I didn't mean I am against this. I am just wondering whether it is possible to deal with this in a more general way. If that is not easy for now, I support this idea.

> Support for multiple null values when reading CSV data
> ------------------------------------------------------
>
>                 Key: SPARK-17878
>                 URL: https://issues.apache.org/jira/browse/SPARK-17878
>             Project: Spark
>          Issue Type: Story
>          Components: SQL
>    Affects Versions: 2.0.1
>            Reporter: Hossein Falaki
>
> There are CSV files out there with multiple values that are supposed to be interpreted as null. As a result, multiple Spark users have asked for this feature to be built into the CSV data source. It can easily be implemented in a backwards-compatible way:
> - Currently, the CSV data source supports an option named {{nullValue}}.
> - We can add logic in {{CSVOptions}} to understand option names that match {{nullValue[\d]}}. This way a user can specify one or more null values in a query.
> {code}
> val df = spark.read.format("CSV").option("nullValue1", "-").option("nullValue2", "*")....
> {code}
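
The option-name matching proposed above could be sketched roughly as follows. This is only an illustration, not Spark's actual {{CSVOptions}} code: the object {{NullValueOptions}} and its methods {{collectNullValues}} and {{toNull}} are hypothetical names, and the pattern is assumed to be {{nullValue}} followed by zero or more digits so that the existing plain {{nullValue}} option keeps working.

```scala
// Hypothetical helper, NOT part of Spark: illustrates the proposed matching only.
object NullValueOptions {
  // Assumption: accept "nullValue" followed by zero or more digits,
  // so the existing single "nullValue" option is still honoured.
  private val NullValueKey = """nullValue\d*""".r

  // Gather the values of every option whose key matches the pattern.
  def collectNullValues(options: Map[String, String]): Set[String] =
    options.iterator.collect {
      case (key, value) if NullValueKey.pattern.matcher(key).matches() => value
    }.toSet

  // Map a raw CSV field to None when it equals one of the null markers.
  def toNull(field: String, nullValues: Set[String]): Option[String] =
    if (nullValues.contains(field)) None else Some(field)
}
```

With the options from the example above, {{collectNullValues}} would return both "-" and "*", and unrelated options such as {{header}} would be ignored. Whether several numbered options or a single delimited option is the better design is exactly the open question in this thread.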



