Posted to issues@flink.apache.org by "Egor Litvinenko (JIRA)" <ji...@apache.org> on 2017/07/27 06:18:00 UTC

[jira] [Comment Edited] (FLINK-7274) ParserError NUMERIC_VALUE_FORMAT_ERROR

    [ https://issues.apache.org/jira/browse/FLINK-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16102768#comment-16102768 ] 

Egor Litvinenko edited comment on FLINK-7274 at 7/27/17 6:17 AM:
-----------------------------------------------------------------

Hi, Fabian

Thank you for the answer.
In my opinion, this isn't expected behaviour, because a quoted value such as "1.2" can be a valid number in a CSV file.
For instance, OpenCSV and PapaParse both handle this case normally, and I'm sure there are more libraries that process it without errors.
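
To illustrate the lenient behaviour described above, here is a minimal sketch in plain Java of stripping the enclosing quote character from a field before numeric conversion. It is not Flink's parser and not how OpenCSV or PapaParse are implemented; the class name QuotedDoubleField and both method names are hypothetical, chosen only for this example.

```java
public class QuotedDoubleField {

    // Hypothetical helper: removes one pair of enclosing quote characters, if present.
    public static String unquote(String field, char quote) {
        String s = field.trim();
        if (s.length() >= 2 && s.charAt(0) == quote && s.charAt(s.length() - 1) == quote) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }

    // Parses a CSV field as a double whether or not it is wrapped in quotes.
    // Throws NumberFormatException if the unquoted content is not a number.
    public static double parseDoubleField(String field, char quote) {
        return Double.parseDouble(unquote(field, quote));
    }

    public static void main(String[] args) {
        // A quoted field taken from the test data in the issue description.
        System.out.println(parseDoubleField("\"73.20101635771319\"", '"'));
        // The same helper accepts an unquoted field.
        System.out.println(parseDoubleField("1.2", '"'));
    }
}
```

With a helper like this, one possible workaround is to read every column as a String and convert the numeric columns in a later step.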


> ParserError NUMERIC_VALUE_FORMAT_ERROR
> --------------------------------------
>
>                 Key: FLINK-7274
>                 URL: https://issues.apache.org/jira/browse/FLINK-7274
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.3.1
>            Reporter: Egor Litvinenko
>              Labels: csvparser
>
> {code:java}
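> // Needs org.apache.flink.api.java.DataSet and org.apache.flink.api.java.tuple.Tuple5.
> // types(...) with five classes yields Tuple5 records, so the result is typed accordingly.
> DataSet<Tuple5<String, Double, Double, Double, Double>> dataSet = env
>                 .readCsvFile("/file/test-data.csv")
>                 .fieldDelimiter(",")
>                 .parseQuotedStrings('"')
>                 .ignoreFirstLine()
>                 .types(String.class, Double.class, Double.class, Double.class, Double.class);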
> {code}
> {code:log}
> Caused by: org.apache.flink.api.common.io.ParseException: Line could not be parsed: '"1950-01-01","73.20101635771319","87.25023810870184","36.0149972876981","46.43200584961114"'
> ParserError NUMERIC_VALUE_FORMAT_ERROR 
> Expect field types: class java.lang.String, class java.lang.Double, class java.lang.Double, class java.lang.Double, class java.lang.Double
> {code}
> Test data example:
> "ID","F1","F2","F3","F4"
> "1950-01-01","73.20101635771319","87.25023810870184","36.0149972876981","46.43200584961114"
> "1950-01-02","22.265361054145394","57.02164143464855","67.24219049572051","43.058275223048035"
> "1950-01-03","45.674551461704915","86.35170144091485","16.18842554618568","6.748071385147735"
> "1950-01-04","8.890850738221644","20.490727535158946","58.32831367590852","17.916755029167952"
> "1950-01-05","38.07336923931018","27.223155544419697","92.67895969507504","60.027033750000335"
> If this data is generated without the quote character, it parses fine.


