Posted to issues@flink.apache.org by "Kurt Young (JIRA)" <ji...@apache.org> on 2017/02/25 04:26:44 UTC

[jira] [Commented] (FLINK-5907) RowCsvInputFormat bug on parsing tsv

    [ https://issues.apache.org/jira/browse/FLINK-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884013#comment-15884013 ] 

Kurt Young commented on FLINK-5907:
-----------------------------------

It looks like RowCsvInputFormat does not correctly handle rows that end with a field delimiter.
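
To illustrate the kind of input involved, here is a minimal plain-Java sketch. It is not Flink's parser code, and the class and method names are hypothetical. It only shows that a line ending with the field delimiter logically carries one more (empty) field, and that a splitter which drops it reports too few columns:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of Flink.
public class TrailingDelimiterDemo {

    // Splits a line on a single-character delimiter and keeps empty fields,
    // including the empty field implied by a trailing delimiter.
    static List<String> splitKeepingTrailing(String line, char delim) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == delim) {
                fields.add(line.substring(start, i));
                start = i + 1;
            }
        }
        // Whatever follows the last delimiter is the final field, even if empty.
        fields.add(line.substring(start));
        return fields;
    }

    public static void main(String[] args) {
        String line = "a\tb\t"; // row that ends with the field delimiter

        // String.split with the default limit discards trailing empty strings.
        int naiveCount = line.split("\t").length;
        int fullCount = splitKeepingTrailing(line, '\t').size();

        System.out.println("naive split: " + naiveCount + " fields");
        System.out.println("keeping trailing empty field: " + fullCount + " fields");
    }
}
```

Whether the attached test.tsv triggers exactly this case is an assumption on my side until the file has been checked against the parser.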

> RowCsvInputFormat bug on parsing tsv
> ------------------------------------
>
>                 Key: FLINK-5907
>                 URL: https://issues.apache.org/jira/browse/FLINK-5907
>             Project: Flink
>          Issue Type: Bug
>          Components: Java API
>    Affects Versions: 1.2.0
>            Reporter: Flavio Pompermaier
>            Assignee: Kurt Young
>              Labels: csv, parsing
>         Attachments: test.tsv
>
>
> The following snippet reproduces the problem (using the attached file as input):
> {code:language=java}
> // Assumes an existing ExecutionEnvironment `env` and a String `testCsv`
> // holding the path of the attached file.
> char fieldDelim = '\t';
> // Read all 51 columns as strings.
> TypeInformation<?>[] fieldTypes = new TypeInformation<?>[51];
> for (int i = 0; i < fieldTypes.length; i++) {
>   fieldTypes[i] = BasicTypeInfo.STRING_TYPE_INFO;
> }
> // Select every column, in order.
> int[] fieldMask = new int[fieldTypes.length];
> for (int i = 0; i < fieldMask.length; i++) {
>   fieldMask[i] = i;
> }
> // Lines are delimited by "\n", fields by a tab.
> RowCsvInputFormat csvIF = new RowCsvInputFormat(
>     new Path(testCsv), fieldTypes, "\n", String.valueOf(fieldDelim), fieldMask, true);
> csvIF.setNestedFileEnumeration(true);
> DataSet<Row> csv = env.createInput(csvIF);
> csv.print();
> {code}


