Posted to issues@arrow.apache.org by "Neal Richardson (JIRA)" <ji...@apache.org> on 2019/07/31 15:09:00 UTC

[jira] [Comment Edited] (ARROW-6004) [C++] CSV reader ignore_empty_lines option doesn't handle empty lines

    [ https://issues.apache.org/jira/browse/ARROW-6004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897263#comment-16897263 ] 

Neal Richardson edited comment on ARROW-6004 at 7/31/19 3:08 PM:
-----------------------------------------------------------------

It's an option that many (most? all?) of the other common CSV readers have, so we should have it too. File readers, especially CSV readers, need to support lots of funky data shapes because real data is usually dirty, and it's not feasible to tell people to rewrite their data. (Even if you could say that, they would have to read the data in order to rewrite it, right?)


was (Author: npr):
It's an option many (most? all?) of the other common CSV readers have, so we should have it too.

> [C++] CSV reader ignore_empty_lines option doesn't handle empty lines
> ---------------------------------------------------------------------
>
>                 Key: ARROW-6004
>                 URL: https://issues.apache.org/jira/browse/ARROW-6004
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Neal Richardson
>            Assignee: Antoine Pitrou
>            Priority: Minor
>              Labels: csv, pull-request-available
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Followup to https://issues.apache.org/jira/browse/ARROW-5747. If {{ignore_empty_lines}} is false and there are empty lines, it fails to parse (again, with {{Invalid: Empty CSV file}}).
> Correct behavior should be to fill those empty lines with missing data for all columns.
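
To make the requested semantics concrete, here is a minimal sketch in plain Python (standard library only). It is not Arrow's implementation and does not use the Arrow API; the function name read_csv_keep_empty_lines is hypothetical, and the sketch assumes a header row and no quoted fields containing embedded newlines. It only illustrates the behavior described above: when empty lines are not ignored, each one becomes a row with a missing value in every column.

```python
import csv


def read_csv_keep_empty_lines(text):
    """Parse CSV text, turning each empty line into a row of None values.

    Hypothetical helper for illustration only; assumes the first line is a
    header and that no quoted field spans multiple lines.
    """
    lines = text.splitlines()
    if not lines:
        raise ValueError("Empty CSV file")

    # The header determines how many columns an empty line must be padded to.
    header = next(csv.reader([lines[0]]))

    rows = []
    for line in lines[1:]:
        if line == "":
            # Empty line: missing data for all columns.
            rows.append([None] * len(header))
        else:
            rows.append(next(csv.reader([line])))
    return header, rows


header, rows = read_csv_keep_empty_lines("a,b\n1,2\n\n3,4\n")
```

In this sketch the empty third line of the input yields a row of two None values rather than a parse error.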


