Posted to jira@arrow.apache.org by "Nate Clark (Jira)" <ji...@apache.org> on 2021/05/07 13:58:00 UTC

[jira] [Updated] (ARROW-12661) [C++] CSV add skip rows after column names

     [ https://issues.apache.org/jira/browse/ARROW-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nate Clark updated ARROW-12661:
-------------------------------
    Description: 
Some programs generate CSV files that carry additional descriptive information about the columns in the row or rows after the column names. For files like this, it would be nice to have an option that reads the first row as column names and then skips those descriptive rows.

This could probably be implemented easily, either as another option parallel to ReadOptions::skip_rows or as a boolean that indicates whether skipping should occur before or after the column names are read.

  was:
Some programs generator csv files with additional descriptive information about the columns on a row after the names. For files like this it would be nice to have an option which reads the first row as column names and then can skip those rows after the names.

This could probably be implemented easily as either another option parallel ReadOptions::skip_rows or a boolean which indicates if skipping should occur before or after the column names are read.


> [C++] CSV add skip rows after column names
> ------------------------------------------
>
>                 Key: ARROW-12661
>                 URL: https://issues.apache.org/jira/browse/ARROW-12661
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Nate Clark
>            Priority: Major
>
> Some programs generate CSV files that carry additional descriptive information about the columns in the row or rows after the column names. For files like this, it would be nice to have an option that reads the first row as column names and then skips those descriptive rows.
> This could probably be implemented easily, either as another option parallel to ReadOptions::skip_rows or as a boolean that indicates whether skipping should occur before or after the column names are read.
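
To make the first alternative in the description concrete, here is a minimal standalone sketch of the requested read order: skip leading rows, take the next row as column names, skip the descriptive rows, then treat the rest as data. It does not use Arrow. SketchReadOptions, SketchTable, SplitLine, ReadCsv and the field name skip_rows_after_names are all assumptions made for illustration, and the splitter ignores quoting, escaping and CRLF line endings.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical options struct. It only mirrors the shape of the proposal and
// is NOT arrow::csv::ReadOptions.
struct SketchReadOptions {
  int32_t skip_rows = 0;              // rows dropped before the column names
  int32_t skip_rows_after_names = 0;  // assumed name: rows dropped after the names
};

// Hypothetical result type: column names plus rows of raw string fields.
struct SketchTable {
  std::vector<std::string> column_names;
  std::vector<std::vector<std::string>> rows;
};

// Naive field splitter: no quoting or escaping, enough for the illustration.
std::vector<std::string> SplitLine(const std::string& line, char delimiter = ',') {
  std::vector<std::string> fields;
  std::string::size_type start = 0;
  while (true) {
    const std::string::size_type end = line.find(delimiter, start);
    if (end == std::string::npos) {
      fields.push_back(line.substr(start));
      return fields;
    }
    fields.push_back(line.substr(start, end - start));
    start = end + 1;
  }
}

SketchTable ReadCsv(const std::string& text, const SketchReadOptions& options) {
  assert(options.skip_rows >= 0 && options.skip_rows_after_names >= 0);
  std::istringstream input(text);
  std::string line;
  SketchTable table;

  // 1. Drop rows that precede the column names.
  for (int32_t i = 0; i < options.skip_rows && std::getline(input, line); ++i) {
  }

  // 2. The next row supplies the column names.
  if (!std::getline(input, line)) return table;
  table.column_names = SplitLine(line);

  // 3. Drop the descriptive rows that follow the names.
  for (int32_t i = 0; i < options.skip_rows_after_names && std::getline(input, line); ++i) {
  }

  // 4. Everything that remains is data.
  while (std::getline(input, line)) {
    table.rows.push_back(SplitLine(line));
  }
  return table;
}
```

With skip_rows_after_names set to 1, an input whose second row holds type descriptions (for example "id,price" followed by "int64,double") yields the names from the first row and data starting at the third. The boolean alternative from the description would instead reuse the single skip_rows count and only change whether step 1 or step 3 consumes it.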



--
This message was sent by Atlassian Jira
(v8.3.4#803005)