Posted to jira@arrow.apache.org by "Apache Arrow JIRA Bot (Jira)" <ji...@apache.org> on 2022/12/08 17:53:00 UTC

[jira] [Commented] (ARROW-17313) [C++] Add Byte Range to CSV Reader ReadOptions

    [ https://issues.apache.org/jira/browse/ARROW-17313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644922#comment-17644922 ] 

Apache Arrow JIRA Bot commented on ARROW-17313:
-----------------------------------------------

This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per [project policy|https://arrow.apache.org/docs/dev/developers/bug_reports.html#issue-assignment]. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.

> [C++] Add Byte Range to CSV Reader ReadOptions
> ----------------------------------------------
>
>                 Key: ARROW-17313
>                 URL: https://issues.apache.org/jira/browse/ARROW-17313
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>            Reporter: Ziheng Wang
>            Assignee: Ziheng Wang
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Sometimes it's desirable to read only a portion of a CSV. The best way to do that is to pass a list of byte ranges to the CSV read options, specifying which parts of the CSV to read. These byte ranges don't necessarily have to be aligned on line break boundaries: the CSV reader should skip anything before the first line break in a byte range and read through to the end of the line on which the range ends.
> Based on discussion, the scope is being reduced. The first implementation will support a single byte range that is assumed to be already aligned on line boundaries.
> It will not handle quotes/returns and other edge cases.
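
The byte-range semantics described above can be illustrated with a minimal pure-Python sketch. This is not the Arrow implementation and not an Arrow API: the function name read_byte_range and its signature are hypothetical. It assumes "\n" line endings and, like the reduced scope, ignores quoted fields and header handling. It also assumes that a line belongs to the range in which its first byte falls, so that adjacent ranges never duplicate or drop a line; the issue does not spell out that detail.

```python
from typing import List


def read_byte_range(data: bytes, start: int, end: int) -> List[bytes]:
    """Return the complete lines whose first byte lies in [start, end).

    Hypothetical helper for illustration only; not part of Arrow.
    """
    if not 0 <= start <= end <= len(data):
        raise ValueError("byte range out of bounds")

    if start == 0:
        pos = 0
    else:
        # Search from start - 1 so that a range beginning exactly at a
        # line start keeps that line instead of skipping it.
        newline = data.find(b"\n", start - 1)
        if newline == -1:
            return []  # no line starts at or after `start`
        pos = newline + 1

    lines = []
    while pos < end:
        newline = data.find(b"\n", pos)
        # Read through the end of the line, even if that passes `end`.
        stop = len(data) if newline == -1 else newline + 1
        lines.append(data[pos:stop])
        pos = stop
    return lines


csv_bytes = b"a,b\n1,2\n3,4\n5,6\n"
first_half = read_byte_range(csv_bytes, 0, 6)    # range ends mid-line
second_half = read_byte_range(csv_bytes, 6, 16)  # range starts mid-line
aligned = read_byte_range(csv_bytes, 4, 8)       # already line-aligned
```

Under these assumptions the two unaligned ranges split the file without overlap, and a range that is already aligned on line boundaries returns the same bytes as a plain slice of the input, which is the case the reduced scope targets.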



--
This message was sent by Atlassian Jira
(v8.20.10#820010)