Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2021/10/26 12:29:00 UTC

[jira] [Commented] (ARROW-12681) [Python] Expose IpcReadOptions to ipc facility

    [ https://issues.apache.org/jira/browse/ARROW-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434325#comment-17434325 ] 

Joris Van den Bossche commented on ARROW-12681:
-----------------------------------------------

In the context of ARROW-14470 for the Feather reader, we have been looking a bit into the IpcReadOptions.

Some observations / questions:

- For writing, we already expose the IpcWriteOptions in Python (so also exposing IpcReadOptions would be consistent with that), although I agree that adding a {{columns}} keyword would be more user-friendly.
- Typically (for the other readers we have), such a {{columns}} keyword for reading only a subset of columns is exposed in the "read" function. But for RecordBatchFileReader, the options are passed when opening the reader. So in the Python API it would be {{RecordBatchFileReader(source, columns=...).read_all()}} rather than {{RecordBatchFileReader(source).read_all(columns=...)}}. Are we OK with that discrepancy on the Python side?
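
To make the two call shapes concrete, here is a rough sketch using toy stand-ins. The classes {{OpenTimeReader}} and {{ReadTimeReader}} are hypothetical names made up for illustration and are not pyarrow APIs, and a plain dict stands in for the IPC file. It only shows where the {{columns}} argument would live in each design, not how the real reader would apply it.

```python
# Toy stand-ins only: these classes are hypothetical and are NOT pyarrow APIs.
class OpenTimeReader:
    """Column selection is fixed when the reader is opened."""

    def __init__(self, source, columns=None):
        self._source = source
        # Resolve the selection once; every later read uses this subset.
        self._columns = list(source) if columns is None else list(columns)

    def read_all(self):
        return {name: self._source[name] for name in self._columns}


class ReadTimeReader:
    """Column selection is chosen on each read call."""

    def __init__(self, source):
        self._source = source

    def read_all(self, columns=None):
        names = list(self._source) if columns is None else list(columns)
        return {name: self._source[name] for name in names}


# An in-memory dict of column name -> values stands in for an IPC file.
source = {"a": [1, 2], "b": [3, 4], "c": [5, 6]}

opened = OpenTimeReader(source, columns=["a", "c"]).read_all()
per_call = ReadTimeReader(source).read_all(columns=["a", "c"])
```

Both variants return the same subset here. The practical difference is that with the open-time shape the subset is tied to the reader's lifetime, while the read-time shape lets each call pick different columns from the same reader.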

> [Python] Expose IpcReadOptions to ipc facility
> ----------------------------------------------
>
>                 Key: ARROW-12681
>                 URL: https://issues.apache.org/jira/browse/ARROW-12681
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Francois Saint-Jacques
>            Priority: Minor
>
> I would like to be able to read only a subset of columns from a given IPC file. To do this, we need to expose the EXPERIMENTAL (is it still?) IpcReaderOptions.include_fields option. The reason is that the file is on remote storage and can't be mmapped, so I want to minimize network transfer.
> I do not know the best way to "pythonize" IpcReaderOptions and would need help on this.


