Posted to jira@arrow.apache.org by "Ian Cook (Jira)" <ji...@apache.org> on 2021/03/25 00:57:00 UTC

[jira] [Commented] (ARROW-9657) [R][Dataset] Expose more FileSystemDatasetFactory options

    [ https://issues.apache.org/jira/browse/ARROW-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17308263#comment-17308263 ] 

Ian Cook commented on ARROW-9657:
---------------------------------

I will break this up into subtasks, beginning with accepting file path(s) in ARROW-12082.

> [R][Dataset] Expose more FileSystemDatasetFactory options
> ---------------------------------------------------------
>
>                 Key: ARROW-9657
>                 URL: https://issues.apache.org/jira/browse/ARROW-9657
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: R
>            Reporter: Neal Richardson
>            Assignee: Ian Cook
>            Priority: Major
>              Labels: dataset
>             Fix For: 4.0.0
>
>
> Options to expose include:
> * ignore_prefixes option
> * Pass an explicit list of files + base directory
> * Exclude invalid files (boolean) option
>
> An important use case this would enable is calling open_dataset("really_really_big_file.csv") on a single large file so that you can partition it and write it back out.


