Posted to jira@arrow.apache.org by "Nicola Crane (Jira)" <ji...@apache.org> on 2021/12/14 17:11:00 UTC

[jira] [Updated] (ARROW-15040) [R] Enable write_csv_arrow to take a RecordBatchReader as input

     [ https://issues.apache.org/jira/browse/ARROW-15040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicola Crane updated ARROW-15040:
---------------------------------
    Description: 
Currently, this code fails:
{code:r}
dataset <- open_dataset("some/folder/with/parquet/files")
write_csv_arrow(dataset, sink = "dataset.csv")
{code}
with this error message:
{code:r}
Error: x must be an object of class 'data.frame', 'RecordBatch', or 'Table', not 'FileSystemDataset'.
{code}
In ARROW-14741, support was added for reading from a RecordBatchReader, so we should now be able to extend {{write_csv_arrow()}} to accept a Dataset or RecordBatchReader as input.

 

Note: We would need to make sure that whichever function implements write_csv(record_batch_reader) can take a filesystem= argument.

  was:
Currently, this code fails:
{code:r}
dataset <- open_dataset("some/folder/with/parquet/files")
write_csv_arrow(dataset, sink = "dataset.csv")
{code}

with this error message:
{code:r}
Error: x must be an object of class 'data.frame', 'RecordBatch', or 'Table', not 'FileSystemDataset'.
{code}

In ARROW-14741, support was added for reading from a RecordBatchReader, so we should be able to now extend {{write_csv_arrow()}} to allow this behaviour.


> [R] Enable write_csv_arrow to take a RecordBatchReader as input
> ---------------------------------------------------------------
>
>                 Key: ARROW-15040
>                 URL: https://issues.apache.org/jira/browse/ARROW-15040
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>            Reporter: Nicola Crane
>            Priority: Major
>
> Currently, this code fails:
> {code:r}
> dataset <- open_dataset("some/folder/with/parquet/files")
> write_csv_arrow(dataset, sink = "dataset.csv")
> {code}
> with this error message:
> {code:r}
> Error: x must be an object of class 'data.frame', 'RecordBatch', or 'Table', not 'FileSystemDataset'.
> {code}
> In ARROW-14741, support was added for reading from a RecordBatchReader, so we should now be able to extend {{write_csv_arrow()}} to accept a Dataset or RecordBatchReader as input.
>  
> Note: We would need to make sure that whichever function implements write_csv(record_batch_reader) can take a filesystem= argument.
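
For context, here is a minimal sketch of one possible workaround while write_csv_arrow() rejects a Dataset: scan the dataset into an in-memory Table, which is already an accepted input class. This is not part of the proposed change. It assumes the whole dataset fits in memory, and the temporary paths and toy data are made up so the example is self-contained.

```r
library(arrow)

# Build a small throwaway Parquet dataset (hypothetical data, for illustration only).
dataset_dir <- tempfile("parquet_dataset_")
write_dataset(
  data.frame(x = 1:3, y = c("a", "b", "c")),
  dataset_dir,
  format = "parquet"
)

dataset <- open_dataset(dataset_dir)

# Materialise the Dataset as a Table; write_csv_arrow() already accepts a Table.
# Assumption: the data is small enough to hold in memory at once.
tbl <- Scanner$create(dataset)$ToTable()

csv_path <- tempfile(fileext = ".csv")
write_csv_arrow(tbl, sink = csv_path)

# Read the CSV back with base R to confirm the round trip.
round_trip <- read.csv(csv_path)
```

Accepting a RecordBatchReader directly, as the issue proposes, would avoid the intermediate Table and so would not require the data to fit in memory.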



--
This message was sent by Atlassian Jira
(v8.20.1#820001)