Posted to jira@arrow.apache.org by "Neal Richardson (Jira)" <ji...@apache.org> on 2022/09/16 14:50:00 UTC

[jira] [Closed] (ARROW-16575) [R] arrow::write_dataset() does nothing with 0 row dataframes in R

     [ https://issues.apache.org/jira/browse/ARROW-16575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson closed ARROW-16575.
-----------------------------------
    Resolution: Information Provided

> [R] arrow::write_dataset() does nothing with 0 row dataframes in R
> ------------------------------------------------------------------
>
>                 Key: ARROW-16575
>                 URL: https://issues.apache.org/jira/browse/ARROW-16575
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>         Environment: Mac OS 12.3, R 4.1
>            Reporter: Adam Black
>            Priority: Minor
>
> In R, a data frame can have zero rows and still carry its column names and types.
>  
> Expected behavior of arrow::write_dataset()
> I would expect a FileSystemDataset with zero rows to be possible, holding only the metadata for the column names and types. Given a data frame with zero rows, arrow::write_dataset() would then create that FileSystemDataset metadata.
>  
> Actual behavior
> arrow::write_dataset() writes nothing when passed a data frame with zero rows: the target directory is not created, so a subsequent arrow::open_dataset() call fails.
>  
> Reproducible example using the current arrow package on CRAN
> {code:r}
> arrow::write_dataset(cars, here::here("cars"))
> arrow::open_dataset(here::here("cars"))
> #> FileSystemDataset with 1 Parquet file
> #> speed: double
> #> dist: double
> #> 
> #> See $metadata for additional Schema metadata
> file.exists(here::here("cars"))
> #> [1] TRUE
> df <- cars[cars$speed > 1000, ]
> nrow(df)
> #> [1] 0
> arrow::write_dataset(df, here::here("df"), format = "feather")
> arrow::open_dataset(here::here("df"))
> #> Error: IOError: Cannot list directory '/private/var/folders/xx/01v98b6546ldnm1rg1_bvk000000gn/T/RtmpGkX0gK/reprex-17c305ed29ad5-nerdy-ram/df'. Detail: [errno 2] No such file or directory
> file.exists(here::here("df"))
> #> [1] FALSE
> {code}
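
One possible workaround for the case described above is sketched here. It is an illustration only, not the resolution recorded on this issue. It assumes that arrow::write_parquet() accepts a zero-row data frame and that a single schema-only file is enough for arrow::open_dataset() to recover the column names and types. The directory and the file name "part-0.parquet" are hypothetical choices for this sketch.

```r
library(arrow)

# Zero-row data frame that still carries column names and types
# (same construction as in the report).
df <- cars[cars$speed > 1000, ]

# Hypothetical target directory for this sketch; created explicitly
# because, per the report, write_dataset() does not create it for
# zero-row input.
dataset_dir <- file.path(tempdir(), "df")
dir.create(dataset_dir, recursive = TRUE, showWarnings = FALSE)

# Assumption: writing one file directly bypasses the dataset writer
# and stores the schema even though there are no rows.
write_parquet(df, file.path(dataset_dir, "part-0.parquet"))

# Open the directory as a dataset and inspect what was recovered.
ds <- open_dataset(dataset_dir)
print(ds$schema)
print(nrow(ds))
```

If the assumptions hold, the directory exists afterwards and the dataset exposes the speed and dist columns with no rows. The trade-off is that partitioning and the other write_dataset() options are not applied.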



--
This message was sent by Atlassian Jira
(v8.20.10#820010)