Posted to jira@arrow.apache.org by "Jonathan Keane (Jira)" <ji...@apache.org> on 2021/11/17 14:17:00 UTC

[jira] [Updated] (ARROW-14428) [R] [C++] Allow me to write_parquet() from an arrow_dplyr_query

     [ https://issues.apache.org/jira/browse/ARROW-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Keane updated ARROW-14428:
-----------------------------------
    Component/s: C++

> [R] [C++] Allow me to write_parquet() from an arrow_dplyr_query 
> ----------------------------------------------------------------
>
>                 Key: ARROW-14428
>                 URL: https://issues.apache.org/jira/browse/ARROW-14428
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, R
>            Reporter: Jonathan Keane
>            Priority: Major
>
> Right now, I can:
> {code}
> ds <- open_dataset("some.parquet")
> ds %>% 
>   mutate(
>     o_orderdate = cast(o_orderdate, date32())  
>   ) %>% 
>   write_dataset(path = "new.parquet")
> {code}
> but I can't:
> {code}
> tab <- read_parquet("some.parquet", as_data_frame = FALSE)
> tab %>% 
>   mutate(
>     o_orderdate = cast(o_orderdate, date32())  
>   ) %>% 
>   write_parquet("new.parquet")
> {code}
> In this case, I can cast the column as a separate command and then call {{write_parquet()}} afterwards, but it would be nice to be able to use {{write_parquet()}} in a pipeline.
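
For reference, the separate-command workaround mentioned in the description might look roughly like the sketch below. It is only an illustration, not part of the original report: the temporary files and the timestamp sample data are hypothetical stand-ins for "some.parquet", and the second variant assumes that compute() on an arrow_dplyr_query returns an in-memory Table in the installed arrow version.

```r
library(arrow)
library(dplyr)

# Hypothetical stand-in for "some.parquet": a tiny file whose o_orderdate
# column is stored as a timestamp rather than a date.
src <- tempfile(fileext = ".parquet")
dst <- tempfile(fileext = ".parquet")
write_parquet(
  data.frame(o_orderdate = as.POSIXct(c("1996-01-02", "1996-12-01"), tz = "UTC")),
  src
)

tab <- read_parquet(src, as_data_frame = FALSE)

# Variant 1: cast the column as a separate command, then write the Table.
tab$o_orderdate <- tab$o_orderdate$cast(date32())
write_parquet(tab, dst)

# Variant 2: stay in a pipeline by materializing the query first.
# Assumption: compute() turns the arrow_dplyr_query into a Table.
read_parquet(src, as_data_frame = FALSE) %>%
  mutate(o_orderdate = cast(o_orderdate, date32())) %>%
  compute() %>%
  write_parquet(dst)
```

Either variant hands write_parquet() a Table rather than an unevaluated query, which is the step the issue asks write_parquet() to perform on its own.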



--
This message was sent by Atlassian Jira
(v8.20.1#820001)