Posted to jira@arrow.apache.org by "Jonathan Keane (Jira)" <ji...@apache.org> on 2021/11/17 14:17:00 UTC
[jira] [Updated] (ARROW-14428) [R] [C++] Allow me to write_parquet() from an arrow_dplyr_query
[ https://issues.apache.org/jira/browse/ARROW-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Keane updated ARROW-14428:
-----------------------------------
Component/s: C++
> [R] [C++] Allow me to write_parquet() from an arrow_dplyr_query
> ----------------------------------------------------------------
>
> Key: ARROW-14428
> URL: https://issues.apache.org/jira/browse/ARROW-14428
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, R
> Reporter: Jonathan Keane
> Priority: Major
>
> Right now, I can:
> {code}
> library(arrow)
> library(dplyr)
>
> ds <- open_dataset("some.parquet")
> ds %>%
>   mutate(
>     o_orderdate = cast(o_orderdate, date32())
>   ) %>%
>   write_dataset(path = "new.parquet")
> {code}
> but I can't:
> {code}
> tab <- read_parquet("some.parquet", as_data_frame = FALSE)
> tab %>%
>   mutate(
>     o_orderdate = cast(o_orderdate, date32())
>   ) %>%
>   write_parquet("new.parquet")
> {code}
> In this case, I can cast the column as a separate command and then call {{write_parquet()}} afterwards, but it would be nice to be able to use {{write_parquet()}} in a pipeline.
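
For reference, the workaround mentioned in the description (casting as a separate command, then writing) might look roughly like the sketch below. It is only an illustration, not code from the report: it assumes the same placeholder file names and {{o_orderdate}} column, and it assumes the column's current type can be cast to {{date32()}} by the installed arrow version.

```r
library(arrow)

# "some.parquet", "new.parquet" and o_orderdate are the placeholders used in
# the report; substitute a real file and column.
tab <- read_parquet("some.parquet", as_data_frame = FALSE)

# Cast eagerly on the Table itself, outside of any dplyr pipeline.
# Assumption: the source type supports a cast to date32().
tab$o_orderdate <- tab$o_orderdate$cast(date32())

# tab is still an Arrow Table, which write_parquet() accepts.
write_parquet(tab, "new.parquet")
```

Another possibility, depending on the arrow version, may be to insert {{compute()}} between {{mutate()}} and {{write_parquet()}} so that the query is evaluated into a Table before it is written.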
--
This message was sent by Atlassian Jira
(v8.20.1#820001)