Posted to jira@arrow.apache.org by "Dewey Dunnington (Jira)" <ji...@apache.org> on 2022/10/07 19:21:00 UTC

[jira] [Resolved] (ARROW-17738) [R] dplyr::compute should convert from grouped arrow_dplyr_query to arrow Table

     [ https://issues.apache.org/jira/browse/ARROW-17738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dewey Dunnington resolved ARROW-17738.
--------------------------------------
    Fix Version/s: 10.0.0
       Resolution: Fixed

Issue resolved by pull request 14160
[https://github.com/apache/arrow/pull/14160]

> [R] dplyr::compute should convert from grouped arrow_dplyr_query to arrow Table
> -------------------------------------------------------------------------------
>
>                 Key: ARROW-17738
>                 URL: https://issues.apache.org/jira/browse/ARROW-17738
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 9.0.0
>            Reporter: SHIMA Tatsuya
>            Assignee: SHIMA Tatsuya
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 10.0.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> {{dplyr::compute()}} is expected to evaluate an arrow_dplyr_query and convert the result to a Table. For a grouped arrow_dplyr_query, however, it does not return a Table; the result is still an arrow_dplyr_query.
> {code:r}
> mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::compute() |> class()
> #> [1] "arrow_dplyr_query"
> mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::ungroup() |> dplyr::compute() |> class()
> #> [1] "Table"        "ArrowTabular" "ArrowObject"  "R6"
> {code}
> By contrast, {{arrow::as_arrow_table()}} does return a Table for the same grouped query:
> {code:r}
> mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> class()
> #> [1] "arrow_dplyr_query"
> mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::compute() |> class()
> #> [1] "arrow_dplyr_query"
> mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::collect(as_data_frame = FALSE) |> class()
> #> [1] "arrow_dplyr_query"
> mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> arrow::as_arrow_table() |> class()
> #> [1] "Table"        "ArrowTabular" "ArrowObject"  "R6"
> {code}
> The result appears to be converted back to an arrow_dplyr_query in the following lines of {{r/R/dplyr-collect.R}}:
> [https://github.com/apache/arrow/blob/7cfdfbb0d5472f8f8893398b51042a3ca1dd0adf/r/R/dplyr-collect.R#L73-L75]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)