Posted to issues@arrow.apache.org by "eitsupi (via GitHub)" <gi...@apache.org> on 2023/05/05 09:12:09 UTC

[GitHub] [arrow] eitsupi opened a new issue, #35445: [R] Behavior something like `group_by(foo) |> across(everything())` is different from dplyr

eitsupi opened a new issue, #35445:
URL: https://github.com/apache/arrow/issues/35445

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   In dplyr, I believe that `across(everything())` on a grouped data frame does not select the grouping columns.
   
   ``` r
   mtcars |>
     dplyr::group_by(cyl) |>
     dplyr::summarise(dplyr::across(everything(), sum))
   #> # A tibble: 3 × 11
   #>     cyl   mpg  disp    hp  drat    wt  qsec    vs    am  gear  carb
   #>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
   #> 1     4  293. 1156.   909  44.8  25.1  211.    10     8    45    17
   #> 2     6  138. 1283.   856  25.1  21.8  126.     4     3    27    24
   #> 3     8  211. 4943.  2929  45.2  56.0  235.     0     2    46    49
   ```
   
   <sup>Created on 2023-05-05 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>
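   
   The dplyr behaviour described above can be checked programmatically. The following is a minimal sketch, assuming only that dplyr is installed; the variable names `grouped` and `res` are illustrative and not part of the original report.
   
   ``` r
   library(dplyr)
   
   grouped <- group_by(mtcars, cyl)
   res <- summarise(grouped, across(everything(), sum))
   
   # The grouping column should appear exactly once, as the group key
   sum(names(res) == "cyl")
   
   # The summarised columns should be every column except the grouping one
   identical(names(res)[-1], setdiff(names(mtcars), group_vars(grouped)))
   ```
   
   If both checks hold, `everything()` inside `across()` was resolved against the non-grouping columns only.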
   
   However, arrow does not seem to exclude the grouping columns, so the following example fails with an error.
   (Tested with arrow 12.0.0.20230503, installed from R-universe.)
   
   ``` r
   mtcars |>
     arrow::as_arrow_table() |>
     dplyr::group_by(cyl) |>
     dplyr::summarise(dplyr::across(everything(), sum)) |>
     dplyr::collect()
   #> Error in `compute.arrow_dplyr_query()`:
   #> ! Invalid: Multiple matches for FieldRef.Name(cyl) in mpg: double
   #> cyl: double
   #> disp: double
   #> hp: double
   #> drat: double
   #> wt: double
   #> qsec: double
   #> vs: double
   #> am: double
   #> gear: double
   #> carb: double
   #> cyl: double
   #> Backtrace:
   #>     ▆
   #>  1. ├─dplyr::collect(...)
   #>  2. └─arrow:::collect.arrow_dplyr_query(...)
   #>  3.   └─arrow:::compute.arrow_dplyr_query(x)
   #>  4.     └─base::tryCatch(...)
   #>  5.       └─base (local) tryCatchList(expr, classes, parentenv, handlers)
   #>  6.         └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
   #>  7.           └─value[[3L]](cond)
   #>  8.             └─arrow:::augment_io_error_msg(e, call, schema = schema())
   #>  9.               └─rlang::abort(msg, call = call)
   ```
   
   <sup>Created on 2023-05-05 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>
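   
   Until the behaviour is aligned with dplyr, one possible workaround is to exclude the grouping column from the selection explicitly. This is only a sketch: it assumes that arrow's `across()` support accepts a negative tidyselect selection such as `-cyl`, which is not confirmed anywhere in this thread, and it is guarded so that it does nothing when arrow is not installed.
   
   ``` r
   library(dplyr)
   
   if (requireNamespace("arrow", quietly = TRUE)) {
     res <- mtcars |>
       arrow::as_arrow_table() |>
       group_by(cyl) |>
       # Assumption: dropping the grouping column by hand avoids selecting `cyl` twice
       summarise(across(-cyl, sum)) |>
       collect()
   
     # `cyl` should now appear only once, as the group key
     print(names(res))
   }
   ```
   
   The same pipeline without `arrow::as_arrow_table()` and `collect()` runs in plain dplyr, which makes it easy to compare the two results side by side.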
   
   ### Component(s)
   
   R


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] thisisnic commented on issue #35445: [R] Behavior something like `group_by(foo) |> across(everything())` is different from dplyr

Posted by "thisisnic (via GitHub)" <gi...@apache.org>.
thisisnic commented on issue #35445:
URL: https://github.com/apache/arrow/issues/35445#issuecomment-1536487182

   Thanks for reporting this @eitsupi; can confirm this is reproducible and is a bug we should fix.




[GitHub] [arrow] thisisnic closed issue #35445: [R] Behavior something like `group_by(foo) |> across(everything())` is different from dplyr

Posted by "thisisnic (via GitHub)" <gi...@apache.org>.
thisisnic closed issue #35445: [R] Behavior something like `group_by(foo) |> across(everything())` is different from dplyr
URL: https://github.com/apache/arrow/issues/35445

