Posted to issues@arrow.apache.org by "gongcastro (via GitHub)" <gi...@apache.org> on 2023/05/04 12:27:11 UTC
[GitHub] [arrow] gongcastro opened a new issue, #35431: [R] Error when creating a sequence with`n()`
gongcastro opened a new issue, #35431:
URL: https://github.com/apache/arrow/issues/35431
### Describe the bug, including details regarding any error messages, version, and platform.
Hi! I wanted to create a variable in a data frame with the cumulative counts of some other variable.
Without using Arrow, I get what I need:
```r
library(dplyr)
library(tibble)
mtcars |>
  rownames_to_column("model") |>
  select(model, cyl) |>
  group_by(cyl) |>
  mutate(seq_counts = 1:n())
```
Which returns:
```
# A tibble: 32 × 3
   model               cyl seq_counts
   <chr>             <dbl>      <int>
 1 Mazda RX4             6          1
 2 Mazda RX4 Wag         6          2
 3 Datsun 710            4          1
 4 Hornet 4 Drive        6          3
 5 Hornet Sportabout     8          1
 6 Valiant               6          4
 7 Duster 360            8          2
 8 Merc 240D             4          2
 9 Merc 230              4          3
10 Merc 280              6          5
```
Since Arrow does not support `n()` yet, I'm using `to_duckdb()` to continue the pipeline. (I'm using `mtcars` here for minimal reproducibility, but my actual dataset is much larger, hence the need for Arrow/DuckDB.) But when I run the same code after `to_duckdb()`, I get the following error:
```r
mtcars |>
  rownames_to_column("model") |>
  to_duckdb() |>
  select(model, cyl) |>
  group_by(cyl) |>
  mutate(seq_counts = 1:n())
```
```
Error in `purrr::pmap()`:
ℹ In index: 3.
ℹ With name: seq_counts.
Caused by error in `from:to`:
! NA/NaN argument
Run `rlang::last_trace()` to see where the error occurred.
Warning message:
In 1:n() : NAs introduced by coercion
```
I encounter the same error when defining `n()` in a separate variable (e.g., `mutate(n_total = n(), seq_counts = 1:n_total)`), and when using `seq()` instead of `:` to make the sequence.
Thanks!
This is my `sessionInfo()`:
```
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)
Matrix products: default
locale:
[1] LC_COLLATE=Spanish_Spain.utf8 LC_CTYPE=Spanish_Spain.utf8
[3] LC_MONETARY=Spanish_Spain.utf8 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] arrow_11.0.0.3 tibble_3.2.1 dplyr_1.1.2 devtools_2.4.3 usethis_2.1.5
loaded via a namespace (and not attached):
[1] pillar_1.9.0 compiler_4.2.2 dbplyr_2.1.1 prettyunits_1.1.1
[5] remotes_2.4.2 tools_4.2.2 pkgbuild_1.3.1 pkgload_1.3.2
[9] bit_4.0.5 memoise_2.0.1 lifecycle_1.0.3 pkgconfig_2.0.3
[13] rlang_1.1.0 cli_3.6.0 DBI_1.1.3 fastmap_1.1.0
[17] duckdb_0.7.1-1 withr_2.5.0 generics_0.1.3 fs_1.5.2
[21] vctrs_0.6.2 bit64_4.0.5 tidyselect_1.2.0 glue_1.6.2
[25] R6_2.5.1 processx_3.8.1 fansi_1.0.3 sessioninfo_1.2.2
[29] callr_3.7.3 purrr_1.0.1 tzdb_0.3.0 blob_1.2.3
[33] magrittr_2.0.3 ps_1.7.5 ellipsis_0.3.2 assertthat_0.2.1
[37] utf8_1.2.2 cachem_1.0.6 crayon_1.5.2
```
### Component(s)
R
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] thisisnic closed issue #35431: [R] Error when creating a sequence with`n()`
Posted by "thisisnic (via GitHub)" <gi...@apache.org>.
thisisnic closed issue #35431: [R] Error when creating a sequence with`n()`
URL: https://github.com/apache/arrow/issues/35431
[GitHub] [arrow] thisisnic commented on issue #35431: [R] Error when creating a sequence with`n()`
Posted by "thisisnic (via GitHub)" <gi...@apache.org>.
thisisnic commented on issue #35431:
URL: https://github.com/apache/arrow/issues/35431#issuecomment-1534913273
Thanks for reporting this @gongcastro! Once you call `to_duckdb()`, the object is converted to a virtual DuckDB table, so the error you're seeing likely doesn't originate in the Arrow codebase. You might be best off opening an issue on [the DuckDB repo](https://github.com/duckdb/duckdb/issues).
I've pasted a reprex below which shows this error being recreated using just duckdb without arrow:
``` r
library(duckdb)
library(dplyr)
# with dplyr
mtcars %>%
  group_by(am) |>
  mutate(seq_counts = 1:n()) |>
  collect()
#> # A tibble: 32 × 12
#> # Groups:   am [2]
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb seq_counts
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>      <int>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4          1
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4          2
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1          3
#>  4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1          1
#>  5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2          2
#>  6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1          3
#>  7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4          4
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2          5
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2          6
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4          7
#> # ℹ 22 more rows
# with DuckDB
con <- dbConnect(duckdb::duckdb(), dbdir = ":memory:")
duckdb::duckdb_register(con, "mtcars", mtcars)
tbl(con, "mtcars") |>
  group_by(am) |>
  mutate(seq_counts = 1:n()) |>
  collect()
#> Warning in 1:n(): NAs introduced by coercion
#> Error in `purrr::pmap()`:
#> ℹ In index: 2.
#> Caused by error in `from:to`:
#> ! NA/NaN argument
#> Backtrace:
#> ▆
#> 1. ├─dplyr::collect(mutate(group_by(tbl(con, "mtcars"), am), seq_counts = 1:n()))
#> 2. ├─dbplyr:::collect.tbl_sql(...)
#> 3. │ ├─dbplyr::db_sql_render(x$src$con, x, cte = cte)
#> 4. │ └─dbplyr:::db_sql_render.DBIConnection(x$src$con, x, cte = cte)
#> 5. │ ├─dbplyr::sql_render(sql, con = con, ..., cte = cte)
#> 6. │ └─dbplyr:::sql_render.tbl_lazy(sql, con = con, ..., cte = cte)
#> 7. │ ├─dbplyr::sql_render(...)
#> 8. │ └─dbplyr:::sql_render.lazy_query(...)
#> 9. │ ├─dbplyr::sql_build(query, con = con, ...)
#> 10. │ └─dbplyr:::sql_build.lazy_select_query(query, con = con, ...)
#> 11. │ └─dbplyr:::get_select_sql(...)
#> 12. │ └─dbplyr:::translate_select_sql(con, select)
#> 13. │ └─purrr::pmap(...)
#> 14. │ └─purrr:::pmap_("list", .l, .f, ..., .progress = .progress)
#> 15. │ ├─purrr:::with_indexed_errors(...)
#> 16. │ │ └─base::withCallingHandlers(...)
#> 17. │ └─dbplyr (local) .f(...)
#> 18. │ └─dbplyr::translate_sql_(...)
#> 19. │ └─base::lapply(...)
#> 20. │ └─dbplyr (local) FUN(X[[i]], ...)
#> 21. │ ├─dbplyr::escape(eval_tidy(x, mask), con = con)
#> 22. │ └─rlang::eval_tidy(x, mask)
#> 23. ├─1:n()
#> 24. └─base::.handleSimpleError(`<fn>`, "NA/NaN argument", base::quote(from:to))
#> 25. └─purrr (local) h(simpleError(msg, call))
#> 26. └─cli::cli_abort(c(i = "In index: {i}."), parent = cnd, call = error_call)
#> 27. └─rlang::abort(...)
```
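The backtrace suggests the error is raised while dbplyr is still rendering the SQL (`translate_sql_()` calling `eval_tidy()` on `1:n()`), so the expression never reaches the database. A possible workaround, offered as a sketch rather than a confirmed fix for the original dataset, is to use dplyr's `row_number()`, which dbplyr translates to a `ROW_NUMBER()` window function partitioned by the grouping variable. The object names `seq_tbl` and `res` are made up for illustration.

```r
library(dplyr)
library(duckdb)

# Same in-memory setup as the reprex above
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = ":memory:")
duckdb::duckdb_register(con, "mtcars", mtcars)

# row_number() has a SQL translation, unlike 1:n()
seq_tbl <- tbl(con, "mtcars") |>
  group_by(am) |>
  mutate(seq_counts = row_number())

# Inspect the generated SQL before running it
show_query(seq_tbl)

# Pull the result into R
res <- collect(seq_tbl)

DBI::dbDisconnect(con, shutdown = TRUE)
```

Note that SQL does not guarantee row order within a group, so the numbering may not follow the original row order unless an ordering is supplied, for example with `dbplyr::window_order()` before the `mutate()`.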