Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/08/15 13:08:25 UTC
[GitHub] [arrow] paleolimbot commented on pull request #13789: ARROW-14071: [R] Try to arrow_eval user-defined functions
paleolimbot commented on PR #13789:
URL: https://github.com/apache/arrow/pull/13789#issuecomment-1214992856
This is very cool! It's the most important type of user-defined function because it translates entirely to Arrow kernels, so it runs in parallel. A lot of applications will benefit from this!
Have you considered adding a registration step? If you do, you may be able to simplify some of this. The dream, of course, is to not require pre-registration at all, which will require an approach much like the one you've sketched out here (i.e., preprocessing the expression).
<details>
``` r
library(dplyr, warn.conflicts = FALSE)
library(arrow, warn.conflicts = FALSE)
#> Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.
register_user_binding <- function(name, f, env = rlang::caller_env()) {
  # copy the bindings environment because we don't want to set the parent
  # of the one-and-only official bindings environment
  bindings_env <- as.environment(as.list(arrow:::nse_funcs))
  parent.env(bindings_env) <- env
  environment(f) <- bindings_env

  # register for use in Arrow (non-agg)
  arrow:::register_binding(name, f, update_cache = TRUE)

  # in case this is a recursive function
  arrow:::register_binding(name, f, bindings_env)

  # so that the user can call this function, too (most Arrow bindings accept
  # regular input, too)
  invisible(f)
}

nchar2 <- register_user_binding("nchar2", function(x) {
  1 + nchar(x)
})

record_batch(my_string = "1234") %>%
  mutate(
    var1 = nchar(my_string),
    var2 = nchar2(my_string)
  ) %>%
  collect()
#> # A tibble: 1 × 3
#>   my_string  var1  var2
#>   <chr>     <int> <dbl>
#> 1 1234          4     5
```
<sup>Created on 2022-08-15 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup>
</details>
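For anyone following along, the environment trick in the reprex can be tried in base R without touching any Arrow internals. This is only a sketch: `bindings`, `with_bindings()`, and the `utf8_length(...)` string are made-up stand-ins for illustration, not part of the arrow package. The idea is that the function's environment becomes a private copy of the bindings table whose parent is the caller's environment, so translated functions are found first and the user's own variables are still visible.

``` r
# Hypothetical stand-in for a table of translated functions (not arrow's own).
bindings <- new.env()
bindings$nchar <- function(x) paste0("utf8_length(", x, ")")

with_bindings <- function(f, env = parent.frame()) {
  # Copy the table so that re-parenting does not modify the shared one.
  local_bindings <- as.environment(as.list(bindings))
  parent.env(local_bindings) <- env
  # Names inside f now resolve in the copy first, then in the caller's env.
  environment(f) <- local_bindings
  f
}

suffix <- " + 1"
nchar2 <- with_bindings(function(x) paste0(nchar(x), suffix))

# nchar() resolves to the stand-in binding, suffix to the caller's variable.
nchar2("my_string")
#> [1] "utf8_length(my_string) + 1"
```

Copying the table costs a little on every registration, but it leaves the shared environment's parent untouched, which is the concern noted in the comment at the top of `register_user_binding()`.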