You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/14 23:04:37 UTC

[GitHub] [arrow] dragosmg opened a new pull request, #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

dragosmg opened a new pull request, #13160:
URL: https://github.com/apache/arrow/pull/13160

   This PR will allow the use of namespacing with bindings:
   ``` r
   library(arrow, warn.conflicts = FALSE)
   library(dplyr, warn.conflicts = FALSE)
   library(lubridate, warn.conflicts = FALSE)
   
   test_df <- tibble(
     date = as.Date(c("2022-03-22", "2021-07-30", NA))
   )
   
   test_df %>%
     mutate(ddate = lubridate::as_datetime(date)) %>%
     collect()
   #> # A tibble: 3 × 2
   #>   date       ddate              
   #>   <date>     <dttm>             
   #> 1 2022-03-22 2022-03-22 00:00:00
   #> 2 2021-07-30 2021-07-30 00:00:00
   #> 3 NA         NA
   
   test_df %>%
     arrow_table() %>% 
     mutate(ddate = lubridate::as_datetime(date)) %>%
     collect()
   #> # A tibble: 3 × 2
   #>   date       ddate              
   #>   <date>     <dttm>             
   #> 1 2022-03-22 2022-03-22 00:00:00
   #> 2 2021-07-30 2021-07-30 00:00:00
   #> 3 NA         NA
   ```
   
   <sup>Created on 2022-05-14 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup> 
   
   The approach:
   * change the naming convention for bindings, from `"as_datetime"` to `"lubridate_as_datetime"`, for 2 reasons:
     * it gives us an indication of the package / namespace where linking to
     * it avoids the use of the double-colon (`::`) operator (easier dealing with strings only)
   * we register each binding both with its _short_ name (`"as_datetime"`) and its _full_ name (`"lubridate_as_datetime"`)
   * we change the expressions supplied by the user by replacing the double colon (`::`) with an underscore (`_`)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921530667


##########
r/tests/testthat/test-util.R:
##########
@@ -39,3 +39,34 @@ test_that("as_writable_table() errors for invalid input", {
   # make sure other errors make it through
   expect_snapshot_error(wrapper_fun(data.frame(x = I(list(1, "a")))))
 })
+
+test_that("all_funs() identifies namespace-qualified and unqualified functions", {
+  expect_equal(
+    arrow:::all_funs(rlang::quo(pkg::fun())),

Review Comment:
   💯 Oversight on my side



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922133524


##########
r/tests/testthat/test-dplyr-mutate.R:
##########
@@ -140,17 +140,6 @@ test_that("transmute() with unsupported arguments", {
   )
 })
 
-test_that("transmute() defuses dots arguments (ARROW-13262)", {

Review Comment:
   I don't think we need it. We solved the underlying issue (transmute not defusing the ellipsis arg). If we want the test in, we need a function for which the fallback to R works, so it can't be a made up function. We do not have a corresponding test for `mutate()`, for example. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916839256


##########
r/R/util.R:
##########
@@ -93,6 +93,13 @@ all_funs <- function(expr) {
     expr <- quo_get_expr(expr)
   }
   names <- all.names(expr)
+  # if we have namespace-qualified functions rebuild the function name with the
+  # pkg:: prefix
+  for (i in seq_along(names)) {
+    if (names[i] == "::") {
+      names[i] <- paste0(names[i+1], names[i], names[i+2])
+    }
+  }

Review Comment:
   Yep, adding tests is on the to-do list. I will mark this as draft in the meantime.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] thisisnic commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
thisisnic commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873687746


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,16 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)

Review Comment:
   Sure - I guess what I'm envisioning involves being able to achieve that but via `registry` being more complicated than a named list, so some object which has the function's name, namespace, and the function body itself, **but** I'm not advocating for unnecessarily complicating things this way without good reason.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916866570


##########
r/R/dplyr-funcs.R:
##########
@@ -58,13 +58,20 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  if (fun_name == "::") {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921542527


##########
r/R/dplyr-funcs.R:
##########
@@ -116,3 +136,18 @@ create_binding_cache <- function() {
 nse_funcs <- new.env(parent = emptyenv())
 agg_funcs <- new.env(parent = emptyenv())
 .cache <- new.env(parent = emptyenv())
+
+# we register 2 version of the "::" binding - one for use with nse_funcs (below)
+# and another one for use with agg_funcs (in dplyr-summarize.R)
+register_bindings_utils <- function() {
+  register_binding("::", function(lhs, rhs) {
+    lhs_name <- as.character(substitute(lhs))
+    rhs_name <- as.character(substitute(rhs))
+
+    fun_name <- paste0(lhs_name, "::", rhs_name)
+
+    # if we do not have a binding for pkg::fun, then fall back on to the
+    # regular pkg::fun function
+    nse_funcs[[fun_name]] %||% asNamespace(lhs_name)[[rhs_name]]
+  })

Review Comment:
   Thanks. Love the simplification. It makes a lot of sense. It's sometimes weird how I don't see some things when I'm too close to the code.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921529807


##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -201,6 +212,14 @@ test_that("strftime", {
     times
   )
 
+  # with namespacing

Review Comment:
   I think that's what I was doing initially. I will shift back to that approach tomorrow. It would definitely help reduce the size of the test files, some of which are starting to get a bit large.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921532302


##########
r/vignettes/developers/bindings.Rmd:
##########
@@ -191,11 +191,11 @@ As `startsWith()` requires options, direct mapping is not appropriate.
 If the function cannot be mapped directly, some extra work may be needed to 
 ensure that calling the arrow version of the function results in the same result
 as calling the R version of the function.  In this case, the function will need 
-adding to the `nse_funcs` list in `arrow/r/R/dplyr-functions.R`.  Here is how 
-this might look for `startsWith()`:
+adding to the `nse_funcs` function register.  Here is how this might look for 

Review Comment:
   Done. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r875130895


##########
r/R/dplyr-funcs-utils.R:
##########
@@ -0,0 +1,24 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+register_bindings_utils <- function() {

Review Comment:
   Can this live next to `register_binding()` so that the namespacing logic stays together?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r874274771


##########
r/R/dplyr-mutate.R:
##########
@@ -25,6 +25,12 @@ mutate.arrow_dplyr_query <- function(.data,
                                      .after = NULL) {
   call <- match.call()
   exprs <- ensure_named_exprs(quos(...))
+  exprs2 <- exprs
+  # replace `::` with `_` in passed expressions
+  for (i in seq_along(exprs)) {
+    exprs[[i]][[2]] <- gsub("::", "_", rlang::expr_text(exprs[[i]][[2]]))
+    exprs[[i]][[2]] <- parse_expr(exprs[[i]][[2]])
+  }

Review Comment:
   Rather than syntax transformation, I think it would be cleaner to register a binding for `::` and use it:
   
   ``` r
   library(arrow, warn.conflicts = FALSE)
   library(dplyr, warn.conflicts = FALSE)
   
   # :: prefixes because of the current namespace-stripping behaviour
   arrow:::register_binding("::somepkg::some_fun", function(x) x * 2)
   arrow:::register_binding("::::", function(lhs, rhs) {
     lhs_name <- as.character(substitute(lhs))
     rhs_name <- as.character(substitute(rhs))
     arrow:::nse_funcs[[paste0(lhs_name, "::", rhs_name)]]
   })
   arrow:::create_binding_cache()
   
   arrow_table(a = 1) %>% mutate(b = somepkg::some_fun(a))
   #> Table (query)
   #> a: double
   #> b: double (multiply_checked(a, 2))
   #> 
   #> See $.data for the source Arrow object
   ```
   
   ...but make sure to check that this doesn't give absolutely garbage error messages like `Error in (function(...) {...})(arg1, arg2)` because of the inlined function.
   
   



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,16 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)

Review Comment:
   When I added that I used the syntax of `vctrs::s3_register()`, which expects `"package::function"` as it's function argument:
   
   https://github.com/r-lib/vctrs/blob/e86bb9a4b57459eb56099c0b52af37877a9bf515/R/register-s3.R#L52-L55
   
   I'd anticipated that a separate PR would make the change from (e.g.) `register_binding("as_datetime", ...)` to `register_binding("lubridate::as_datetime", ...)`, and perhaps one like this would use the `namespace` variable. I'd envisioned something more complicated but I really like your idea of registering twice, which is much simpler. With the `::` separator it's a bit more robust, since nobody is going to type:
   
   ```r
   table %>% mutate(`lubridate::as_datetime`(something))
   ```
   
   ...by accident, but they *might* type `table %>% mutate(lubridate_as_datetime(something))` (perhaps not with lubridate, but with a shorter package name you could probably get a clash pretty easily, perhaps even between registered function names).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873083697


##########
r/R/dplyr-funcs.R:
##########
@@ -58,7 +58,8 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  # test name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)
 
   previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL

Review Comment:
   I need to add some logic to clean-up both names



##########
r/R/dplyr-mutate.R:
##########
@@ -57,7 +64,7 @@ mutate.arrow_dplyr_query <- function(.data,
     if (inherits(results[[new_var]], "try-error")) {
       msg <- handle_arrow_not_supported(
         results[[new_var]],
-        format_expr(exprs[[i]])
+        format_expr(exprs2[[i]])

Review Comment:
   we return the original expression in case of error, `"lubridate::as_datetime()"`, instead of `"lubridate_as_datime()"` to avoid confusion.



##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -303,7 +303,7 @@ register_bindings_datetime_conversion <- function() {
     )
   })
 
-  register_binding("as_datetime", function(x,
+  register_binding("lubridate_as_datetime", function(x,

Review Comment:
   only did this for 2 bindings: `as_datetime` and `make_difftime`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r875135616


##########
r/R/dplyr-funcs-utils.R:
##########
@@ -0,0 +1,24 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+register_bindings_utils <- function() {

Review Comment:
   Sure. It felt a bit weird to put it in a separate file, but I'm glad I'm not the only one thinking it might be best alongside `register_binding()`. Will move it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1133379666

   I think you're safe to make a user prefix `lubridate::date()` if they do run into this error (and you're also safe to do this in the test, since it's what I'd expect a user to do in the case of an ambiguous function name).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] nealrichardson commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921504256


##########
r/tests/testthat/test-dplyr-filter.R:
##########
@@ -239,6 +239,14 @@ test_that("filter() with between()", {
       filter(between(chr, 1, 2)) %>%
       collect()
   )
+
+  # with namespacing
+  expect_error(

Review Comment:
   Should this assert the error message? As in, this test would pass on master because it would error, but the error message would be different right?



##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -201,6 +212,14 @@ test_that("strftime", {
     times
   )
 
+  # with namespacing

Review Comment:
   What if, instead of duplicating the whole test, you just added a column to the mutate(), like:
   
   ```
     compare_dplyr_binding(
       .input %>%
         mutate(
           x = strftime(datetime, format = formats),
           x2 = base::strftime(datetime, format = formats),
         ) %>%
         collect(),
       times
     )
   ```



##########
r/tests/testthat/test-dplyr-funcs-conditional.R:
##########
@@ -192,6 +212,25 @@ test_that("case_when()", {
     tbl
   )
 
+  # with namespacing
+  compare_dplyr_binding(
+    .input %>%
+      transmute(cw = dplyr::case_when(chr %in% letters[1:3] ~ 1L) + 41L) %>%
+      collect(),
+    tbl
+  )
+
+  compare_dplyr_binding(

Review Comment:
   Why do we need more than one test that you can call the namespaced version?



##########
r/vignettes/developers/bindings.Rmd:
##########
@@ -191,11 +191,11 @@ As `startsWith()` requires options, direct mapping is not appropriate.
 If the function cannot be mapped directly, some extra work may be needed to 
 ensure that calling the arrow version of the function results in the same result
 as calling the R version of the function.  In this case, the function will need 
-adding to the `nse_funcs` list in `arrow/r/R/dplyr-functions.R`.  Here is how 
-this might look for `startsWith()`:
+adding to the `nse_funcs` function register.  Here is how this might look for 

Review Comment:
   ```suggestion
   adding to the `nse_funcs` function registry.  Here is how this might look for 
   ```



##########
r/tests/testthat/test-util.R:
##########
@@ -39,3 +39,34 @@ test_that("as_writable_table() errors for invalid input", {
   # make sure other errors make it through
   expect_snapshot_error(wrapper_fun(data.frame(x = I(list(1, "a")))))
 })
+
+test_that("all_funs() identifies namespace-qualified and unqualified functions", {
+  expect_equal(
+    arrow:::all_funs(rlang::quo(pkg::fun())),

Review Comment:
   You don't need `arrow:::`, testthat runs in the package namespace
   
   ```suggestion
       all_funs(rlang::quo(pkg::fun())),
   ```



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  if (qualified_name == "::") {
+    unqualified_name <- "::"
+  } else {
+    unqualified_name <- gsub("^.*?::", "", qualified_name)
+  }
 
-  previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL
+  previous_fun <- if (unqualified_name %in% names(registry)) registry[[unqualified_name]] else NULL

Review Comment:
   Isn't this just?
   
   ```suggestion
     previous_fun <- registry[[unqualified_name]]
   ```



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name

Review Comment:
   I think you can simplify this function if you just deal with `::` separately. (I'll suggest below.) There's a lot of special casing that you could drop.



##########
r/R/dplyr-funcs.R:
##########
@@ -116,3 +136,18 @@ create_binding_cache <- function() {
 nse_funcs <- new.env(parent = emptyenv())
 agg_funcs <- new.env(parent = emptyenv())
 .cache <- new.env(parent = emptyenv())
+
+# we register 2 version of the "::" binding - one for use with nse_funcs (below)
+# and another one for use with agg_funcs (in dplyr-summarize.R)
+register_bindings_utils <- function() {
+  register_binding("::", function(lhs, rhs) {
+    lhs_name <- as.character(substitute(lhs))
+    rhs_name <- as.character(substitute(rhs))
+
+    fun_name <- paste0(lhs_name, "::", rhs_name)
+
+    # if we do not have a binding for pkg::fun, then fall back on to the
+    # regular pkg::fun function
+    nse_funcs[[fun_name]] %||% asNamespace(lhs_name)[[rhs_name]]
+  })

Review Comment:
   See above comment
   
   ```suggestion
     nse_funcs[["::"] <- function(lhs, rhs) {
       lhs_name <- as.character(substitute(lhs))
       rhs_name <- as.character(substitute(rhs))
   
       fun_name <- paste0(lhs_name, "::", rhs_name)
   
       # if we do not have a binding for pkg::fun, then fall back on to the
       # regular pkg::fun function
       nse_funcs[[fun_name]] %||% asNamespace(lhs_name)[[rhs_name]]
     }
   ```



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  if (qualified_name == "::") {
+    unqualified_name <- "::"
+  } else {
+    unqualified_name <- gsub("^.*?::", "", qualified_name)
+  }
 
-  previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL
+  previous_fun <- if (unqualified_name %in% names(registry)) registry[[unqualified_name]] else NULL
+
+  # if th unqualified name exists in the register, warn
+  if (!is.null(fun) && !is.null(previous_fun)) {
+    warn(
+      paste0(
+        "A \"",
+        unqualified_name,
+        "\" binding already exists in the register and will be overwritten.")
+    )
+  }
 
+  # if fun is NULL remove entries from the function registry

Review Comment:
   Is this a real use case? Do we test this?



##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -937,15 +1175,9 @@ test_that("date works in arrow", {
 
   r_date_object <- lubridate::ymd_hms("2012-03-26 23:12:13")
 
-  # we can't (for now) use namespacing, so we need to make sure lubridate::date()
-  # and not base::date() is being used. This is due to the way testthat runs and
-  # normal use of arrow would not have to do this explicitly.
-  # TODO: remove after ARROW-14575

Review Comment:
   🎉 



##########
r/R/dplyr-summarize.R:
##########
@@ -348,7 +362,7 @@ summarize_eval <- function(name, quosure, ctx, hash) {
   # the list output from the Arrow hash_tdigest kernel to flatten it into a
   # column of type float64. We do that by modifying the unevaluated expression
   # to replace quantile(...) with arrow_list_element(quantile(...), 0L)
-  if (hash && "quantile" %in% funs_in_expr) {
+  if (hash && ("quantile" %in% funs_in_expr || "stats::quantile" %in% funs_in_expr)) {

Review Comment:
   ```suggestion
     if (hash && any(c("quantile", "stats::quantile") %in% funs_in_expr)) {
   ```



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  if (qualified_name == "::") {
+    unqualified_name <- "::"
+  } else {
+    unqualified_name <- gsub("^.*?::", "", qualified_name)
+  }
 
-  previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL
+  previous_fun <- if (unqualified_name %in% names(registry)) registry[[unqualified_name]] else NULL
+
+  # if th unqualified name exists in the register, warn

Review Comment:
   ```suggestion
     # if the unqualified name exists in the register, warn
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r914927871


##########
r/R/arrow-datum.R:
##########
@@ -103,10 +103,10 @@ Ops.ArrowDatum <- function(e1, e2) {
 #' @export
 Math.ArrowDatum <- function(x, ..., base = exp(1), digits = 0) {
   switch(.Generic,
-    abs = ,
-    sign = ,
-    floor = ,
-    ceiling = ,
+    abs = eval_array_expression("abs_checked", x),
+    sign = eval_array_expression("sign", x),
+    floor = eval_array_expression("floor", x),
+    ceiling = eval_array_expression("ceil", x),

Review Comment:
   There are function in the `.unary_function_map` we definitely want to be able to use the prefixed version for, such as `lubridate::year()` or `stringr::str_length()`. The prefixed version might not entirely be necessary for the {base} functions, e.g. `base::abs()`. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921936356


##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -201,6 +212,14 @@ test_that("strftime", {
     times
   )
 
+  # with namespacing

Review Comment:
   I went towards testing the namespaced version of every binding (not just for the special cases) as I thought it would be more thorough if we made sure the approach works in general, but also there is at least one test for each of the bindings (just in case some of them might require some particular treatment). Having exhaustive unit tests allowed me to identify the special cases (such as quantile, for example).  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922147880


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,27 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name

Review Comment:
   Suggestion to change the name was part of an earlier review: https://github.com/apache/arrow/pull/13160#discussion_r914801827



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916165829


##########
r/R/arrow-datum.R:
##########
@@ -103,10 +103,10 @@ Ops.ArrowDatum <- function(e1, e2) {
 #' @export
 Math.ArrowDatum <- function(x, ..., base = exp(1), digits = 0) {
   switch(.Generic,
-    abs = ,
-    sign = ,
-    floor = ,
-    ceiling = ,
+    abs = eval_array_expression("abs_checked", x),
+    sign = eval_array_expression("sign", x),
+    floor = eval_array_expression("floor", x),
+    ceiling = eval_array_expression("ceil", x),

Review Comment:
   ```suggestion
       ceiling = eval_array_expression("ceil", x),
       sign = ,
       floor = ,
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916839256


##########
r/R/util.R:
##########
@@ -93,6 +93,13 @@ all_funs <- function(expr) {
     expr <- quo_get_expr(expr)
   }
   names <- all.names(expr)
+  # if we have namespace-qualified functions rebuild the function name with the
+  # pkg:: prefix
+  for (i in seq_along(names)) {
+    if (names[i] == "::") {
+      names[i] <- paste0(names[i+1], names[i], names[i+2])
+    }
+  }

Review Comment:
   Yep, adding tests is on the to-do list. I will mark this as draft while I add tests and address all outstanding comments.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873084594


##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -1600,17 +1600,20 @@ test_that("`as_datetime()`", {
     double_date = c(10.1, 25.2, NA)
   )
 
-  test_df %>%
-    arrow_table() %>%
-    mutate(
-      ddate = as_datetime(date),
-      dchar_date_no_tz = as_datetime(char_date),
-      dchar_date_non_iso = as_datetime(char_date_non_iso, format = "%Y-%d-%m %H:%M:%S"),
-      dint_date = as_datetime(int_date, origin = "1970-01-02"),
-      dintegerish_date = as_datetime(integerish_date, origin = "1970-01-02"),
-      dintegerish_date2 = as_datetime(integerish_date, origin = "1970-01-01")
-    ) %>%
-    collect()
+  compare_dplyr_binding(

Review Comment:
   I expect some tests will fail, mostly due to the discarding of the characters before the `"_"` during the registration of the binding. If we switch to the `"package_fun_name"` naming convention, that _should_ 🤞🏻be fine. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] jonkeane commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
jonkeane commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r874087967


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,16 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)

Review Comment:
   Nic is absolutely right here, let's think about the design here. Especially using underscores, which are not reserved so could cause clashes | weird design decisions like they are pointing out. 
   
   You might also look at the PR that added much of this code — there's discussion in there that is super helpful: https://github.com/apache/arrow/pull/11904#issuecomment-999258101 I'm sure you've read it as well, but there's some really good content in the jira comments as well (along with some sketches). 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873665067


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,16 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)

Review Comment:
   The `namespace` variable existed there, I haven't touched it. I assume it was added for potential future use. I think it could be removed once we support the `package::` prefix. We could introduce an `arrow` specific prefix. This could be either `arrow_` or `internal_`. In which case both `arrow_fun_name` and `fun_name` would be registered.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873690761


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,16 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)

Review Comment:
   I went for what I see as a simpler approach as it's easier to think about and evaluate an expression when it doesn't contain the `::` operator. Do you see any shortcomings to my proposed approach? (other than the need for "namespacing" internal / arrow functions) 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873690761


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,16 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)

Review Comment:
   I went for what I see as a simpler approach as it's easier to think about and execute an expression when it doesn't contain the `::` operator. Do you see any shortcomings to my proposed approach?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1126824453

   https://issues.apache.org/jira/browse/ARROW-14575


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ursabot commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
ursabot commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1186025202

   Benchmark runs are scheduled for baseline = 29cc263068b983e690879d4d768025439a0fdd47 and contender = 3e0eea1244a066a6aee3262440093df021c37882. 3e0eea1244a066a6aee3262440093df021c37882 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/9841bda21b61485e92ed31889886b639...3dde44e432fb48cd8e1cfd5dfd019a80/)
   [Failed :arrow_down:0.38% :arrow_up:0.03%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/c274a0da0285414293490d171b62049a...f3360a7e93b943378ec99cde038bb55c/)
   [Failed :arrow_down:0.0% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/d138187e31674f54971905ba77ab0b43...57826bd50fd942a8bec6bf28d09f7aee/)
   [Finished :arrow_down:0.14% :arrow_up:0.04%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/30b727eeb27d4e6f8a294e1df4e2cc34...f6a5dab8d5af425e8b9fac8ede887402/)
   Buildkite builds:
   [Finished] [`3e0eea12` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/1123)
   [Failed] [`3e0eea12` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/1127)
   [Failed] [`3e0eea12` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/1115)
   [Finished] [`3e0eea12` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/1131)
   [Finished] [`29cc2630` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/1122)
   [Failed] [`29cc2630` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/1126)
   [Failed] [`29cc2630` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/1114)
   [Finished] [`29cc2630` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/1130)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922133524


##########
r/tests/testthat/test-dplyr-mutate.R:
##########
@@ -140,17 +140,6 @@ test_that("transmute() with unsupported arguments", {
   )
 })
 
-test_that("transmute() defuses dots arguments (ARROW-13262)", {

Review Comment:
   I don't think we need it. We solved the underlying issue (transmute not defusing the ellipsis arg). It we want the test in, we need a function for which the fallback to R works, so it can't be a made up one. We do not have a corresponding test for `mutate()`, for example. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921544627


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name

Review Comment:
   Updated the definition of `register_binding()`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r917020009


##########
r/R/util.R:
##########
@@ -93,6 +93,13 @@ all_funs <- function(expr) {
     expr <- quo_get_expr(expr)
   }
   names <- all.names(expr)
+  # if we have namespace-qualified functions rebuild the function name with the
+  # pkg:: prefix
+  for (i in seq_along(names)) {
+    if (names[i] == "::") {
+      names[i] <- paste0(names[i+1], names[i], names[i+2])
+    }
+  }

Review Comment:
   Done. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r914927871


##########
r/R/arrow-datum.R:
##########
@@ -103,10 +103,10 @@ Ops.ArrowDatum <- function(e1, e2) {
 #' @export
 Math.ArrowDatum <- function(x, ..., base = exp(1), digits = 0) {
   switch(.Generic,
-    abs = ,
-    sign = ,
-    floor = ,
-    ceiling = ,
+    abs = eval_array_expression("abs_checked", x),
+    sign = eval_array_expression("sign", x),
+    floor = eval_array_expression("floor", x),
+    ceiling = eval_array_expression("ceil", x),

Review Comment:
   There are function in the `.unary_function_map` we definitely want to be able to use the prefixed version, such as `lubridate::year()` or `stringr::str_length()`. The prefixed version might not entirely be necessary for the {base} functions, e.g. `base::abs()`. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916169234


##########
r/tests/testthat/test-dplyr-funcs-conditional.R:
##########
@@ -29,7 +29,7 @@ test_that("if_else and ifelse", {
   compare_dplyr_binding(
     .input %>%
       mutate(
-        y = if_else(int > 5, 1, 0)
+        y = dplyr::if_else(int > 5, 1, 0)

Review Comment:
   Could you remove all of the `pkg::` prefixes that are not part of a dedicated test?



##########
r/tests/testthat/test-dataset-dplyr.R:
##########
@@ -119,7 +119,7 @@ test_that("filter() on date32 columns", {
   # Also with timestamp scalar
   expect_equal(
     open_dataset(tmp) %>%
-      filter(date > lubridate::ymd_hms("2020-02-02 00:00:00")) %>%
+      filter(date > lubridate::ymd_hms("2020-02-02 00:00:00", tz = "UTC")) %>%

Review Comment:
   Is this change related to this PR?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1174900950

   I think this is ready for another look, with the following caveats:
   * I still need to write a short paragraph / sentence for the _Writing bindings_ article indicating bindings should be registered with `"pkg::fun"` name. 
   * using the `pkg::` prefix doesn't yet work inside `summarise()` (i.e. with aggregating functions)
   * there is a second approach to achieving the same goal, which can be seen here (https://github.com/dragosmg/arrow/tree/allow_binding_registration_with_pkg_prefix)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922204326


##########
r/tests/testthat/test-dplyr-mutate.R:
##########
@@ -140,17 +140,6 @@ test_that("transmute() with unsupported arguments", {
   )
 })
 
-test_that("transmute() defuses dots arguments (ARROW-13262)", {

Review Comment:
   I used a call to `stringr::str_squish()` (which I don't think we will have bindings for anytime soon), so this is, technically, resolved.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922141453


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,27 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  unqualified_name <- gsub("^.*?::", "", qualified_name)

Review Comment:
   the greedy matching with `:+` doesn't seem to be working (I get your point, I will look into it a bit):
   ``` r
   fun_name <- "somePkg::some_fun"
   
   gsub("^.*?::", "", fun_name)
   #> [1] "some_fun"
   sub("^.*?:+", "", fun_name)
   #> [1] ":some_fun"
   ```
   
   <sup>Created on 2022-07-15 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup> 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] nealrichardson commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1185795881

   @github-actions crossbow submit test-r-minimal-build test-r-versions


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] nealrichardson commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922108970


##########
r/tests/testthat/test-dplyr-funcs-string.R:
##########
@@ -23,6 +23,14 @@ library(lubridate)
 library(stringr)
 library(stringi)
 
+tbl <- example_data
+# Add some better string data
+tbl$verses <- verses[[1]]
+# c(" a ", "  b  ", "   c   ", ...) increasing padding
+# nchar =   3  5  7  9 11 13 15 17 19 21
+tbl$padded_strings <- stringr::str_pad(letters[1:10], width = 2 * (1:10) + 1, side = "both")
+tbl$some_grouping <- rep(c(1, 2), 5)
+

Review Comment:
   Why is this needed? Isn't this a copy of helper-data.R?



##########
r/tests/testthat/test-dplyr-mutate.R:
##########
@@ -140,17 +140,6 @@ test_that("transmute() with unsupported arguments", {
   )
 })
 
-test_that("transmute() defuses dots arguments (ARROW-13262)", {

Review Comment:
   We should keep this test, right? Just change the function to something made up maybe.



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,27 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name

Review Comment:
   Why not just keep it as `fun_name`? It's kinda odd when the first line of a function just renames the function argument.



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,27 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  unqualified_name <- gsub("^.*?::", "", qualified_name)
 
-  previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL
+  previous_fun <- registry[[unqualified_name]]
 
-  if (is.null(fun) && !is.null(previous_fun)) {
-    rm(list = name, envir = registry, inherits = FALSE)
+  # if the unqualified name exists in the registry, warn
+  if (!is.null(fun) && !is.null(previous_fun)) {
+    warn(
+      paste0(
+        "A \"",
+        unqualified_name,
+        "\" binding already exists in the registry and will be overwritten.")
+    )
+  }
+
+  # register both as `pkg::fun` and as `fun` if `qualified_name` is prefixed
+  if (grepl("::", qualified_name) && qualified_name != "::") {

Review Comment:
   ```suggestion
     if (grepl("::", qualified_name)) {
   ```



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,27 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  unqualified_name <- gsub("^.*?::", "", qualified_name)

Review Comment:
   Don't need global substitution, and use `:+` to remove both `::` and `:::`
   
   ```suggestion
     unqualified_name <- sub("^.*?:+", "", fun_name)
   ```



##########
r/tests/testthat/test-dplyr-summarize.R:
##########
@@ -459,6 +482,35 @@ test_that("quantile()", {
   )
 })
 
+test_that("quantile() with namespacing", {
+  suppressWarnings(
+    expect_warning(
+      expect_equal(
+        tbl %>%
+          group_by(some_grouping) %>%
+          summarize(
+            q_dbl = quantile(dbl, probs = 0.5, na.rm = TRUE, names = FALSE),
+            q_int = as.double(
+              quantile(int, probs = 0.5, na.rm = TRUE, names = FALSE)
+            )
+          ) %>%
+          arrange(some_grouping),
+        Table$create(tbl) %>%
+          group_by(some_grouping) %>%
+          summarize(
+            q_dbl = stats::quantile(dbl, probs = 0.5, na.rm = TRUE),

Review Comment:
   Now this is a test you should do with separate namespaced and not versions so you can test that both raise the warning.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873690761


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,16 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)

Review Comment:
   I went for what I see as a simpler approach as it's easier to think about and execute an expression when it doesn't contain the `::` operator.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873665067


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,16 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)

Review Comment:
   * The `namespace` variable existed previously, I haven't touched it. I assume it was added for potential future use. I think it could be removed once we support the `package::` prefix. 
   * We could introduce an `arrow` specific prefix. This could be either `arrow_` or `internal_`. In which case both `arrow_fun_name` and `fun_name` would be registered.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916164829


##########
r/R/arrow-datum.R:
##########
@@ -103,10 +103,10 @@ Ops.ArrowDatum <- function(e1, e2) {
 #' @export
 Math.ArrowDatum <- function(x, ..., base = exp(1), digits = 0) {
   switch(.Generic,
-    abs = ,
-    sign = ,
-    floor = ,
-    ceiling = ,
+    abs = eval_array_expression("abs_checked", x),
+    sign = eval_array_expression("sign", x),
+    floor = eval_array_expression("floor", x),
+    ceiling = eval_array_expression("ceil", x),

Review Comment:
   I understand how we need to register the functions with their namespaces...you're right that keeping the `.unary_function_map` functions namespaced is the way to go. I think what tripped me up is that `abs()` is the only one here that needs special handling (the others, to my reading, can just fall through). I'll leave a code suggestion with what I mean here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916161418


##########
r/R/dplyr-funcs.R:
##########
@@ -58,13 +58,20 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  if (fun_name == "::") {

Review Comment:
   You're totally right (but change the names regardless so that I don't make that mistake again!)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916893690


##########
r/tests/testthat/test-dplyr-summarize.R:
##########
@@ -999,3 +999,12 @@ test_that("summarise() can handle scalars and literal values", {
     tibble(y = 2L)
   )
 })
+

Review Comment:
   Good shout! They didn't, but I have a fix.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r914903365


##########
r/R/dplyr-funcs.R:
##########
@@ -116,3 +124,11 @@ create_binding_cache <- function() {
 nse_funcs <- new.env(parent = emptyenv())
 agg_funcs <- new.env(parent = emptyenv())
 .cache <- new.env(parent = emptyenv())
+
+register_bindings_utils <- function() {
+  register_binding("::", function(lhs, rhs) {
+    lhs_name <- as.character(substitute(lhs))
+    rhs_name <- as.character(substitute(rhs))
+    nse_funcs[[paste0(lhs_name, "::", rhs_name)]]

Review Comment:
   Thanks. Makes sense (I was thinking somewhat along the same lines, but wasn't aware `asNamespace()` existed).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r914864833


##########
r/R/arrow-datum.R:
##########
@@ -103,10 +103,10 @@ Ops.ArrowDatum <- function(e1, e2) {
 #' @export
 Math.ArrowDatum <- function(x, ..., base = exp(1), digits = 0) {
   switch(.Generic,
-    abs = ,
-    sign = ,
-    floor = ,
-    ceiling = ,
+    abs = eval_array_expression("abs_checked", x),
+    sign = eval_array_expression("sign", x),
+    floor = eval_array_expression("floor", x),
+    ceiling = eval_array_expression("ceil", x),

Review Comment:
   For example, for `abs`, I think the `Math.ArrowDatum` fails because it cannot find it, since it's been renamed to `base::abs()` in the `.unary_function_map`. This ☝🏻 was my work around to make sure `abs` is mapped to `abs_checked`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921547337


##########
r/tests/testthat/test-dplyr-funcs-conditional.R:
##########
@@ -192,6 +212,25 @@ test_that("case_when()", {
     tbl
   )
 
+  # with namespacing
+  compare_dplyr_binding(
+    .input %>%
+      transmute(cw = dplyr::case_when(chr %in% letters[1:3] ~ 1L) + 41L) %>%
+      collect(),
+    tbl
+  )
+
+  compare_dplyr_binding(

Review Comment:
   Nope, I'll keep the `filter()` test since that one is a bit more exotic. 🍍  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873084594


##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -1600,17 +1600,20 @@ test_that("`as_datetime()`", {
     double_date = c(10.1, 25.2, NA)
   )
 
-  test_df %>%
-    arrow_table() %>%
-    mutate(
-      ddate = as_datetime(date),
-      dchar_date_no_tz = as_datetime(char_date),
-      dchar_date_non_iso = as_datetime(char_date_non_iso, format = "%Y-%d-%m %H:%M:%S"),
-      dint_date = as_datetime(int_date, origin = "1970-01-02"),
-      dintegerish_date = as_datetime(integerish_date, origin = "1970-01-02"),
-      dintegerish_date2 = as_datetime(integerish_date, origin = "1970-01-01")
-    ) %>%
-    collect()
+  compare_dplyr_binding(

Review Comment:
   I expect some tests will fail, mostly due to the discarding of the characters before the `"_"` during the registration of the binding. If we switch to the `"package_fun_name"` naming convention, that should be fine. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1129177638

   > Looking good...I imagine the next step is to add the function prefixes (including `base::`) to all the `regsiter_binding()` calls and add some tests (maybe one per package?) to make sure that when somebody does call `lubridate::as_datetime()` that it works as we expect.
   
   Yep, that's the plan. There are some tests failing - I need to deal with that too. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r914923372


##########
r/R/dplyr-funcs.R:
##########
@@ -58,13 +58,20 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  if (fun_name == "::") {

Review Comment:
   `name` here is `"fun"` and `fun_name` is `"pkg::fun"`. In this context isn't, technically, `fun_name` the namespace qualified one?  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r914903365


##########
r/R/dplyr-funcs.R:
##########
@@ -116,3 +124,11 @@ create_binding_cache <- function() {
 nse_funcs <- new.env(parent = emptyenv())
 agg_funcs <- new.env(parent = emptyenv())
 .cache <- new.env(parent = emptyenv())
+
+register_bindings_utils <- function() {
+  register_binding("::", function(lhs, rhs) {
+    lhs_name <- as.character(substitute(lhs))
+    rhs_name <- as.character(substitute(rhs))
+    nse_funcs[[paste0(lhs_name, "::", rhs_name)]]

Review Comment:
   Thanks. Makes sense (I was thinking somewhat along the same lines, but wasn't aware `asNamespace()` existed). Done!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916828419


##########
r/tests/testthat/test-dplyr-summarize.R:
##########
@@ -999,3 +999,12 @@ test_that("summarise() can handle scalars and literal values", {
     tibble(y = 2L)
   )
 })
+

Review Comment:
   Also add a test for a nested namespaced thing like `sum(base::log(x))` or `base::sum(x) + 1` (I'm sure they work but I seem to remember that the nesting added some complexity to the summarise evaluation).



##########
r/R/util.R:
##########
@@ -93,6 +93,13 @@ all_funs <- function(expr) {
     expr <- quo_get_expr(expr)
   }
   names <- all.names(expr)
+  # if we have namespace-qualified functions rebuild the function name with the
+  # pkg:: prefix
+  for (i in seq_along(names)) {
+    if (names[i] == "::") {
+      names[i] <- paste0(names[i+1], names[i], names[i+2])
+    }
+  }

Review Comment:
   This is very cool and simpler than what I had been thinking!
   
   Could you add tests (maybe in test-util.R) to make sure this works? (It does! But the tests would have helped me make sure of that without having to pull the PR and check). These are the pieces I checked but maybe a few variants without namespaced functions would be good too.
   
   ``` r
   arrow:::all_funs(rlang::quo(pkg::fun()))
   #> [1] "pkg::fun"
   arrow:::all_funs(rlang::quo(pkg::fun(other_pkg::obj)))
   #> [1] "pkg::fun"
   arrow:::all_funs(rlang::quo(other_fun(pkg::fun())))
   #> [1] "other_fun" "pkg::fun"
   arrow:::all_funs(rlang::quo(other_pkg::other_fun(pkg::fun())))
   #> [1] "other_pkg::other_fun" "pkg::fun"
   ```
   
   <sup>Created on 2022-07-08 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921529807


##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -201,6 +212,14 @@ test_that("strftime", {
     times
   )
 
+  # with namespacing

Review Comment:
   I think that's what I was doing initially. I could shift back to that approach. I would definitely reduce the size of some of the test files, some of which are starting to get a bit large.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921922502


##########
r/R/dplyr-summarize.R:
##########
@@ -161,6 +161,21 @@ register_bindings_aggregate <- function() {
   })
 }
 
+# we register 2 version of the "::" binding - one for use with nse_funcs
+# and another one for use with agg_funcs (below)

Review Comment:
   I think what I had in mind was the location of the registration and not that of the use. I was trying to be helpful and answer the question: "Why are there 2 different bindings for `"::"`? I expanded a bit on those 2 comments, let me know if you think it could be improved



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922173987


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,27 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  unqualified_name <- gsub("^.*?::", "", qualified_name)

Review Comment:
   it looks like `+` needs to be surrounded by curly brackets 🙃 
   ``` r
   sub("^.*?:{+}", "", "somePkg::some_fun")
   #> [1] "some_fun"
   sub("^.*?:{+}", "", "somePkg:::some_fun")
   #> [1] "some_fun"
   ```
   
   <sup>Created on 2022-07-15 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921536397


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  if (qualified_name == "::") {
+    unqualified_name <- "::"
+  } else {
+    unqualified_name <- gsub("^.*?::", "", qualified_name)
+  }
 
-  previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL
+  previous_fun <- if (unqualified_name %in% names(registry)) registry[[unqualified_name]] else NULL
+
+  # if th unqualified name exists in the register, warn
+  if (!is.null(fun) && !is.null(previous_fun)) {
+    warn(
+      paste0(
+        "A \"",
+        unqualified_name,
+        "\" binding already exists in the register and will be overwritten.")
+    )
+  }
 
+  # if fun is NULL remove entries from the function registry

Review Comment:
   Not 💯  convinced it is a real use case. We test it [here](https://github.com/dragosmg/arrow/blob/eb5d35a735bb97e706e6c4a0f0e7f6f06bc3cb7f/r/tests/testthat/test-dplyr-funcs.R#L27-L28).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921537478


##########
r/R/dplyr-summarize.R:
##########
@@ -348,7 +362,7 @@ summarize_eval <- function(name, quosure, ctx, hash) {
   # the list output from the Arrow hash_tdigest kernel to flatten it into a
   # column of type float64. We do that by modifying the unevaluated expression
   # to replace quantile(...) with arrow_list_element(quantile(...), 0L)
-  if (hash && "quantile" %in% funs_in_expr) {
+  if (hash && ("quantile" %in% funs_in_expr || "stats::quantile" %in% funs_in_expr)) {

Review Comment:
   Done. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922134654


##########
r/tests/testthat/test-dplyr-funcs-string.R:
##########
@@ -23,6 +23,14 @@ library(lubridate)
 library(stringr)
 library(stringi)
 
+tbl <- example_data
+# Add some better string data
+tbl$verses <- verses[[1]]
+# c(" a ", "  b  ", "   c   ", ...) increasing padding
+# nchar =   3  5  7  9 11 13 15 17 19 21
+tbl$padded_strings <- stringr::str_pad(letters[1:10], width = 2 * (1:10) + 1, side = "both")
+tbl$some_grouping <- rep(c(1, 2), 5)
+

Review Comment:
   True. Removed. I was using it interactively, but you are right. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873083746


##########
r/R/dplyr-mutate.R:
##########
@@ -57,7 +64,7 @@ mutate.arrow_dplyr_query <- function(.data,
     if (inherits(results[[new_var]], "try-error")) {
       msg <- handle_arrow_not_supported(
         results[[new_var]],
-        format_expr(exprs[[i]])
+        format_expr(exprs2[[i]])

Review Comment:
   In case or error, we return the original expression, e.g. `"lubridate::as_datetime()"` instead of `"lubridate_as_datime()"` to avoid confusion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1126824457

   :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916839256


##########
r/R/util.R:
##########
@@ -93,6 +93,13 @@ all_funs <- function(expr) {
     expr <- quo_get_expr(expr)
   }
   names <- all.names(expr)
+  # if we have namespace-qualified functions rebuild the function name with the
+  # pkg:: prefix
+  for (i in seq_along(names)) {
+    if (names[i] == "::") {
+      names[i] <- paste0(names[i+1], names[i], names[i+2])
+    }
+  }

Review Comment:
   Yep, adding tests is on the to-do list. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r914866094


##########
r/tests/testthat/test-dataset-dplyr.R:
##########
@@ -73,7 +73,7 @@ test_that("filter() on timestamp columns", {
   ds <- open_dataset(dataset_dir, partitioning = schema(part = uint8()))
   expect_equal(
     ds %>%
-      filter(ts >= lubridate::ymd_hms("2015-05-04 03:12:39")) %>%
+      filter(ts >= lubridate::ymd_hms("2015-05-04 03:12:39", tz = "UTC")) %>%

Review Comment:
   Nope. Removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921545783


##########
r/R/dplyr-funcs.R:
##########
@@ -116,3 +136,18 @@ create_binding_cache <- function() {
 nse_funcs <- new.env(parent = emptyenv())
 agg_funcs <- new.env(parent = emptyenv())
 .cache <- new.env(parent = emptyenv())
+
+# we register 2 version of the "::" binding - one for use with nse_funcs (below)
+# and another one for use with agg_funcs (in dplyr-summarize.R)
+register_bindings_utils <- function() {
+  register_binding("::", function(lhs, rhs) {
+    lhs_name <- as.character(substitute(lhs))
+    rhs_name <- as.character(substitute(rhs))
+
+    fun_name <- paste0(lhs_name, "::", rhs_name)
+
+    # if we do not have a binding for pkg::fun, then fall back on to the
+    # regular pkg::fun function
+    nse_funcs[[fun_name]] %||% asNamespace(lhs_name)[[rhs_name]]
+  })

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921554773


##########
r/tests/testthat/test-dplyr-filter.R:
##########
@@ -239,6 +239,14 @@ test_that("filter() with between()", {
       filter(between(chr, 1, 2)) %>%
       collect()
   )
+
+  # with namespacing
+  expect_error(

Review Comment:
   There are a bunch of tests in that chunk that do not assert the error message since it's being surfaced from C++. I've changed the test to a non-error one, since that might be more relevant for our use case. Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922175265


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,27 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  unqualified_name <- gsub("^.*?::", "", qualified_name)

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1185773078

   @nealrichardson I hope I addressed all the feedback. I'm hoping I can maybe run the CI over the weekend if the r-hub image gets updated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921922502


##########
r/R/dplyr-summarize.R:
##########
@@ -161,6 +161,21 @@ register_bindings_aggregate <- function() {
   })
 }
 
+# we register 2 version of the "::" binding - one for use with nse_funcs
+# and another one for use with agg_funcs (below)

Review Comment:
   I think what I had in mind was the location of the registration and not that of the use. I was trying to be helpful and answer the question: "Why are there 2 different bindings for `"::"`? I expanded a bit on those 2 comments. I will mark it as resolved, but let me know if you think it could be improved.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922133524


##########
r/tests/testthat/test-dplyr-mutate.R:
##########
@@ -140,17 +140,6 @@ test_that("transmute() with unsupported arguments", {
   )
 })
 
-test_that("transmute() defuses dots arguments (ARROW-13262)", {

Review Comment:
   I don't think we need it. We solved the underlying issue (transmute not defusing the ellipsis arg). If we want the test in, we need a function for which the fallback to R works, so it can't be a made up one. We do not have a corresponding test for `mutate()`, for example. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] nealrichardson merged pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
nealrichardson merged PR #13160:
URL: https://github.com/apache/arrow/pull/13160


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921545398


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  if (qualified_name == "::") {
+    unqualified_name <- "::"
+  } else {
+    unqualified_name <- gsub("^.*?::", "", qualified_name)
+  }
 
-  previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL
+  previous_fun <- if (unqualified_name %in% names(registry)) registry[[unqualified_name]] else NULL

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1185798273

   Revision: ac316ee0d748a73114ff9166edd3c045d1c210da
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-6a3852ff25](https://github.com/ursacomputing/crossbow/branches/all?query=actions-6a3852ff25)
   
   |Task|Status|
   |----|------|
   |test-r-minimal-build|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-6a3852ff25-azure-test-r-minimal-build)](https://dev.azure.com/ursacomputing/crossbow/_build/latest?definitionId=1&branchName=actions-6a3852ff25-azure-test-r-minimal-build)|
   |test-r-versions|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-6a3852ff25-github-test-r-versions)](https://github.com/ursacomputing/crossbow/actions?query=branch:actions-6a3852ff25-github-test-r-versions)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1184490876

   @nealrichardson would you mind taking a look? 🙏🏻 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921529807


##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -201,6 +212,14 @@ test_that("strftime", {
     times
   )
 
+  # with namespacing

Review Comment:
   I think that's what I was doing initially. I could shift back to that approach. It would definitely reduce the size of the test files, some of which are starting to get a bit large.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921938202


##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -201,6 +212,14 @@ test_that("strftime", {
     times
   )
 
+  # with namespacing

Review Comment:
   I have now reduced the size of the unit test files (but keeping the same aim of testing at least once each namespaces version) and will mark this as resolved. Feel free to unresolve it if necessary.  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r917087926


##########
r/R/arrow-datum.R:
##########
@@ -103,10 +103,10 @@ Ops.ArrowDatum <- function(e1, e2) {
 #' @export
 Math.ArrowDatum <- function(x, ..., base = exp(1), digits = 0) {
   switch(.Generic,
-    abs = ,
-    sign = ,
-    floor = ,
-    ceiling = ,
+    abs = eval_array_expression("abs_checked", x),
+    sign = eval_array_expression("sign", x),
+    floor = eval_array_expression("floor", x),
+    ceiling = eval_array_expression("ceil", x),

Review Comment:
   Done.



##########
r/R/arrow-datum.R:
##########
@@ -103,10 +103,10 @@ Ops.ArrowDatum <- function(e1, e2) {
 #' @export
 Math.ArrowDatum <- function(x, ..., base = exp(1), digits = 0) {
   switch(.Generic,
-    abs = ,
-    sign = ,
-    floor = ,
-    ceiling = ,
+    abs = eval_array_expression("abs_checked", x),
+    sign = eval_array_expression("sign", x),
+    floor = eval_array_expression("floor", x),
+    ceiling = eval_array_expression("ceil", x),

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916208642


##########
r/tests/testthat/test-dplyr-funcs-conditional.R:
##########
@@ -29,7 +29,7 @@ test_that("if_else and ifelse", {
   compare_dplyr_binding(
     .input %>%
       mutate(
-        y = if_else(int > 5, 1, 0)
+        y = dplyr::if_else(int > 5, 1, 0)

Review Comment:
   Sure. I was aiming to have at least one test per binding in which we use the namespace-qualified version.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873083672


##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -303,7 +303,7 @@ register_bindings_datetime_conversion <- function() {
     )
   })
 
-  register_binding("as_datetime", function(x,
+  register_binding("lubridate_as_datetime", function(x,

Review Comment:
   I only did this as a test for 2 bindings: `as_datetime` and `make_difftime`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873669800


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,16 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)

Review Comment:
   FWIW I think a binding name `"lubridate_date"` is better than `"date"` as it makes it clear which function we're aiming to intercept / replace in a mutate call.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873690761


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,16 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)

Review Comment:
   I went for what I see as a simpler approach as it's easier to think about and execute an expression when it doesn't contain the `::` operator. Do you see any shortcomings to my proposed approach? (other than the need for "namespacing" internal / arrow functions?) 



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,16 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)

Review Comment:
   I went for what I see as a simpler approach as it's easier to think about and execute an expression when it doesn't contain the `::` operator. Do you see any shortcomings to my proposed approach? (other than the need for "namespacing" internal / arrow functions) 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1133093901

   outstanding issues:
   - [ ] namespacing does work properly for `lubridate::date` (`base::date()` is being picked up when not prefixed) - see CI failure https://github.com/apache/arrow/runs/6526882948?check_suite_focus=true
   - [ ] prefixing doesn't work for summarising / aggregating functions - I haven't finished investigating yet.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921545526


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  if (qualified_name == "::") {
+    unqualified_name <- "::"
+  } else {
+    unqualified_name <- gsub("^.*?::", "", qualified_name)
+  }
 
-  previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL
+  previous_fun <- if (unqualified_name %in% names(registry)) registry[[unqualified_name]] else NULL
+
+  # if th unqualified name exists in the register, warn

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922154083


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,27 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name

Review Comment:
   dropped `qualified_name`, so will keep `fun_name` and `unqualified_name` going forward as they are not confusing. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1181680197

   Checklist from Dewey's [comment](https://github.com/apache/arrow/pull/13160#pullrequestreview-1032002888): 
   - [ ] Remove all the `pkg::` prefixes for tests that aren't testing namepacing. There are still a few dozen of these that I see in the diff.
   - [ ] Update the names in `register_bindings()` to make it easier for the code reader to know which one is the qualified name and which one is not
   - [ ] Fix the Math group generic thing (I left a minimal suggestion that should do it)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921922502


##########
r/R/dplyr-summarize.R:
##########
@@ -161,6 +161,21 @@ register_bindings_aggregate <- function() {
   })
 }
 
+# we register 2 version of the "::" binding - one for use with nse_funcs
+# and another one for use with agg_funcs (below)

Review Comment:
   I think what I had in mind was the location of the registration and not that of the use. I was trying to be helpful and answer the question: "Why are there 2 different bindings for `"::"`? I expanded a bit on those 2 comments, let me know if you think it could be improve.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921947023


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  if (qualified_name == "::") {
+    unqualified_name <- "::"
+  } else {
+    unqualified_name <- gsub("^.*?::", "", qualified_name)
+  }
 
-  previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL
+  previous_fun <- if (unqualified_name %in% names(registry)) registry[[unqualified_name]] else NULL
+
+  # if th unqualified name exists in the register, warn
+  if (!is.null(fun) && !is.null(previous_fun)) {
+    warn(
+      paste0(
+        "A \"",
+        unqualified_name,
+        "\" binding already exists in the register and will be overwritten.")
+    )
+  }
 
+  # if fun is NULL remove entries from the function registry

Review Comment:
   I plan on working on [ARROW-14071: [R] Try to arrow_eval user-defined functions](https://issues.apache.org/jira/browse/ARROW-14071) next, so the ability to remove a binding could be useful there, but it might not be. Since it doesn't play a role in this PR, I think we should remove it and add it back in if needed. I will remove it and mark this as resolved. Let me know if you think otherwise.  
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922123080


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,27 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  unqualified_name <- gsub("^.*?::", "", qualified_name)
 
-  previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL
+  previous_fun <- registry[[unqualified_name]]
 
-  if (is.null(fun) && !is.null(previous_fun)) {
-    rm(list = name, envir = registry, inherits = FALSE)
+  # if the unqualified name exists in the registry, warn
+  if (!is.null(fun) && !is.null(previous_fun)) {
+    warn(
+      paste0(
+        "A \"",
+        unqualified_name,
+        "\" binding already exists in the registry and will be overwritten.")
+    )
+  }
+
+  # register both as `pkg::fun` and as `fun` if `qualified_name` is prefixed
+  if (grepl("::", qualified_name) && qualified_name != "::") {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922244242


##########
r/tests/testthat/test-dplyr-funcs-string.R:
##########
@@ -23,6 +23,14 @@ library(lubridate)
 library(stringr)
 library(stringi)
 
+tbl <- example_data
+# Add some better string data
+tbl$verses <- verses[[1]]
+# c(" a ", "  b  ", "   c   ", ...) increasing padding
+# nchar =   3  5  7  9 11 13 15 17 19 21
+tbl$padded_strings <- stringr::str_pad(letters[1:10], width = 2 * (1:10) + 1, side = "both")
+tbl$some_grouping <- rep(c(1, 2), 5)
+

Review Comment:
   It turns out it wasn't a copy of helper-data (I knew I had it there for a reason) 😬 . 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873083697


##########
r/R/dplyr-funcs.R:
##########
@@ -58,7 +58,8 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  # test name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)
 
   previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL

Review Comment:
   I need to add some logic to clean-up / un-register both names.



##########
r/R/dplyr-funcs.R:
##########
@@ -58,7 +58,8 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  # test name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)
 
   previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1128969830

   > Given that currently we use a "single-bucket" approach, I'd keep it that way in the short term, unless there's a specific reason to change it? We can update it later if necessary.
   
   Yaya, that's what I'm working on. I went for the mixed bucket approach yesterday evening and earlier today and got really messy really quickly.  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1128973087

   I quite liked your suggestion of the single bucket but registering twice (i.e., do `nse_funcs[["fun_name"]] <- fun` but also `nse_funcs[["pkgname::fun_name"]] <- fun` if `pkgname` exists. There's less bookkeeping that way that could possibly get mucked up!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] thisisnic commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
thisisnic commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1128966806

   Given that currently we use a "single-bucket" approach, I'd keep it that way in the short term, unless there's a specific reason to change it?  We can update it later if necessary.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r875236432


##########
r/R/dplyr-funcs-utils.R:
##########
@@ -0,0 +1,24 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+register_bindings_utils <- function() {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r914891107


##########
r/tests/testthat/test-dplyr-filter.R:
##########
@@ -198,7 +198,7 @@ test_that("Negative scalar values", {
 test_that("filter() with between()", {
   compare_dplyr_binding(
     .input %>%
-      filter(between(dbl, 1, 2)) %>%
+      filter(dplyr::between(dbl, 1, 2)) %>%

Review Comment:
   Sure. I added some tests for the namespace qualified version.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r914882478


##########
r/tests/testthat/test-dplyr-funcs.R:
##########
@@ -19,21 +19,24 @@ test_that("register_binding() works", {
   fake_registry <- new.env(parent = emptyenv())
   fun1 <- function() NULL
 
-  expect_null(register_binding("some_fun", fun1, fake_registry))
+  expect_null(register_binding("some.pkg::some_fun", fun1, fake_registry))
   expect_identical(fake_registry$some_fun, fun1)
+  expect_identical(fake_registry$`some.pkg::some_fun`, fun1)
 
-  expect_identical(register_binding("some_fun", NULL, fake_registry), fun1)
+  expect_identical(register_binding("some.pkg::some_fun", NULL, fake_registry), fun1)
   expect_false("some_fun" %in% names(fake_registry))
-  expect_silent(expect_null(register_binding("some_fun", NULL, fake_registry)))
+  expect_silent(expect_null(register_binding("some.pkg::some_fun", NULL, fake_registry)))
 
-  expect_null(register_binding("some_pkg::some_fun", fun1, fake_registry))
+  expect_null(register_binding("somePkg::some_fun", fun1, fake_registry))
   expect_identical(fake_registry$some_fun, fun1)
 })
 
 test_that("register_binding_agg() works", {
   fake_registry <- new.env(parent = emptyenv())
   fun1 <- function() NULL
 
-  expect_null(register_binding_agg("some_fun", fun1, fake_registry))
+  expect_null(register_binding_agg("somePkg::some_fun", fun1, fake_registry))
+  names(fake_registry)

Review Comment:
   Removed. Thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] thisisnic commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
thisisnic commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873660800


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,16 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)

Review Comment:
   I haven't thought about this in great detail, but I'd be expecting an implementation of this to use the `namespace` variable here, or remove it.
   
   I'm just wondering if there might be some advantages to recording the function and namespace separately instead of appending them?  The former feels "cleaner" in my head, though that's not sufficient reason to suggest changing it.
   
   A thought; how do we distinguish between different kinds of functions?  i.e. we have:
   
   * bindings for functions from packages, e.g. `lubridate::as_datetime()`
   * bindings directly to arrow compute functions, e.g. `arrow_utf8_split_whitespace()`
   * other bindings list the one for `cast()`
   
   In the last case here, if we have other functions which have underscores but no namespace, I think the regex would no longer be extracting the right name.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] thisisnic commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
thisisnic commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873711442


##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,16 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
+  name <- gsub("^.*?_", "", fun_name)
   namespace <- gsub("::.*$", "", fun_name)

Review Comment:
   Nothing that jumps out at me right now, looks good!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1129018942

   @paleolimbot @thisisnic @jonkeane Would you mind having another look? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1128610905

   It sounds like having a defused version of the `::` operator is the way to go (instead of syntax translation). I have a proposal for a `::` binding.
   Another design choice would be the location for the bindings. Do we want?
   * a single bucket (i.e. `nse_funcs`) to hold all of them?
   * individual bucket per namespace: `use_funcs[[package]]`
   * a mixed approach:
       * a general bucket: `use_funcs[[all_function]]` where all bindings live (in their namespace-unqualified form - i.e. `as_datetime()`
       * package specific buckets: `use_funcs[[package]]`
   
   I am in favour of the _mixed_ approach, which would make the `::` binding something like.
   ```r
   arrow:::register_binding("::", function(lhs, rhs) {
     lhs_name <- as.character(substitute(lhs))
     rhs_name <- as.character(substitute(rhs))
     arrow:::nse_funcs[[lhs_name]][[rhs_name]]
   })
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with {{pkg::}} prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r873083672


##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -303,7 +303,7 @@ register_bindings_datetime_conversion <- function() {
     )
   })
 
-  register_binding("as_datetime", function(x,
+  register_binding("lubridate_as_datetime", function(x,

Review Comment:
   I only did this as a test for 2 bindings: `as_datetime` and `make_difftime`. I expect some tests to fail.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1133110566

   I have no issue with either `lubridate::date` or `date` calls in interactive sessions. I think failures for the un-prefixed form are due to the way testthat works, so we shouldn't be concerned with those. Am I wrong?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] nealrichardson commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922422925


##########
r/tests/testthat/test-dplyr-summarize.R:
##########
@@ -459,6 +482,35 @@ test_that("quantile()", {
   )
 })
 
+test_that("quantile() with namespacing", {
+  suppressWarnings(
+    expect_warning(
+      expect_equal(
+        tbl %>%
+          group_by(some_grouping) %>%
+          summarize(
+            q_dbl = quantile(dbl, probs = 0.5, na.rm = TRUE, names = FALSE),
+            q_int = as.double(
+              quantile(int, probs = 0.5, na.rm = TRUE, names = FALSE)
+            )
+          ) %>%
+          arrange(some_grouping),
+        Table$create(tbl) %>%
+          group_by(some_grouping) %>%
+          summarize(
+            q_dbl = stats::quantile(dbl, probs = 0.5, na.rm = TRUE),

Review Comment:
   My point was that because you're calling both namespaced and non-namespaced versions in the same call, you're not testing that both versions raise the warning: you expect one warning, which could come from any of them, and suppress the rest of the warnings. But I looked again and the warning is coming from the `quantile` registered function, so as long as you're getting the same numbers either way you call it, you're passing through the same code that triggers the warnings. So it's ok.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916229652


##########
r/tests/testthat/test-dplyr-funcs-conditional.R:
##########
@@ -29,7 +29,7 @@ test_that("if_else and ifelse", {
   compare_dplyr_binding(
     .input %>%
       mutate(
-        y = if_else(int > 5, 1, 0)
+        y = dplyr::if_else(int > 5, 1, 0)

Review Comment:
   Removed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1185417607

   The CI failures do not seem to be related to this PR:
   * the lint involves a file that I do not have in my branch, which leads me to believe the lintr workflow is misconfigured (we might not be installing the package before linting it - see https://github.com/r-lib/actions/pull/558 and https://github.com/r-lib/actions/pull/216)
   *  the other failures seem to involve R-devel and, as far as I can tell, are not triggered by changes in this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] nealrichardson commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1185484105

   Yeah the R-devel issue can be ignored.
   
   Re: the linting, can you rebase to pull those changes in, then fix them? Odd that they didn't show up on my PR that touched those files, but maybe that's related to the underlying issue of why lintr seems confused.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921531545


##########
r/tests/testthat/test-util.R:
##########
@@ -39,3 +39,34 @@ test_that("as_writable_table() errors for invalid input", {
   # make sure other errors make it through
   expect_snapshot_error(wrapper_fun(data.frame(x = I(list(1, "a")))))
 })
+
+test_that("all_funs() identifies namespace-qualified and unqualified functions", {
+  expect_equal(
+    arrow:::all_funs(rlang::quo(pkg::fun())),

Review Comment:
   Done. Removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r921735078


##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -201,6 +212,14 @@ test_that("strftime", {
     times
   )
 
+  # with namespacing

Review Comment:
   That pattern is probably a result of an earlier review from me - I think it would be sufficient to test the mechanism for `::` in each verb without checking that the namespaced version of every function works. I wouldn't have expected the duplicated tests unless there were functions that were special-cased (when reading the diff it made me wonder if some functions *were* special-cased).



##########
r/R/dplyr-summarize.R:
##########
@@ -161,6 +161,21 @@ register_bindings_aggregate <- function() {
   })
 }
 
+# we register 2 version of the "::" binding - one for use with nse_funcs
+# and another one for use with agg_funcs (below)

Review Comment:
   Maybe?
   
   ```suggestion
   # we register 2 version of the "::" binding - one for use with agg_funcs
   # and another one for use with nse_funcs (above)
   ```



##########
r/R/dplyr-funcs.R:
##########
@@ -58,15 +58,34 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  qualified_name <- fun_name
+  if (qualified_name == "::") {
+    unqualified_name <- "::"
+  } else {
+    unqualified_name <- gsub("^.*?::", "", qualified_name)
+  }
 
-  previous_fun <- if (name %in% names(registry)) registry[[name]] else NULL
+  previous_fun <- if (unqualified_name %in% names(registry)) registry[[unqualified_name]] else NULL
+
+  # if th unqualified name exists in the register, warn
+  if (!is.null(fun) && !is.null(previous_fun)) {
+    warn(
+      paste0(
+        "A \"",
+        unqualified_name,
+        "\" binding already exists in the register and will be overwritten.")
+    )
+  }
 
+  # if fun is NULL remove entries from the function registry

Review Comment:
   I am almost certainly who added the NULL-means-remove thing...the ability to remove a binding is only relevant if or when we export this function (so if you haven't already and/or it makes something in this PR difficult, feel free to remove it!)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r922125642


##########
r/tests/testthat/test-dplyr-summarize.R:
##########
@@ -459,6 +482,35 @@ test_that("quantile()", {
   )
 })
 
+test_that("quantile() with namespacing", {
+  suppressWarnings(
+    expect_warning(
+      expect_equal(
+        tbl %>%
+          group_by(some_grouping) %>%
+          summarize(
+            q_dbl = quantile(dbl, probs = 0.5, na.rm = TRUE, names = FALSE),
+            q_int = as.double(
+              quantile(int, probs = 0.5, na.rm = TRUE, names = FALSE)
+            )
+          ) %>%
+          arrange(some_grouping),
+        Table$create(tbl) %>%
+          group_by(some_grouping) %>%
+          summarize(
+            q_dbl = stats::quantile(dbl, probs = 0.5, na.rm = TRUE),

Review Comment:
   Yep, that's how I did it: [namespaced version](https://github.com/dragosmg/arrow/blob/642c23ee865140bdbdee7e416558aecc580f8500/r/tests/testthat/test-dplyr-summarize.R#L485-L512) and [non-namespaced version](https://github.com/dragosmg/arrow/blob/642c23ee865140bdbdee7e416558aecc580f8500/r/tests/testthat/test-dplyr-summarize.R#L404-L429). 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r914895465


##########
r/R/arrow-datum.R:
##########
@@ -103,10 +103,10 @@ Ops.ArrowDatum <- function(e1, e2) {
 #' @export
 Math.ArrowDatum <- function(x, ..., base = exp(1), digits = 0) {
   switch(.Generic,
-    abs = ,
-    sign = ,
-    floor = ,
-    ceiling = ,
+    abs = eval_array_expression("abs_checked", x),
+    sign = eval_array_expression("sign", x),
+    floor = eval_array_expression("floor", x),
+    ceiling = eval_array_expression("ceil", x),

Review Comment:
   Is it necessary to update `.unary_function_map`? If not, I think it may be better to fix this in `eval_array_expression()` since it's not clear to me if the functions below will or will not work now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r914803637


##########
r/tests/testthat/test-dataset-dplyr.R:
##########
@@ -73,7 +73,7 @@ test_that("filter() on timestamp columns", {
   ds <- open_dataset(dataset_dir, partitioning = schema(part = uint8()))
   expect_equal(
     ds %>%
-      filter(ts >= lubridate::ymd_hms("2015-05-04 03:12:39")) %>%
+      filter(ts >= lubridate::ymd_hms("2015-05-04 03:12:39", tz = "UTC")) %>%

Review Comment:
   Is this change related to this PR?



##########
r/R/arrow-datum.R:
##########
@@ -103,10 +103,10 @@ Ops.ArrowDatum <- function(e1, e2) {
 #' @export
 Math.ArrowDatum <- function(x, ..., base = exp(1), digits = 0) {
   switch(.Generic,
-    abs = ,
-    sign = ,
-    floor = ,
-    ceiling = ,
+    abs = eval_array_expression("abs_checked", x),
+    sign = eval_array_expression("sign", x),
+    floor = eval_array_expression("floor", x),
+    ceiling = eval_array_expression("ceil", x),

Review Comment:
   This change seems unrelated to this PR. If it's not directly related to the `pkg::` thing, you should remove it.



##########
r/R/dplyr-funcs.R:
##########
@@ -116,3 +124,11 @@ create_binding_cache <- function() {
 nse_funcs <- new.env(parent = emptyenv())
 agg_funcs <- new.env(parent = emptyenv())
 .cache <- new.env(parent = emptyenv())
+
+register_bindings_utils <- function() {
+  register_binding("::", function(lhs, rhs) {
+    lhs_name <- as.character(substitute(lhs))
+    rhs_name <- as.character(substitute(rhs))
+    nse_funcs[[paste0(lhs_name, "::", rhs_name)]]

Review Comment:
   Here you should check if the function is actually in `nse_funcs`...if it isn't, you should return what base R would return which I think is `asNamespace(lhs_name)[[rhs_name]]`.
   
   (this is what's causing what you thought was an error in `filter()` but I think also affected `mutate()`)



##########
r/tests/testthat/test-dplyr-filter.R:
##########
@@ -198,7 +198,7 @@ test_that("Negative scalar values", {
 test_that("filter() with between()", {
   compare_dplyr_binding(
     .input %>%
-      filter(between(dbl, 1, 2)) %>%
+      filter(dplyr::between(dbl, 1, 2)) %>%

Review Comment:
   I wouldn't change this..."normal" usage is the test before the change...if you want to test namespacing you should do it in separate tests. Here and below you are testing two things at once: the behaviour of between and the behaviour of the namespacing. The namespacing is a separate concern and should be tested separately.



##########
r/tests/testthat/test-dplyr-funcs.R:
##########
@@ -19,21 +19,24 @@ test_that("register_binding() works", {
   fake_registry <- new.env(parent = emptyenv())
   fun1 <- function() NULL
 
-  expect_null(register_binding("some_fun", fun1, fake_registry))
+  expect_null(register_binding("some.pkg::some_fun", fun1, fake_registry))
   expect_identical(fake_registry$some_fun, fun1)
+  expect_identical(fake_registry$`some.pkg::some_fun`, fun1)
 
-  expect_identical(register_binding("some_fun", NULL, fake_registry), fun1)
+  expect_identical(register_binding("some.pkg::some_fun", NULL, fake_registry), fun1)
   expect_false("some_fun" %in% names(fake_registry))
-  expect_silent(expect_null(register_binding("some_fun", NULL, fake_registry)))
+  expect_silent(expect_null(register_binding("some.pkg::some_fun", NULL, fake_registry)))
 
-  expect_null(register_binding("some_pkg::some_fun", fun1, fake_registry))
+  expect_null(register_binding("somePkg::some_fun", fun1, fake_registry))
   expect_identical(fake_registry$some_fun, fun1)
 })
 
 test_that("register_binding_agg() works", {
   fake_registry <- new.env(parent = emptyenv())
   fun1 <- function() NULL
 
-  expect_null(register_binding_agg("some_fun", fun1, fake_registry))
+  expect_null(register_binding_agg("somePkg::some_fun", fun1, fake_registry))
+  names(fake_registry)

Review Comment:
   ```suggestion
     expect_null(register_binding_agg("somePkg::some_fun", fun1, fake_registry))
   ```



##########
r/R/dplyr-funcs.R:
##########
@@ -58,13 +58,20 @@ NULL
 #' @keywords internal
 #'
 register_binding <- function(fun_name, fun, registry = nse_funcs) {
-  name <- gsub("^.*?::", "", fun_name)
-  namespace <- gsub("::.*$", "", fun_name)
+  if (fun_name == "::") {

Review Comment:
   Maybe change `name` to `qualified_name` here to improve the readability?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on PR #13160:
URL: https://github.com/apache/arrow/pull/13160#issuecomment-1178068897

   One more thing (we can pair to see if we can fix this easily):
   
   ``` r
   library(arrow, warn.conflicts = FALSE)
   #> Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.
   
   # we need to improve the failure mode here or fix this
   record_batch(x = 1:10) |> 
     dplyr::summarise(base::sum(x))
   #> Warning: Error in summarize_eval(names(exprs)[i], exprs[[i]], ctx, length(.data$group_by_vars) >  : 
   #>   Expression base::sum(x) is not an aggregate expression or is not supported in Arrow; pulling data into R
   #> # A tibble: 1 × 1
   #>   `base::sum(x)`
   #>            <int>
   #> 1             55
   ```
   
   <sup>Created on 2022-07-07 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916190693


##########
r/tests/testthat/test-dataset-dplyr.R:
##########
@@ -119,7 +119,7 @@ test_that("filter() on date32 columns", {
   # Also with timestamp scalar
   expect_equal(
     open_dataset(tmp) %>%
-      filter(date > lubridate::ymd_hms("2020-02-02 00:00:00")) %>%
+      filter(date > lubridate::ymd_hms("2020-02-02 00:00:00", tz = "UTC")) %>%

Review Comment:
   No. Removed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r914903365


##########
r/R/dplyr-funcs.R:
##########
@@ -116,3 +124,11 @@ create_binding_cache <- function() {
 nse_funcs <- new.env(parent = emptyenv())
 agg_funcs <- new.env(parent = emptyenv())
 .cache <- new.env(parent = emptyenv())
+
+register_bindings_utils <- function() {
+  register_binding("::", function(lhs, rhs) {
+    lhs_name <- as.character(substitute(lhs))
+    rhs_name <- as.character(substitute(rhs))
+    nse_funcs[[paste0(lhs_name, "::", rhs_name)]]

Review Comment:
   Makes sense. TIL `asNamespace()` exists.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13160:
URL: https://github.com/apache/arrow/pull/13160#discussion_r916895736


##########
r/tests/testthat/test-dplyr-summarize.R:
##########
@@ -999,3 +999,12 @@ test_that("summarise() can handle scalars and literal values", {
     tibble(y = 2L)
   )
 })
+

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org