You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/17 14:26:20 UTC

[GitHub] [arrow] paleolimbot opened a new pull request, #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

paleolimbot opened a new pull request, #14436:
URL: https://github.com/apache/arrow/pull/14436

   This PR makes it so that you can do the following without a warning:
   
   ``` r
   library(arrow, warn.conflicts = FALSE)
   
   register_scalar_function(
     "times_32",
     function(context, x) x * 32L,
     in_type = list(int32(), int64(), float64()),
     out_type = function(in_types) in_types[[1]],
     auto_convert = TRUE
   )
   
   register_scalar_function(
     "times_32",
     function(context, x) x * 32L,
     in_type = list(int32(), int64(), float64()),
     out_type = function(in_types) in_types[[1]],
     auto_convert = TRUE
   )
   ```
   
   In fixing that, I also ran across an important discovery, which is that `cpp11::function` does *not* protect the underlying `SEXP` from garbage collection!!!! It the two functions we used this for were being protected by proxy because the execution environment of `register_scalar_function()` was being saved when the binding was registered.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #14436:
URL: https://github.com/apache/arrow/pull/14436#discussion_r998311684


##########
r/src/recordbatchreader.cpp:
##########
@@ -70,7 +70,7 @@ class RFunctionRecordBatchReader : public arrow::RecordBatchReader {
 
   arrow::Status ReadNext(std::shared_ptr<arrow::RecordBatch>* batch_out) {
     auto batch = SafeCallIntoR<std::shared_ptr<arrow::RecordBatch>>([&]() {
-      cpp11::sexp result_sexp = fun_();
+      cpp11::sexp result_sexp = cpp11::function(fun_)();

Review Comment:
   Does it potentially defer any type check that would have occurred in the constructor before?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #14436:
URL: https://github.com/apache/arrow/pull/14436#issuecomment-1282539163

   @nealrichardson @thisisnic  Could you comment on the R parts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #14436:
URL: https://github.com/apache/arrow/pull/14436#issuecomment-1281440684

   Revision: 0facb9ba28af0d10039c168d077cf564c924aaa2
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-6b8a20d344](https://github.com/ursacomputing/crossbow/branches/all?query=actions-6b8a20d344)
   
   |Task|Status|
   |----|------|
   |homebrew-r-autobrew|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-6b8a20d344-github-homebrew-r-autobrew)](https://github.com/ursacomputing/crossbow/actions/runs/3268300786/jobs/5374546590)|


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on pull request #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on PR #14436:
URL: https://github.com/apache/arrow/pull/14436#issuecomment-1281013812

   > Do you think that should be reported as a bug?
   
   I opened an issue about it...it might have been intentional since protection isn't necessarily cheap compared to the function call itself.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #14436:
URL: https://github.com/apache/arrow/pull/14436#issuecomment-1280948862

   :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on a diff in pull request #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #14436:
URL: https://github.com/apache/arrow/pull/14436#discussion_r997174084


##########
r/R/compute.R:
##########
@@ -379,9 +379,15 @@ register_scalar_function <- function(name, fun, in_type, out_type,
   RegisterScalarUDF(name, scalar_function)
 
   # register with dplyr binding (enables its use in mutate(), filter(), etc.)
+  # extra step to avoid saving this execution environment in the binding,
+  # which eliminates a warning when the same binding is registered twice
+  binding_fun <- function(...) build_expr(name, ...)
+  body(binding_fun)[[2]] <- name

Review Comment:
   I pushed a less cryptic version of this...it's inlining the value of `name` into the function body:
   
   ``` r
   binding_fun <- function(...) build_expr(name, ...)
   str(as.list(body(binding_fun)))
   #> List of 3
   #>  $ : symbol build_expr
   #>  $ : symbol name
   #>  $ : symbol ...
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on a diff in pull request #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #14436:
URL: https://github.com/apache/arrow/pull/14436#discussion_r998337787


##########
r/src/recordbatchreader.cpp:
##########
@@ -70,7 +70,7 @@ class RFunctionRecordBatchReader : public arrow::RecordBatchReader {
 
   arrow::Status ReadNext(std::shared_ptr<arrow::RecordBatch>* batch_out) {
     auto batch = SafeCallIntoR<std::shared_ptr<arrow::RecordBatch>>([&]() {
-      cpp11::sexp result_sexp = fun_();
+      cpp11::sexp result_sexp = cpp11::function(fun_)();

Review Comment:
   That's a good point...I looked that up, and its current form, `cpp11::function` doesn't do any type checking ( https://github.com/r-lib/cpp11/blob/main/inst/include/cpp11/function.hpp#L17 )



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on a diff in pull request #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

Posted by GitBox <gi...@apache.org>.
pitrou commented on code in PR #14436:
URL: https://github.com/apache/arrow/pull/14436#discussion_r997145331


##########
r/R/compute.R:
##########
@@ -379,9 +379,15 @@ register_scalar_function <- function(name, fun, in_type, out_type,
   RegisterScalarUDF(name, scalar_function)
 
   # register with dplyr binding (enables its use in mutate(), filter(), etc.)
+  # extra step to avoid saving this execution environment in the binding,
+  # which eliminates a warning when the same binding is registered twice
+  binding_fun <- function(...) build_expr(name, ...)
+  body(binding_fun)[[2]] <- name

Review Comment:
   Hmm, what does this do? Why 2?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] pitrou commented on pull request #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

Posted by GitBox <gi...@apache.org>.
pitrou commented on PR #14436:
URL: https://github.com/apache/arrow/pull/14436#issuecomment-1280962692

   > In fixing that, I also ran across an important discovery, which is that `cpp11::function` does _not_ protect the underlying `SEXP` from garbage collection
   
   Do you think that should be reported as a bug?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] nealrichardson commented on pull request #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on PR #14436:
URL: https://github.com/apache/arrow/pull/14436#issuecomment-1281388393

   @github-actions crossbow submit homebrew-r-autobrew


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #14436:
URL: https://github.com/apache/arrow/pull/14436#issuecomment-1280948774

   https://issues.apache.org/jira/browse/ARROW-17460


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on pull request #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on PR #14436:
URL: https://github.com/apache/arrow/pull/14436#issuecomment-1282399777

   It looks like it doesn't fix that specific test, although I still think we should merge this sooner than later just in case it was contributing to other failures!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot merged pull request #14436: ARROW-17460: [R] Don't warn if the new UDF I'm registering is the same as the existing one

Posted by GitBox <gi...@apache.org>.
paleolimbot merged PR #14436:
URL: https://github.com/apache/arrow/pull/14436


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org