Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/06/29 01:04:28 UTC

[GitHub] [arrow] pachadotdev opened a new pull request #10615: Arrow12967v3

pachadotdev opened a new pull request #10615:
URL: https://github.com/apache/arrow/pull/10615


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#issuecomment-870149893


   https://issues.apache.org/jira/browse/ARROW-12967





[GitHub] [arrow] ianmcook commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613

Posted by GitBox <gi...@apache.org>.
ianmcook commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660887960



##########
File path: r/tests/testthat/test-dplyr-mutate.R
##########
@@ -418,3 +418,24 @@ test_that("mutate and write_dataset", {
       summarize(mean = mean(integer))
   )
 })
+
+test_that("mutate and pmin/pmax", {
+  df <- tibble(
+    city = c("Chillan", "Valdivia", "Osorno"),
+    val1 = c(200, 300, NA),
+    val2 = c(100, NA, NA),
+    val3 = c(0, NA, NA)
+  )
+
+  expect_dplyr_equal(
+    input %>%
+      mutate(
+        max_val_1 = pmax(val1, val2, val3),
+        max_val_2 = pmax(val1, val2, val3, na.rm = T),
+        min_val_1 = pmin(val1, val2, val3),
+        min_val_2 = pmin(val1, val2, val3, na.rm = T)
+      ) %>%
+      collect(),
+    df
+  )

Review comment:
       Could you add some additional code in this test that uses a mix of:
   - scalar literal values
   - column references
   - expressions
   
   in the `pmin()` and `pmax()` calls? For example:
   
   ```r
   pmax(val1, 250, val2 * 4, val3, 50)
   ```
   







[GitHub] [arrow] pitrou commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660740294



##########
File path: r/R/dplyr-functions.R
##########
@@ -398,6 +398,22 @@ nse_funcs$str_split <- function(string, pattern, n = Inf, simplify = FALSE) {
   )
 }
 
+nse_funcs$pmin <- function(..., na.rm = FALSE) {
+  Expression$create(
+    "element_wise_min",
+    ...,
+    options = list(skip_nulls = na.rm)
+  )
+}
+
+nse_funcs$pmax <- function(..., na.rm = FALSE) {
+  Expression$create(
+    "element_wise_max",

Review comment:
       They're called `min_element_wise` and `max_element_wise` actually ;-)







[GitHub] [arrow] pachadotdev commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613

Posted by GitBox <gi...@apache.org>.
pachadotdev commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660889102



##########
File path: r/tests/testthat/test-dplyr-mutate.R
##########
@@ -418,3 +418,24 @@ test_that("mutate and write_dataset", {
       summarize(mean = mean(integer))
   )
 })
+
+test_that("mutate and pmin/pmax", {
+  df <- tibble(
+    city = c("Chillan", "Valdivia", "Osorno"),
+    val1 = c(200, 300, NA),
+    val2 = c(100, NA, NA),
+    val3 = c(0, NA, NA)
+  )
+
+  expect_dplyr_equal(
+    input %>%
+      mutate(
+        max_val_1 = pmax(val1, val2, val3),
+        max_val_2 = pmax(val1, val2, val3, na.rm = T),
+        min_val_1 = pmin(val1, val2, val3),
+        min_val_2 = pmin(val1, val2, val3, na.rm = T)
+      ) %>%
+      collect(),
+    df
+  )

Review comment:
       adding







[GitHub] [arrow] pachadotdev commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613

Posted by GitBox <gi...@apache.org>.
pachadotdev commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660884851



##########
File path: r/src/compute.cpp
##########
@@ -180,6 +180,15 @@ std::shared_ptr<arrow::compute::FunctionOptions> make_compute_options(
     return out;
   }
 
+  if (func_name == "element_wise_min" || func_name == "element_wise_max") {
+    using Options = arrow::compute::ElementWiseAggregateOptions;
+    bool skip_nulls = false;

Review comment:
       Thanks, yes, I went full R to the bottom for the same reason.







[GitHub] [arrow] ianmcook closed pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax()

Posted by GitBox <gi...@apache.org>.
ianmcook closed pull request #10615:
URL: https://github.com/apache/arrow/pull/10615


   





[GitHub] [arrow] ianmcook commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613

Posted by GitBox <gi...@apache.org>.
ianmcook commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660882828



##########
File path: r/R/dplyr-functions.R
##########
@@ -398,6 +398,22 @@ nse_funcs$str_split <- function(string, pattern, n = Inf, simplify = FALSE) {
   )
 }
 
+nse_funcs$pmin <- function(..., na.rm = FALSE) {
+  Expression$create(

Review comment:
       This will allow users to pass literal scalar arguments to `pmin()` in dplyr:
   ```suggestion
     build_expr(
   ```

##########
File path: r/R/dplyr-functions.R
##########
@@ -398,6 +398,22 @@ nse_funcs$str_split <- function(string, pattern, n = Inf, simplify = FALSE) {
   )
 }
 
+nse_funcs$pmin <- function(..., na.rm = FALSE) {
+  Expression$create(
+    "element_wise_min",
+    ...,
+    options = list(skip_nulls = na.rm)
+  )
+}
+
+nse_funcs$pmax <- function(..., na.rm = FALSE) {
+  Expression$create(

Review comment:
       ```suggestion
     build_expr(
   ```
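
The idea behind these two suggestions can be sketched in plain R. This is only a toy model, assuming the helper accepts literals by wrapping any argument that is not already an expression as a scalar expression; `toy_field()`, `toy_scalar()`, `toy_wrap_args()`, and `toy_build_call()` are hypothetical names invented here and are not the arrow package's API.

```r
# Toy stand-ins for expression objects. These are NOT arrow classes.
toy_field <- function(name) structure(list(field = name), class = "toy_expr")
toy_scalar <- function(value) structure(list(value = value), class = "toy_expr")

# Wrap every argument that is not already an expression as a scalar
# expression, so callers can pass bare literals such as 250.
toy_wrap_args <- function(...) {
  lapply(list(...), function(arg) {
    if (inherits(arg, "toy_expr")) arg else toy_scalar(arg)
  })
}

# Build a call node from a function name, wrapped arguments, and options.
toy_build_call <- function(func_name, ..., options = list()) {
  structure(
    list(func = func_name, args = toy_wrap_args(...), options = options),
    class = "toy_expr"
  )
}

# A column reference and a literal can now be mixed in one call.
expr_call <- toy_build_call(
  "max_element_wise",
  toy_field("val1"),
  250,
  options = list(skip_nulls = TRUE)
)
```

In this sketch the literal `250` ends up as a scalar expression next to the field reference, which is the behaviour the review comment asks for.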







[GitHub] [arrow] ianmcook commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613

Posted by GitBox <gi...@apache.org>.
ianmcook commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660872194



##########
File path: r/src/compute.cpp
##########
@@ -180,6 +180,15 @@ std::shared_ptr<arrow::compute::FunctionOptions> make_compute_options(
     return out;
   }
 
+  if (func_name == "element_wise_min" || func_name == "element_wise_max") {
+    using Options = arrow::compute::ElementWiseAggregateOptions;
+    bool skip_nulls = false;

Review comment:
       As mentioned here (https://github.com/apache/arrow/blob/57ecc73e6153fea04e0ac0d13792ba0abb0dd779/cpp/src/arrow/compute/kernels/scalar_compare.cc#L472), `skip_nulls = true` is the default, so let's make it the default here too:
   ```suggestion
       bool skip_nulls = true;
   ```







[GitHub] [arrow] ianmcook commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613

Posted by GitBox <gi...@apache.org>.
ianmcook commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660887960



##########
File path: r/tests/testthat/test-dplyr-mutate.R
##########
@@ -418,3 +418,24 @@ test_that("mutate and write_dataset", {
       summarize(mean = mean(integer))
   )
 })
+
+test_that("mutate and pmin/pmax", {
+  df <- tibble(
+    city = c("Chillan", "Valdivia", "Osorno"),
+    val1 = c(200, 300, NA),
+    val2 = c(100, NA, NA),
+    val3 = c(0, NA, NA)
+  )
+
+  expect_dplyr_equal(
+    input %>%
+      mutate(
+        max_val_1 = pmax(val1, val2, val3),
+        max_val_2 = pmax(val1, val2, val3, na.rm = T),
+        min_val_1 = pmin(val1, val2, val3),
+        min_val_2 = pmin(val1, val2, val3, na.rm = T)
+      ) %>%
+      collect(),
+    df
+  )

Review comment:
       Could you add some additional code in this test that uses a mix of:
   - scalar literal values
   - column references
   - expressions
   
   in the `pmin()` and `pmax()` calls? For example:
   
   ```r
   pmax(df$val1, 250, df$val2 * 4, df$val3, 50)
   ```
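
For reference, here is a minimal base R sketch of what such a mixed call evaluates to on the test data from the diff. It runs plain `pmax()` and `pmin()` on a data frame rather than going through Arrow, so it only illustrates the values an Arrow binding would be expected to match. The variable names `mixed_max`, `mixed_max_narm`, and `mixed_min_narm` are made up for this sketch.

```r
# Test data from the PR's test case, as a plain data frame (no Arrow involved).
df <- data.frame(
  val1 = c(200, 300, NA),
  val2 = c(100, NA, NA),
  val3 = c(0, NA, NA)
)

# Mix of column references, scalar literals, and an expression.
# Length-1 literals are recycled across all rows.
mixed_max <- pmax(df$val1, 250, df$val2 * 4, df$val3, 50)
mixed_max_narm <- pmax(df$val1, 250, df$val2 * 4, df$val3, 50, na.rm = TRUE)
mixed_min_narm <- pmin(df$val1, 250, df$val2 * 4, df$val3, 50, na.rm = TRUE)

print(mixed_max)       # 400  NA  NA
print(mixed_max_narm)  # 400 300 250
print(mixed_min_narm)  #   0  50  50
```

With `na.rm = TRUE` the scalar literals guarantee a non-missing result in every row, which makes this kind of call a useful test of literal handling.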
   







[GitHub] [arrow] github-actions[bot] commented on pull request #10615: Arrow12967v3

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#issuecomment-870149499


   <!--
     Licensed to the Apache Software Foundation (ASF) under one
     or more contributor license agreements.  See the NOTICE file
     distributed with this work for additional information
     regarding copyright ownership.  The ASF licenses this file
     to you under the Apache License, Version 2.0 (the
     "License"); you may not use this file except in compliance
     with the License.  You may obtain a copy of the License at
   
       http://www.apache.org/licenses/LICENSE-2.0
   
     Unless required by applicable law or agreed to in writing,
     software distributed under the License is distributed on an
     "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
     KIND, either express or implied.  See the License for the
     specific language governing permissions and limitations
     under the License.
   -->
   
   Thanks for opening a pull request!
   
   If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes), could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW
   
   Opening JIRAs ahead of time contributes to the [Openness](http://theapacheway.com/open/#:~:text=Openness%20allows%20new%20users%20the,must%20happen%20in%20the%20open.) of the Apache Arrow project.
   
   Then could you also rename the pull request title in the following format?
   
       ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
   
   or
   
       MINOR: [${COMPONENT}] ${SUMMARY}
   
   See also:
   
     * [Other pull requests](https://github.com/apache/arrow/pulls/)
     * [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
   








[GitHub] [arrow] github-actions[bot] removed a comment on pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613

Posted by GitBox <gi...@apache.org>.
github-actions[bot] removed a comment on pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#issuecomment-870149499







[GitHub] [arrow] ianmcook commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613

Posted by GitBox <gi...@apache.org>.
ianmcook commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660880091



##########
File path: r/src/compute.cpp
##########
@@ -180,6 +180,15 @@ std::shared_ptr<arrow::compute::FunctionOptions> make_compute_options(
     return out;
   }
 
+  if (func_name == "element_wise_min" || func_name == "element_wise_max") {
+    using Options = arrow::compute::ElementWiseAggregateOptions;
+    bool skip_nulls = false;

Review comment:
       The default of `pmin()` and `pmax()` is `na.rm = FALSE`, and the definitions you added in `dplyr-functions.R` are consistent with that 👍 so that's good. Generally here in `compute.cpp`, the options defaults should be consistent with the defaults of the C++ function options.
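
To make the R-side default concrete, the following base R sketch shows what `na.rm` changes, using the values from the PR's test. It does not touch Arrow. `binding_options()` is a hypothetical helper that only mirrors the `options = list(skip_nulls = na.rm)` mapping from the diff.

```r
val1 <- c(200, 300, NA)
val2 <- c(100, NA, NA)
val3 <- c(0, NA, NA)

# Default na.rm = FALSE: any NA at a position makes the result NA there.
max_default <- pmax(val1, val2, val3)
min_default <- pmin(val1, val2, val3)

# na.rm = TRUE: NAs are skipped; the result is NA only where all inputs are NA.
max_skip <- pmax(val1, val2, val3, na.rm = TRUE)
min_skip <- pmin(val1, val2, val3, na.rm = TRUE)

# Hypothetical helper mirroring the mapping in the diff: the R default
# (na.rm = FALSE) is passed down explicitly as skip_nulls = FALSE.
binding_options <- function(na.rm = FALSE) list(skip_nulls = na.rm)

print(max_default)  # 200  NA  NA
print(max_skip)     # 200 300  NA
print(min_default)  #   0  NA  NA
print(min_skip)     #   0 300  NA
```

Because the R wrapper always passes `skip_nulls` explicitly, the fallback value in `compute.cpp` only matters when a caller omits the option, which is why it can follow the C++ default without changing what `pmin()` and `pmax()` return.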







[GitHub] [arrow] pachadotdev commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613

Posted by GitBox <gi...@apache.org>.
pachadotdev commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660904824



##########
File path: r/R/dplyr-functions.R
##########
@@ -398,6 +398,22 @@ nse_funcs$str_split <- function(string, pattern, n = Inf, simplify = FALSE) {
   )
 }
 
+nse_funcs$pmin <- function(..., na.rm = FALSE) {
+  Expression$create(
+    "element_wise_min",
+    ...,
+    options = list(skip_nulls = na.rm)
+  )
+}
+
+nse_funcs$pmax <- function(..., na.rm = FALSE) {
+  Expression$create(
+    "element_wise_max",

Review comment:
       Yeah... last night it worked.



