Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/06/29 01:04:28 UTC
[GitHub] [arrow] pachadotdev opened a new pull request #10615: Arrow12967v3
pachadotdev opened a new pull request #10615:
URL: https://github.com/apache/arrow/pull/10615
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#issuecomment-870149893
https://issues.apache.org/jira/browse/ARROW-12967
[GitHub] [arrow] ianmcook commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613
Posted by GitBox <gi...@apache.org>.
ianmcook commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660887960
##########
File path: r/tests/testthat/test-dplyr-mutate.R
##########
@@ -418,3 +418,24 @@ test_that("mutate and write_dataset", {
summarize(mean = mean(integer))
)
})
+
+test_that("mutate and pmin/pmax", {
+ df <- tibble(
+ city = c("Chillan", "Valdivia", "Osorno"),
+ val1 = c(200, 300, NA),
+ val2 = c(100, NA, NA),
+ val3 = c(0, NA, NA)
+ )
+
+ expect_dplyr_equal(
+ input %>%
+ mutate(
+ max_val_1 = pmax(val1, val2, val3),
+ max_val_2 = pmax(val1, val2, val3, na.rm = T),
+ min_val_1 = pmin(val1, val2, val3),
+ min_val_2 = pmin(val1, val2, val3, na.rm = T)
+ ) %>%
+ collect(),
+ df
+ )
Review comment:
Could you add some additional code in this test that uses a mix of:
- scalar literal values
- column references
- expressions
in the `pmin()` and `pmax()` calls? For example:
```r
pmax(val1, 250, val2 * 4, val3, 50)
```
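As a point of reference for what such a test should expect, here is a minimal base R sketch (no Arrow involved) of how `pmax()` and `pmin()` treat a mix of vectors, scalar literals, and expressions. The vectors copy the test data above; the result variable names are only illustrative.

```r
# Same values as the tibble in the test above.
val1 <- c(200, 300, NA)
val2 <- c(100, NA, NA)
val3 <- c(0, NA, NA)

# Scalars (250, 50) are recycled to the length of the vectors.
# With the default na.rm = FALSE, any NA in a row makes that row NA.
mixed_default <- pmax(val1, 250, val2 * 4, val3, 50)

# With na.rm = TRUE, NAs are ignored row by row, so the scalars
# guarantee a non-NA result in every row.
mixed_na_rm <- pmax(val1, 250, val2 * 4, val3, 50, na.rm = TRUE)
mixed_min   <- pmin(val1, 250, val2 * 4, val3, 50, na.rm = TRUE)

print(mixed_default)  # 400  NA  NA
print(mixed_na_rm)    # 400 300 250
print(mixed_min)      #   0  50  50
```

The Arrow bindings would be expected to match these values when the same calls are made inside `mutate()`.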
[GitHub] [arrow] pitrou commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613
Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660740294
##########
File path: r/R/dplyr-functions.R
##########
@@ -398,6 +398,22 @@ nse_funcs$str_split <- function(string, pattern, n = Inf, simplify = FALSE) {
)
}
+nse_funcs$pmin <- function(..., na.rm = FALSE) {
+ Expression$create(
+ "element_wise_min",
+ ...,
+ options = list(skip_nulls = na.rm)
+ )
+}
+
+nse_funcs$pmax <- function(..., na.rm = FALSE) {
+ Expression$create(
+ "element_wise_max",
Review comment:
They're called `min_element_wise` and `max_element_wise` actually ;-)
[GitHub] [arrow] pachadotdev commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613
Posted by GitBox <gi...@apache.org>.
pachadotdev commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660889102
##########
File path: r/tests/testthat/test-dplyr-mutate.R
##########
@@ -418,3 +418,24 @@ test_that("mutate and write_dataset", {
summarize(mean = mean(integer))
)
})
+
+test_that("mutate and pmin/pmax", {
+ df <- tibble(
+ city = c("Chillan", "Valdivia", "Osorno"),
+ val1 = c(200, 300, NA),
+ val2 = c(100, NA, NA),
+ val3 = c(0, NA, NA)
+ )
+
+ expect_dplyr_equal(
+ input %>%
+ mutate(
+ max_val_1 = pmax(val1, val2, val3),
+ max_val_2 = pmax(val1, val2, val3, na.rm = T),
+ min_val_1 = pmin(val1, val2, val3),
+ min_val_2 = pmin(val1, val2, val3, na.rm = T)
+ ) %>%
+ collect(),
+ df
+ )
Review comment:
adding
[GitHub] [arrow] pachadotdev commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613
Posted by GitBox <gi...@apache.org>.
pachadotdev commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660884851
##########
File path: r/src/compute.cpp
##########
@@ -180,6 +180,15 @@ std::shared_ptr<arrow::compute::FunctionOptions> make_compute_options(
return out;
}
+ if (func_name == "element_wise_min" || func_name == "element_wise_max") {
+ using Options = arrow::compute::ElementWiseAggregateOptions;
+ bool skip_nulls = false;
Review comment:
Thanks, yes, I went full R to the bottom for the same reason.
[GitHub] [arrow] ianmcook closed pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax()
Posted by GitBox <gi...@apache.org>.
ianmcook closed pull request #10615:
URL: https://github.com/apache/arrow/pull/10615
[GitHub] [arrow] ianmcook commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613
Posted by GitBox <gi...@apache.org>.
ianmcook commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660882828
##########
File path: r/R/dplyr-functions.R
##########
@@ -398,6 +398,22 @@ nse_funcs$str_split <- function(string, pattern, n = Inf, simplify = FALSE) {
)
}
+nse_funcs$pmin <- function(..., na.rm = FALSE) {
+ Expression$create(
Review comment:
This will allow users to pass literal scalar arguments to `pmin()` in dplyr:
```suggestion
build_expr(
```
##########
File path: r/R/dplyr-functions.R
##########
@@ -398,6 +398,22 @@ nse_funcs$str_split <- function(string, pattern, n = Inf, simplify = FALSE) {
)
}
+nse_funcs$pmin <- function(..., na.rm = FALSE) {
+ Expression$create(
+ "element_wise_min",
+ ...,
+ options = list(skip_nulls = na.rm)
+ )
+}
+
+nse_funcs$pmax <- function(..., na.rm = FALSE) {
+ Expression$create(
Review comment:
```suggestion
build_expr(
```
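From the user's side, the effect of accepting literal scalars might look like the sketch below. It assumes the `arrow` and `dplyr` packages are installed and that the installed `arrow` version includes the `pmin()`/`pmax()` bindings; the column name `capped` is made up for illustration, and the block is skipped when either package is missing.

```r
# Assumption: arrow and dplyr are installed; otherwise nothing is run.
if (requireNamespace("arrow", quietly = TRUE) &&
    requireNamespace("dplyr", quietly = TRUE)) {
  library(arrow, warn.conflicts = FALSE)
  library(dplyr, warn.conflicts = FALSE)

  df <- data.frame(
    val1 = c(200, 300, NA),
    val2 = c(100, NA, NA)
  )

  # A column reference, a scalar literal, and an expression in one call.
  result <- Table$create(df) %>%
    mutate(capped = pmax(val1, 250, val2 * 4, na.rm = TRUE)) %>%
    collect()

  print(result)
}
```

In base R the same `pmax()` call yields 400, 300, 250 for the three rows, so that is the result the binding should reproduce.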
[GitHub] [arrow] ianmcook commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613
Posted by GitBox <gi...@apache.org>.
ianmcook commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660872194
##########
File path: r/src/compute.cpp
##########
@@ -180,6 +180,15 @@ std::shared_ptr<arrow::compute::FunctionOptions> make_compute_options(
return out;
}
+ if (func_name == "element_wise_min" || func_name == "element_wise_max") {
+ using Options = arrow::compute::ElementWiseAggregateOptions;
+ bool skip_nulls = false;
Review comment:
As mentioned here... https://github.com/apache/arrow/blob/57ecc73e6153fea04e0ac0d13792ba0abb0dd779/cpp/src/arrow/compute/kernels/scalar_compare.cc#L472... `skip_nulls = true` is the default, so let's make it the default here too:
```suggestion
bool skip_nulls = true;
```
[GitHub] [arrow] ianmcook commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613
Posted by GitBox <gi...@apache.org>.
ianmcook commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660887960
##########
File path: r/tests/testthat/test-dplyr-mutate.R
##########
@@ -418,3 +418,24 @@ test_that("mutate and write_dataset", {
summarize(mean = mean(integer))
)
})
+
+test_that("mutate and pmin/pmax", {
+ df <- tibble(
+ city = c("Chillan", "Valdivia", "Osorno"),
+ val1 = c(200, 300, NA),
+ val2 = c(100, NA, NA),
+ val3 = c(0, NA, NA)
+ )
+
+ expect_dplyr_equal(
+ input %>%
+ mutate(
+ max_val_1 = pmax(val1, val2, val3),
+ max_val_2 = pmax(val1, val2, val3, na.rm = T),
+ min_val_1 = pmin(val1, val2, val3),
+ min_val_2 = pmin(val1, val2, val3, na.rm = T)
+ ) %>%
+ collect(),
+ df
+ )
Review comment:
Could you add some additional code in this test that uses a mix of:
- scalar literal values
- column references
- expressions
in the `pmin()` and `pmax()` calls? For example:
```r
pmax(df$val1, 250, df$val2 * 4, df$val3, 50)
```
[GitHub] [arrow] github-actions[bot] commented on pull request #10615: Arrow12967v3
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#issuecomment-870149499
Thanks for opening a pull request!
If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes), could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW
Opening JIRAs ahead of time contributes to the [Openness](http://theapacheway.com/open/#:~:text=Openness%20allows%20new%20users%20the,must%20happen%20in%20the%20open.) of the Apache Arrow project.
Then could you also rename the pull request title in the following format?
ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
or
MINOR: [${COMPONENT}] ${SUMMARY}
See also:
* [Other pull requests](https://github.com/apache/arrow/pulls/)
* [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
[GitHub] [arrow] github-actions[bot] commented on pull request #10615: Arrow12967v3
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#issuecomment-870149499
[GitHub] [arrow] github-actions[bot] removed a comment on pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613
Posted by GitBox <gi...@apache.org>.
github-actions[bot] removed a comment on pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#issuecomment-870149499
[GitHub] [arrow] ianmcook commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613
Posted by GitBox <gi...@apache.org>.
ianmcook commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660880091
##########
File path: r/src/compute.cpp
##########
@@ -180,6 +180,15 @@ std::shared_ptr<arrow::compute::FunctionOptions> make_compute_options(
return out;
}
+ if (func_name == "element_wise_min" || func_name == "element_wise_max") {
+ using Options = arrow::compute::ElementWiseAggregateOptions;
+ bool skip_nulls = false;
Review comment:
The default of `pmin()` and `pmax()` is `na.rm = FALSE`, and the definitions you added in `dplyr-functions.R` are consistent with that 👍 so that's good. Generally here in `compute.cpp`, the options defaults should be consistent with the defaults of the C++ function options.
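To make the two layers of defaults concrete, here is a base R sketch. Both function names are hypothetical stand-ins, not Arrow APIs: one emulates a kernel whose `skip_nulls` defaults to `TRUE`, the other emulates the R-facing binding whose `na.rm` defaults to `FALSE` and is always passed down explicitly.

```r
# Hypothetical stand-in for the C++ kernel: skip_nulls defaults to TRUE.
max_element_wise_sketch <- function(..., skip_nulls = TRUE) {
  pmax(..., na.rm = skip_nulls)
}

# Hypothetical stand-in for the R binding: na.rm defaults to FALSE,
# matching base::pmax(), and overrides the kernel default on every call.
pmax_binding_sketch <- function(..., na.rm = FALSE) {
  max_element_wise_sketch(..., skip_nulls = na.rm)
}

val1 <- c(200, 300, NA)
val2 <- c(100, NA, NA)

kernel_default  <- max_element_wise_sketch(val1, val2)  # nulls skipped
binding_default <- pmax_binding_sketch(val1, val2)      # nulls propagate

print(kernel_default)   # 200 300  NA
print(binding_default)  # 200  NA  NA
```

Because the binding always sets the option, the lower-level default never changes what `pmin()` and `pmax()` users see. It only matters for callers that invoke the kernel without options.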
[GitHub] [arrow] pachadotdev commented on a change in pull request #10615: ARROW-12967: [R] Add bindings for pmin() and pmax() #10497 #10613
Posted by GitBox <gi...@apache.org>.
pachadotdev commented on a change in pull request #10615:
URL: https://github.com/apache/arrow/pull/10615#discussion_r660904824
##########
File path: r/R/dplyr-functions.R
##########
@@ -398,6 +398,22 @@ nse_funcs$str_split <- function(string, pattern, n = Inf, simplify = FALSE) {
)
}
+nse_funcs$pmin <- function(..., na.rm = FALSE) {
+ Expression$create(
+ "element_wise_min",
+ ...,
+ options = list(skip_nulls = na.rm)
+ )
+}
+
+nse_funcs$pmax <- function(..., na.rm = FALSE) {
+ Expression$create(
+ "element_wise_max",
Review comment:
Yeah... last night it worked.