You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/11 12:58:36 UTC

[GitHub] [arrow] dragosmg opened a new pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

dragosmg opened a new pull request #12402:
URL: https://github.com/apache/arrow/pull/12402


   This PR aligns arrow's binding to `strptime()` to `base::strptime()` when the value passed to the `format` argument does not match the data. Currently arrow errors, when it should return `NA`. 
   
   ``` r
   suppressMessages(library(lubridate))
   suppressMessages(library(arrow))
   suppressMessages(library(dplyr))
   
   df <- tibble(x = "2022-02-11")
   
   df %>% 
     mutate(z = strptime(x, format = "%Y-%m %d"))
   #> # A tibble: 1 × 2
   #>   x          z     
   #>   <chr>      <dttm>
   #> 1 2022-02-11 NA
   
   df %>% 
     record_batch() %>% 
     mutate(z = strptime(x, format = "%Y-%m %d")) %>% 
     collect()
   #> Error: Invalid: Failed to parse string: '2022-02-11' as a scalar of type timestamp[ms]
   ```
   
   <sup>Created on 2022-02-11 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane closed pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
jonkeane closed pull request #12402:
URL: https://github.com/apache/arrow/pull/12402


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Finished :arrow_down:0.21% :arrow_up:0.04%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Finished :arrow_down:0.04% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/419| `ba04e7f8` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/405| `ba04e7f8` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/415| `ba04e7f8` ursa-thinkcentre-m75q>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/418| `64560af6` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/404| `64560af6` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/414| `64560af6` ursa-thinkcentre-m75q>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1078776962


   I think several things are needed:
   
   - [ ] new unit tests to show similar behaviour (returning `NA` in the same circumstances)
   - [ ] update existing unit tests. These are not using `compare_dplyr_binding()` and are not catching the fact that Arrow's `strptime` (`POSIXct`, a double vector) binding returns a different class that `base::strptime()` (`POSIXlt`, a list)
        
    `waldo::compare()` does ignore `class` when `ignore_attr = TRUE`, but it can't ignore `typeof`. I don't think we want to change the type of object Arrow's `strptime` binding returns. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on a change in pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
jonkeane commented on a change in pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#discussion_r836771857



##########
File path: r/R/dplyr-funcs-datetime.R
##########
@@ -31,7 +31,7 @@ register_bindings_datetime <- function() {
 
     unit <- make_valid_time_unit(unit, c(valid_time64_units, valid_time32_units))
 
-    Expression$create("strptime", x, options = list(format = format, unit = unit))
+    build_expr("strptime", x, options = list(format = format, unit = unit, error_is_null = TRUE))
   })

Review comment:
       Are you thinking of exposing it as an argument in our `strptime` binding? 
   
   We probably don't need that, since someone could do somethign like `arrow_strptime(..., error_is_null = TRUE)` to get that behavior. If we were to expose this to end users, I would recommend doing it via an option, so that all of the various places that use `strptime` in our bindings would all work the same — though even that I'm a bit doubtful someone will want to control. So I would say let's wait for someone to ask for it before implementing it




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Scheduled] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/419| `ba04e7f8` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/418| `64560af6` ec2-t3-xlarge-us-east-2>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on a change in pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
jonkeane commented on a change in pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#discussion_r837694040



##########
File path: r/tests/testthat/test-dplyr-funcs-datetime.R
##########
@@ -118,6 +118,51 @@ test_that("errors in strptime", {
   )
 })
 
+test_that("strptime returns NA when format doesn't match the data", {
+  df <- tibble(
+    str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07")

Review comment:
       should we have an `NA` here too?

##########
File path: r/tests/testthat/test-dplyr-funcs-datetime.R
##########
@@ -118,6 +118,51 @@ test_that("errors in strptime", {
   )
 })
 
+test_that("strptime returns NA when format doesn't match the data", {
+  df <- tibble(
+    str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07")
+  )
+
+  expect_equal(
+    df %>%
+      arrow_table() %>%
+      mutate(
+        r_obj_parsed_date = strptime("03-27/2022", format = "%m-%d/%Y"),
+        r_obj_parsed_na = strptime("03-27/2022", format = "Y%-%m-%d")) %>%
+      collect(),
+    tibble(
+      str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07"),
+      r_obj_parsed_date = as.POSIXct(rep("2022-03-27", 4)),
+      r_obj_parsed_na = as.POSIXct(rep(NA, 4))
+    ),
+    ignore_attr = "tzone"
+  )

Review comment:
       Besides tzone, we expect dplyr and arrow execution methods to be the same now, yeah? Could we use `compare_dplyr_binding`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1036187847






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg commented on a change in pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg commented on a change in pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#discussion_r837852972



##########
File path: r/tests/testthat/test-dplyr-funcs-datetime.R
##########
@@ -118,6 +118,51 @@ test_that("errors in strptime", {
   )
 })
 
+test_that("strptime returns NA when format doesn't match the data", {
+  df <- tibble(
+    str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07")
+  )
+
+  expect_equal(
+    df %>%
+      arrow_table() %>%
+      mutate(
+        r_obj_parsed_date = strptime("03-27/2022", format = "%m-%d/%Y"),
+        r_obj_parsed_na = strptime("03-27/2022", format = "Y%-%m-%d")) %>%
+      collect(),
+    tibble(
+      str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07"),
+      r_obj_parsed_date = as.POSIXct(rep("2022-03-27", 4)),
+      r_obj_parsed_na = as.POSIXct(rep(NA, 4))
+    ),
+    ignore_attr = "tzone"
+  )

Review comment:
       Nope, `base::strptime()` and the Arrow binding return different types. `ignore_attr` cannot help here since that covers only classes, not types. `base::strptime()` returns a list (`POSIXlt`), while the Arrow binding returns a `double` (`POSIXct`) vector.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg commented on a change in pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg commented on a change in pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#discussion_r837855927



##########
File path: r/tests/testthat/test-dplyr-funcs-datetime.R
##########
@@ -118,6 +118,51 @@ test_that("errors in strptime", {
   )
 })
 
+test_that("strptime returns NA when format doesn't match the data", {
+  df <- tibble(
+    str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07")

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot commented on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot commented on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Scheduled] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Scheduled] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg commented on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg commented on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1078776962


   I think several things are needed here:
   
   - [ ] new unit tests to show similar behaviour (returning `NA` in the same circumstances)
   
   - [ ] update existing unit tests. These are not using `compare_dplyr_binding()` and are not catching the fact that Arrow's `strptime` (`datetime`) binding returns a different class that `base::strptime()` (`POSIXlt`)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1078776962


   I think several things are needed:
   
   - [ ] new unit tests to show similar behaviour (returning `NA` in the same circumstances)
   - [ ] update existing unit tests. These are not using `compare_dplyr_binding()` and are not catching the fact that Arrow's `strptime` binding returns an object of a different class (`POSIXct`, a double vector) that `base::strptime()` (`POSIXlt`, a list)
        
    `waldo::compare()` does ignore `class` when `ignore_attr = TRUE`, but it can't ignore `typeof`. I'm not sure we want to change the type of object Arrow's `strptime` binding returns. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg commented on a change in pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg commented on a change in pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#discussion_r837852972



##########
File path: r/tests/testthat/test-dplyr-funcs-datetime.R
##########
@@ -118,6 +118,51 @@ test_that("errors in strptime", {
   )
 })
 
+test_that("strptime returns NA when format doesn't match the data", {
+  df <- tibble(
+    str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07")
+  )
+
+  expect_equal(
+    df %>%
+      arrow_table() %>%
+      mutate(
+        r_obj_parsed_date = strptime("03-27/2022", format = "%m-%d/%Y"),
+        r_obj_parsed_na = strptime("03-27/2022", format = "Y%-%m-%d")) %>%
+      collect(),
+    tibble(
+      str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07"),
+      r_obj_parsed_date = as.POSIXct(rep("2022-03-27", 4)),
+      r_obj_parsed_na = as.POSIXct(rep(NA, 4))
+    ),
+    ignore_attr = "tzone"
+  )

Review comment:
       Nope, `base::strptime()` and the Arrow binding return different types. `ignore_attr` cannot help here since that covers only classes, not types. `base::strptime()` returns a list (`POSIXlt`), while the Arrow binding returns a `double` (`POSIXct`) vector => `compare_dplyr_binding()` would always fail.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Scheduled] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Finished :arrow_down:0.04% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/419| `ba04e7f8` ec2-t3-xlarge-us-east-2>
   [Scheduled] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/405| `ba04e7f8` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/415| `ba04e7f8` ursa-thinkcentre-m75q>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/418| `64560af6` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/404| `64560af6` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/414| `64560af6` ursa-thinkcentre-m75q>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Finished :arrow_down:0.21% :arrow_up:0.04%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Finished :arrow_down:0.04% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/419| `ba04e7f8` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/405| `ba04e7f8` test-mac-arm>
   [Scheduled] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/405| `ba04e7f8` ursa-i9-9960x>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/415| `ba04e7f8` ursa-thinkcentre-m75q>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/418| `64560af6` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/404| `64560af6` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/404| `64560af6` ursa-i9-9960x>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/414| `64560af6` ursa-thinkcentre-m75q>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg commented on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg commented on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1080618740


   @jonkeane what do you think? Is the type of the object returned by `strptime` of concern? `base::strptime()` returns a list (`POSIXlt`), while the Arrow one returns an atomic vector. We hadn't caught this before as the unit tests for `strptime` did not compare the {dplyr} and {arrow} pipelines.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1078776962


   I think several things are needed:
   
   - [ ] new unit tests to show similar behaviour (returning `NA` in the same circumstances)
   - [ ] update existing unit tests. These are not using `compare_dplyr_binding()` and are not catching the fact that Arrow's `strptime` (`datetime`) binding returns a different class that `base::strptime()` (`POSIXlt`)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Scheduled] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Failed] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Scheduled] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on a change in pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
jonkeane commented on a change in pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#discussion_r837926108



##########
File path: r/tests/testthat/test-dplyr-funcs-datetime.R
##########
@@ -118,6 +118,51 @@ test_that("errors in strptime", {
   )
 })
 
+test_that("strptime returns NA when format doesn't match the data", {
+  df <- tibble(
+    str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07")
+  )
+
+  expect_equal(
+    df %>%
+      arrow_table() %>%
+      mutate(
+        r_obj_parsed_date = strptime("03-27/2022", format = "%m-%d/%Y"),
+        r_obj_parsed_na = strptime("03-27/2022", format = "Y%-%m-%d")) %>%
+      collect(),
+    tibble(
+      str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07"),
+      r_obj_parsed_date = as.POSIXct(rep("2022-03-27", 4)),
+      r_obj_parsed_na = as.POSIXct(rep(NA, 4))
+    ),
+    ignore_attr = "tzone"
+  )

Review comment:
       Ah right right. It's a bit more verbose, but what about doing something like:
   
   ```
   expect_equal(
       df %>%
         arrow_table() %>%
         mutate(
           r_obj_parsed_date = strptime("03-27/2022", format = "%m-%d/%Y"),
           r_obj_parsed_na = strptime("03-27/2022", format = "Y%-%m-%d")) %>%
         collect(),
       df %>%
         arrow_table() %>%
         mutate(
           r_obj_parsed_date = as.POSIXct(strptime("03-27/2022", format = "%m-%d/%Y")),
           r_obj_parsed_na = as.POSIXct(strptime("03-27/2022", format = "Y%-%m-%d")))
       ignore_attr = "tzone"
     )
   ```
   
   Along with a comment explaining what's going on?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg commented on a change in pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg commented on a change in pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#discussion_r838306983



##########
File path: r/tests/testthat/test-dplyr-funcs-datetime.R
##########
@@ -118,6 +118,51 @@ test_that("errors in strptime", {
   )
 })
 
+test_that("strptime returns NA when format doesn't match the data", {
+  df <- tibble(
+    str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07")
+  )
+
+  expect_equal(
+    df %>%
+      arrow_table() %>%
+      mutate(
+        r_obj_parsed_date = strptime("03-27/2022", format = "%m-%d/%Y"),
+        r_obj_parsed_na = strptime("03-27/2022", format = "Y%-%m-%d")) %>%
+      collect(),
+    tibble(
+      str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07"),
+      r_obj_parsed_date = as.POSIXct(rep("2022-03-27", 4)),
+      r_obj_parsed_na = as.POSIXct(rep(NA, 4))
+    ),
+    ignore_attr = "tzone"
+  )

Review comment:
       Sure, makes a lot of sense (given what `compare_dplyr_binding()` does. I've updated the tests. 1 of 3 uses `record_batch()` and 2 (of 3) use `arrow_table()`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg commented on a change in pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg commented on a change in pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#discussion_r837852972



##########
File path: r/tests/testthat/test-dplyr-funcs-datetime.R
##########
@@ -118,6 +118,51 @@ test_that("errors in strptime", {
   )
 })
 
+test_that("strptime returns NA when format doesn't match the data", {
+  df <- tibble(
+    str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07")
+  )
+
+  expect_equal(
+    df %>%
+      arrow_table() %>%
+      mutate(
+        r_obj_parsed_date = strptime("03-27/2022", format = "%m-%d/%Y"),
+        r_obj_parsed_na = strptime("03-27/2022", format = "Y%-%m-%d")) %>%
+      collect(),
+    tibble(
+      str_date = c("2022-02-07", "2012/02-07", "1975/01-02", "1981/01-07"),
+      r_obj_parsed_date = as.POSIXct(rep("2022-03-27", 4)),
+      r_obj_parsed_na = as.POSIXct(rep(NA, 4))
+    ),
+    ignore_attr = "tzone"
+  )

Review comment:
       Nope, `base::strptime()` and the arrow binding return different types. `ignore_attr` cannot help here since that covers only classes, not types. `base::strptime()` returns a list (`POSIXlt`), while the Arrow binding returns a `double` (`POSIXct`) vector.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Scheduled] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/419| `ba04e7f8` ec2-t3-xlarge-us-east-2>
   [Scheduled] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/415| `ba04e7f8` ursa-thinkcentre-m75q>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/418| `64560af6` ec2-t3-xlarge-us-east-2>
   [Scheduled] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/404| `64560af6` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/414| `64560af6` ursa-thinkcentre-m75q>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jonkeane commented on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1080689370


   > @jonkeane what do you think? Is the type of the object returned by strptime of concern? base::strptime() returns a list (POSIXlt), while the Arrow one returns an atomic vector. We hadn't caught this before as the unit tests for strptime did not compare the {dplyr} and {arrow} pipelines.
   
   There are a few comments about this in the tests that reference this — I don't think it was unknown. POSIXlt is pretty unique to R (I'm sure there were other languages that have similar types), but they aren't super wide-spread, and Arrow doesn't support them directly (though you can see in the R package we make a (pseudo)-extension class for them).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Scheduled] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/419| `ba04e7f8` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/418| `64560af6` ec2-t3-xlarge-us-east-2>
   [Scheduled] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/414| `64560af6` ursa-thinkcentre-m75q>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Finished :arrow_down:0.21% :arrow_up:0.04%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Finished :arrow_down:0.04% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/419| `ba04e7f8` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/405| `ba04e7f8` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/415| `ba04e7f8` ursa-thinkcentre-m75q>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/418| `64560af6` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/404| `64560af6` test-mac-arm>
   [Scheduled] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/404| `64560af6` ursa-i9-9960x>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/414| `64560af6` ursa-thinkcentre-m75q>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Finished :arrow_down:0.21% :arrow_up:0.04%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Failed :arrow_down:0.0% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Finished :arrow_down:0.04% :arrow_up:0.0%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/419| `ba04e7f8` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/405| `ba04e7f8` test-mac-arm>
   [Failed] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/405| `ba04e7f8` ursa-i9-9960x>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/415| `ba04e7f8` ursa-thinkcentre-m75q>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/418| `64560af6` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/404| `64560af6` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/404| `64560af6` ursa-i9-9960x>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/414| `64560af6` ursa-thinkcentre-m75q>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg commented on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg commented on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1080796317


   This needs a rebase once #12732 gets merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1078776962


   I think several things are needed:
   
   - [ ] new unit tests to show similar behaviour (returning `NA` in the same circumstances)
   - [ ] update existing unit tests. These are not using `compare_dplyr_binding()` and are not catching the fact that Arrow's `strptime` (`POSIXct`, a double vector) binding returns a different class that `base::strptime()` (`POSIXlt`, a list)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1078776962


   I think several things are needed:
   
   - [ ] new unit tests to show similar behaviour (returning `NA` in the same circumstances)
   - [ ] update existing unit tests. These are not using `compare_dplyr_binding()` and are not catching the fact that Arrow's `strptime` (`POSIXct`) binding returns a different class that `base::strptime()` (`POSIXlt`)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Scheduled] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg commented on a change in pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg commented on a change in pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#discussion_r836560516



##########
File path: r/R/dplyr-funcs-datetime.R
##########
@@ -31,7 +31,7 @@ register_bindings_datetime <- function() {
 
     unit <- make_valid_time_unit(unit, c(valid_time64_units, valid_time32_units))
 
-    Expression$create("strptime", x, options = list(format = format, unit = unit))
+    build_expr("strptime", x, options = list(format = format, unit = unit, error_is_null = TRUE))
   })

Review comment:
       @jonkeane what do you think: should we expose `error_is_null` as argument? It might prove useful. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg commented on a change in pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg commented on a change in pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#discussion_r836793198



##########
File path: r/R/dplyr-funcs-datetime.R
##########
@@ -31,7 +31,7 @@ register_bindings_datetime <- function() {
 
     unit <- make_valid_time_unit(unit, c(valid_time64_units, valid_time32_units))
 
-    Expression$create("strptime", x, options = list(format = format, unit = unit))
+    build_expr("strptime", x, options = list(format = format, unit = unit, error_is_null = TRUE))
   })

Review comment:
       For a moment I forgot we can always call the arrow function if we need it. So 100% agree with you that we do not need to expose the arg to end users.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] dragosmg edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
dragosmg edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1078776962


   I think several things are needed:
   
   - [ ] new unit tests to show similar behaviour (returning `NA` in the same circumstances)
   - [ ] update existing unit tests. These are not using `compare_dplyr_binding()` and are not catching the fact that Arrow's `strptime` binding returns an object of a different class (`POSIXct`, a double vector) that `base::strptime()` (`POSIXlt`, a list)
        
    `waldo::compare()` does ignore `class` when `ignore_attr = TRUE`, but it can't ignore `typeof`. I don't think we want to change the type of object Arrow's `strptime` binding returns. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Scheduled] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/419| `ba04e7f8` ec2-t3-xlarge-us-east-2>
   [Scheduled] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/405| `ba04e7f8` test-mac-arm>
   [Scheduled] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/415| `ba04e7f8` ursa-thinkcentre-m75q>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/418| `64560af6` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/404| `64560af6` test-mac-arm>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/414| `64560af6` ursa-thinkcentre-m75q>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #12402: ARROW-15659 [R] strptime should return NA (not error) with format mismatch

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #12402:
URL: https://github.com/apache/arrow/pull/12402#issuecomment-1083179426


   Benchmark runs are scheduled for baseline = 64560af62391cb8823b3af3ac60991c0de59bac3 and contender = ba04e7f8573ffd42442ffa589e4cbe61256ece7b. ba04e7f8573ffd42442ffa589e4cbe61256ece7b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/2ea90f0401e44582ac091285c82cb965...02f4a1b7f81647fbb82c571eb46fc581/)
   [Scheduled] [test-mac-arm](https://conbench.ursa.dev/compare/runs/64e85f71a39e4149bd46ed1464838489...708d93d7782d4ee297935b888dff5661/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/86dbc79ccb4448f282852e653ac6dc9c...9b30807d47774a6cb57a55d7f114c33c/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/98caf188c9594c74b23162913d95aba3...e3c219bb41ad477a9ad0b13a5da26bfb/)
   Buildkite builds:
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/419| `ba04e7f8` ec2-t3-xlarge-us-east-2>
   [Finished] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/418| `64560af6` ec2-t3-xlarge-us-east-2>
   [Scheduled] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/404| `64560af6` test-mac-arm>
   [Scheduled] <https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/414| `64560af6` ursa-thinkcentre-m75q>
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org