You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/03 12:40:32 UTC

[GitHub] [arrow] jonkeane commented on a diff in pull request #12589: ARROW-14848: [R] Implement bindings for lubridate's parse_date_time

jonkeane commented on code in PR #12589:
URL: https://github.com/apache/arrow/pull/12589#discussion_r863731609


##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -1471,3 +1471,128 @@ test_that("make_difftime()", {
     )
   )
 })
+
+test_that("parse_date_time()", {
+  test_df <- tibble(
+    dates = c("09-01-17", "02-Sep-17")
+  )
+
+  test_dates <- tibble::tibble(
+    string_ymd = c(
+      "2021-09-1", "2021/09///2", "2021.09.03", "2021,09,4", "2021:09::5",
+      "2021 09   6", "21-09-07", "21/09/08", "21.09.9", "21,09,10", "21:09:11",
+      # not yet working for strings with no separators, like "20210917", "210918" or "2021Sep19
+      # no separators and %b or %B are even more complicated (and they work in
+      # lubridate). not to mention locale
+      NA
+    ),
+    string_dmy = c(
+      "1-09-2021", "2/09//2021", "03.09.2021", "04,09,2021", "5:::09:2021",
+      "6  09  2021", "07-09-21", "08/09/21", "9.09.21", "10,09,21", "11:09:21",
+      # not yet working for strings with no separators, like "10092021", "100921",
+      NA
+    ),
+    string_mdy = c(
+      "09-01-2021", "09/2/2021", "09.3.2021", "09,04,2021", "09:05:2021",
+      "09 6 2021", "09-7-21", "09/08/21", "09.9.21", "09,10,21", "09:11:21",
+      # not yet working for strings with no separators, like "09102021", "091021",
+      NA
+    )
+  )
+
+  test_dates_locale <- tibble::tibble(
+    string_ymd = c(
+      "2021 Sep 12", "2021 September 13", "21 Sep 14", "21 September 15", NA
+    ),
+    string_dmy = c(
+      "12 Sep 2021", "13 September 2021", "14 Sep 21", "15 September 21", NA
+    ),
+    string_mdy = c(
+      "Sep 12 2021", "September 13 2021", "Sep 14 21", "September 15 21", NA
+    )
+  )
+
+  skip_if_not_available("re2")
+  compare_dplyr_binding(
+    .input %>%
+      mutate(
+        parsed_date_ymd = parse_date_time(string_ymd, orders = "ymd"),
+        parsed_date_dmy = parse_date_time(string_dmy, orders = "dmy"),
+        parsed_date_mdy = parse_date_time(string_mdy, orders = "mdy")
+      ) %>%
+      collect(),
+    test_dates
+  )
+
+  # locale not working properly on windows
+  # TODO revisit once https://issues.apache.org/jira/browse/ARROW-16399 is done
+  skip_on_os("windows")

Review Comment:
   Yes, sorry I was looking at the original failures that had values for both `test_dates` and `test_dates_locale` mixed together, but looking at this again you do test just `test_dates_locale` here which is good. Also 💯 on getting a Jira for this issue + linking it.
   
   And honestly — to make this even more obvious — since you only use `test_dates` and `test_dates_locale` you might consider using those definitions directly as the `tbl` argument for `compare_dplyr_binding`. That way one doesn't have to scroll up to see what's in either of them. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org