You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/16 10:26:42 UTC

[GitHub] [arrow] dragosmg opened a new pull request, #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

dragosmg opened a new pull request, #13163:
URL: https://github.com/apache/arrow/pull/13163

   The `ym()`, `my()` and `yq()` bindings will make the following possible (and identical):
   
   ``` r
   library(arrow, warn.conflicts = FALSE)
   library(dplyr, warn.conflicts = FALSE)
   library(lubridate, warn.conflicts = FALSE)
   
   test_df <- tibble::tibble(
     ym_string = c("2022-05", "2022/02", "22.03", NA)
   )
   
   test_df %>% 
     mutate(ym_date = ym(ym_string))
   #> # A tibble: 4 × 2
   #>   ym_string ym_date   
   #>   <chr>     <date>    
   #> 1 2022-05   2022-05-01
   #> 2 2022/02   2022-02-01
   #> 3 22.03     2022-03-01
   #> 4 <NA>      NA
   
   test_df %>% 
     arrow_table() %>% 
     mutate(ym_date = ym(ym_string)) %>% 
     collect()
   #> # A tibble: 4 × 2
   #>   ym_string ym_date   
   #>   <chr>     <date>    
   #> 1 2022-05   2022-05-01
   #> 2 2022/02   2022-02-01
   #> 3 22.03     2022-03-01
   #> 4 <NA>      NA
   ```
   
   <sup>Created on 2022-05-16 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup> 
   
   I've implementing this with the following steps:
   * add `"-01"` to the end of the strings we're trying to parse, and then
   * use one the supported `orders` (`"ymd"` or `"myd"`)  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r874765215


##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -493,11 +493,29 @@ register_bindings_datetime_parsers <- function() {
     # each order is translated into possible formats
     formats <- build_formats(orders)
 
+    x <- x$cast(string())
+
     # make all separators (non-letters and non-numbers) into "-"
     x <- call_binding("gsub", "[^A-Za-z0-9]", "-", x)
     # collapse multiple separators into a single one
     x <- call_binding("gsub", "-{2,}", "-", x)
 
+    # add a day (01) for "ym" and "my" orders
+    augmented_x <- NULL
+    if (any(orders %in% c("ym", "my"))) {
+      augmented_x <- call_binding("paste0", x, "-01")
+    }
+
+    augmented_x2 <- NULL
+    if (any(orders == "yq")) {
+      quarter_x <- call_binding("gsub", "^.*?-", "", x)

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r874758408


##########
r/R/dplyr-datetime-helpers.R:
##########
@@ -159,6 +159,21 @@ build_formats <- function(orders) {
   orders <- gsub("[^A-Za-z_]", "", orders)
   orders <- gsub("Y", "y", orders)
 
+  short_orders <- c("ym", "my")
+
+  if (any(orders %in% short_orders)) {
+    orders1 <- setdiff(orders, short_orders)
+    orders2 <- intersect(orders, short_orders)
+    orders2 <- paste0(orders2, "d")
+    orders <- unique(c(orders1, orders2))
+  }
+
+  if (any(orders == "yq")) {
+    orders1 <- setdiff(orders, "yq")
+    orders2 <- "ymd"

Review Comment:
   Done. Added a longer comment that covers both `yq` + `ym` & `my`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] thisisnic commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
thisisnic commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r874979021


##########
r/R/dplyr-datetime-helpers.R:
##########
@@ -159,6 +159,27 @@ build_formats <- function(orders) {
   orders <- gsub("[^A-Za-z_]", "", orders)
   orders <- gsub("Y", "y", orders)
 
+  # we need a different logic in order to deal with "ym', "my", and "yq" orders
+  # we separate them from the rest of the `orders` vector and transform them.
+  # `ym` and `yq` become `ymd` & `my` becomes `myd`
+  # this is needed because strptime does not parse "2022-05", so we add "-01",
+  # thus changing the format, and for equivalence with lubridate, which parses
+  # `ym` to the first day of the month

Review Comment:
   This is great and explains things well - I might consider cutting out a few words in the name of brevity but happy to approve with or without changing it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ursabot commented on pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
ursabot commented on PR #13163:
URL: https://github.com/apache/arrow/pull/13163#issuecomment-1130272004

   Benchmark runs are scheduled for baseline = 0742f78a27fa9d7882c7a840c117824e03bee82d and contender = 60f6caf9b19d145757baac553506d150720728b2. 60f6caf9b19d145757baac553506d150720728b2 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/30ea12972bbb424ca04658a9c68c0f6c...ed2cd8bd3ea04f7da50c5f1ad58ad548/)
   [Failed :arrow_down:2.06% :arrow_up:1.48%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/5c515d2bbb724ce79d2514dbf35b3e55...c9e9e2ed081d427eb6e2fc91ffaeb9a6/)
   [Failed :arrow_down:0.0% :arrow_up:11.03%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/8fb73ad102f44912b285123fe1017e74...9bcd562baeb447acad82d2ab974687a6/)
   [Finished :arrow_down:1.38% :arrow_up:0.28%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/fa449ef4e5254addb8e8b37f2540bcaf...8238b3d1a06246e1b914d3674841b923/)
   Buildkite builds:
   [Finished] [`60f6caf9` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/784)
   [Failed] [`60f6caf9` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/781)
   [Failed] [`60f6caf9` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/771)
   [Finished] [`60f6caf9` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/787)
   [Finished] [`0742f78a` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/783)
   [Failed] [`0742f78a` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/780)
   [Failed] [`0742f78a` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/770)
   [Finished] [`0742f78a` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/786)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] thisisnic closed pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
thisisnic closed pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers
URL: https://github.com/apache/arrow/pull/13163


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r875206246


##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -514,6 +539,41 @@ register_bindings_datetime_parsers <- function() {
       )
     }
 
+    # build separate expression lists of parsing attempts for the orders that
+    # need an augmented `x`
+    # list for attempts when orders %in% c("ym", "my")
+    parse_attempt_exp_augmented_x <- list()
+
+    if (!is.null(augmented_x)) {
+      for (i in seq_along(formats)) {

Review Comment:
   Done. `purrr::map()` was sufficient. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r874990628


##########
r/R/dplyr-datetime-helpers.R:
##########
@@ -159,6 +159,27 @@ build_formats <- function(orders) {
   orders <- gsub("[^A-Za-z_]", "", orders)
   orders <- gsub("Y", "y", orders)
 
+  # we need a different logic in order to deal with "ym', "my", and "yq" orders
+  # we separate them from the rest of the `orders` vector and transform them.
+  # `ym` and `yq` become `ymd` & `my` becomes `myd`
+  # this is needed because strptime does not parse "2022-05", so we add "-01",
+  # thus changing the format, and for equivalence with lubridate, which parses
+  # `ym` to the first day of the month

Review Comment:
   I'll have a look later and try to make it a bit more concise. Thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] thisisnic commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
thisisnic commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r874578997


##########
r/R/dplyr-datetime-helpers.R:
##########
@@ -176,7 +191,8 @@ build_formats <- function(orders) {
   }
 
   formats_list <- map(orders, build_format_from_order)
-  purrr::flatten_chr(formats_list)
+  formats <- purrr::flatten_chr(formats_list)

Review Comment:
   Ooh, I didn't know `purrr` had this function, nice!



##########
r/R/dplyr-datetime-helpers.R:
##########
@@ -159,6 +159,21 @@ build_formats <- function(orders) {
   orders <- gsub("[^A-Za-z_]", "", orders)
   orders <- gsub("Y", "y", orders)
 
+  short_orders <- c("ym", "my")
+
+  if (any(orders %in% short_orders)) {
+    orders1 <- setdiff(orders, short_orders)
+    orders2 <- intersect(orders, short_orders)
+    orders2 <- paste0(orders2, "d")
+    orders <- unique(c(orders1, orders2))
+  }
+
+  if (any(orders == "yq")) {
+    orders1 <- setdiff(orders, "yq")
+    orders2 <- "ymd"

Review Comment:
   Again, please add a comment here.



##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -493,11 +493,29 @@ register_bindings_datetime_parsers <- function() {
     # each order is translated into possible formats
     formats <- build_formats(orders)
 
+    x <- x$cast(string())
+
     # make all separators (non-letters and non-numbers) into "-"
     x <- call_binding("gsub", "[^A-Za-z0-9]", "-", x)
     # collapse multiple separators into a single one
     x <- call_binding("gsub", "-{2,}", "-", x)
 
+    # add a day (01) for "ym" and "my" orders
+    augmented_x <- NULL
+    if (any(orders %in% c("ym", "my"))) {
+      augmented_x <- call_binding("paste0", x, "-01")
+    }
+
+    augmented_x2 <- NULL
+    if (any(orders == "yq")) {
+      quarter_x <- call_binding("gsub", "^.*?-", "", x)

Review Comment:
   Please add a brief comment explaining this regex



##########
r/R/dplyr-datetime-helpers.R:
##########
@@ -159,6 +159,21 @@ build_formats <- function(orders) {
   orders <- gsub("[^A-Za-z_]", "", orders)
   orders <- gsub("Y", "y", orders)
 
+  short_orders <- c("ym", "my")
+
+  if (any(orders %in% short_orders)) {

Review Comment:
   Please can you add a brief comment summarising what this block of code is doing?



##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -514,6 +532,36 @@ register_bindings_datetime_parsers <- function() {
       )
     }
 
+    parse_attempt_exp_augmented_x <- list()
+
+    if (!is.null(augmented_x)) {
+      for (i in seq_along(formats)) {
+        parse_attempt_expressions[[i]] <- build_expr(
+          "strptime",
+          augmented_x,
+          options = list(format = formats[[i]], unit = 0L, error_is_null = TRUE)
+        )
+      }
+    }
+
+    parse_attempt_exp_augmented_x2 <- list()
+
+    if (!is.null(augmented_x2)) {
+      for (i in seq_along(formats)) {
+        parse_attempt_expressions[[i]] <- build_expr(
+          "strptime",
+          augmented_x2,
+          options = list(format = formats[[i]], unit = 0L, error_is_null = TRUE)
+        )
+      }
+    }
+
+    parse_attempt_expressions <- c(

Review Comment:
   Brief comment here too wouldn't go amiss



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r875195162


##########
r/R/dplyr-datetime-helpers.R:
##########
@@ -159,6 +159,27 @@ build_formats <- function(orders) {
   orders <- gsub("[^A-Za-z_]", "", orders)
   orders <- gsub("Y", "y", orders)
 
+  # we need a different logic in order to deal with "ym', "my", and "yq" orders
+  # we separate them from the rest of the `orders` vector and transform them.
+  # `ym` and `yq` become `ymd` & `my` becomes `myd`
+  # this is needed because strptime does not parse "2022-05", so we add "-01",
+  # thus changing the format, and for equivalence with lubridate, which parses
+  # `ym` to the first day of the month

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] thisisnic commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
thisisnic commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r874976069


##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -1776,3 +1776,35 @@ test_that("year, month, day date/time parsers work", {
     test_df
   )
 })
+
+test_that("ym, my & yq parsers", {
+  test_df <- tibble::tibble(
+    ym_string = c("2022-05", "2022/02", "22.03", "1979//12", "88.09", NA),
+    my_string = c("05-2022", "02/2022", "03.22", "12//1979", "09.88", NA),
+    yq_string = c("2007.3", "1970.2", "2020.1", "2009.4", "1975.1", NA),
+    yq_numeric = c(2007.3, 1970.2, 2020.1, 2009.4, 1975.1, NA),
+  )

Review Comment:
   Would this also work with 
   `yq_space = c("2007 3", "1970 2", "2020 1", "2009 4", "1975 1", NA)`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r874779869


##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -514,6 +532,36 @@ register_bindings_datetime_parsers <- function() {
       )
     }
 
+    parse_attempt_exp_augmented_x <- list()
+
+    if (!is.null(augmented_x)) {
+      for (i in seq_along(formats)) {
+        parse_attempt_expressions[[i]] <- build_expr(
+          "strptime",
+          augmented_x,
+          options = list(format = formats[[i]], unit = 0L, error_is_null = TRUE)
+        )
+      }
+    }
+
+    parse_attempt_exp_augmented_x2 <- list()
+
+    if (!is.null(augmented_x2)) {
+      for (i in seq_along(formats)) {
+        parse_attempt_expressions[[i]] <- build_expr(
+          "strptime",
+          augmented_x2,
+          options = list(format = formats[[i]], unit = 0L, error_is_null = TRUE)
+        )
+      }
+    }
+
+    parse_attempt_expressions <- c(

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13163:
URL: https://github.com/apache/arrow/pull/13163#issuecomment-1127518827

   https://issues.apache.org/jira/browse/ARROW-16516


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r875206692


##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -514,6 +539,41 @@ register_bindings_datetime_parsers <- function() {
       )
     }
 
+    # build separate expression lists of parsing attempts for the orders that
+    # need an augmented `x`
+    # list for attempts when orders %in% c("ym", "my")
+    parse_attempt_exp_augmented_x <- list()
+
+    if (!is.null(augmented_x)) {
+      for (i in seq_along(formats)) {
+        parse_attempt_expressions[[i]] <- build_expr(
+          "strptime",
+          augmented_x,
+          options = list(format = formats[[i]], unit = 0L, error_is_null = TRUE)
+        )
+      }
+    }
+
+    # list for attempts when orders %in% c("yq")
+    parse_attempt_exp_augmented_x2 <- list()
+
+    if (!is.null(augmented_x2)) {
+      for (i in seq_along(formats)) {

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r874981780


##########
r/tests/testthat/test-dplyr-funcs-datetime.R:
##########
@@ -1776,3 +1776,35 @@ test_that("year, month, day date/time parsers work", {
     test_df
   )
 })
+
+test_that("ym, my & yq parsers", {
+  test_df <- tibble::tibble(
+    ym_string = c("2022-05", "2022/02", "22.03", "1979//12", "88.09", NA),
+    my_string = c("05-2022", "02/2022", "03.22", "12//1979", "09.88", NA),
+    yq_string = c("2007.3", "1970.2", "2020.1", "2009.4", "1975.1", NA),
+    yq_numeric = c(2007.3, 1970.2, 2020.1, 2009.4, 1975.1, NA),
+  )

Review Comment:
   Yep. Good catch. I've just added some more unit tests.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ursabot commented on pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
ursabot commented on PR #13163:
URL: https://github.com/apache/arrow/pull/13163#issuecomment-1130272282

   ['Python', 'R'] benchmarks have high level of regressions.
   [test-mac-arm](https://conbench.ursa.dev/compare/runs/5c515d2bbb724ce79d2514dbf35b3e55...c9e9e2ed081d427eb6e2fc91ffaeb9a6/)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r874752193


##########
r/R/dplyr-datetime-helpers.R:
##########
@@ -176,7 +191,8 @@ build_formats <- function(orders) {
   }
 
   formats_list <- map(orders, build_format_from_order)
-  purrr::flatten_chr(formats_list)
+  formats <- purrr::flatten_chr(formats_list)

Review Comment:
   😉 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] dragosmg commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
dragosmg commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r874756865


##########
r/R/dplyr-datetime-helpers.R:
##########
@@ -159,6 +159,21 @@ build_formats <- function(orders) {
   orders <- gsub("[^A-Za-z_]", "", orders)
   orders <- gsub("Y", "y", orders)
 
+  short_orders <- c("ym", "my")
+
+  if (any(orders %in% short_orders)) {

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] paleolimbot commented on a diff in pull request #13163: ARROW-16516: [R] Implement ym() my() and yq() parsers

Posted by GitBox <gi...@apache.org>.
paleolimbot commented on code in PR #13163:
URL: https://github.com/apache/arrow/pull/13163#discussion_r875112056


##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -514,6 +539,41 @@ register_bindings_datetime_parsers <- function() {
       )
     }
 
+    # build separate expression lists of parsing attempts for the orders that
+    # need an augmented `x`
+    # list for attempts when orders %in% c("ym", "my")
+    parse_attempt_exp_augmented_x <- list()
+
+    if (!is.null(augmented_x)) {
+      for (i in seq_along(formats)) {
+        parse_attempt_expressions[[i]] <- build_expr(
+          "strptime",
+          augmented_x,
+          options = list(format = formats[[i]], unit = 0L, error_is_null = TRUE)
+        )
+      }
+    }
+
+    # list for attempts when orders %in% c("yq")
+    parse_attempt_exp_augmented_x2 <- list()
+
+    if (!is.null(augmented_x2)) {
+      for (i in seq_along(formats)) {

Review Comment:
   See above...I think you probably want `lapply(seq_along)` or `imap()` to make this consistent with how we do this elsewhere.



##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -514,6 +539,41 @@ register_bindings_datetime_parsers <- function() {
       )
     }
 
+    # build separate expression lists of parsing attempts for the orders that
+    # need an augmented `x`
+    # list for attempts when orders %in% c("ym", "my")
+    parse_attempt_exp_augmented_x <- list()
+
+    if (!is.null(augmented_x)) {
+      for (i in seq_along(formats)) {

Review Comment:
   We rarely use this pattern elsewhere in the package...if you need both the index and the value you could use `imap()` since we import most of purrr.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org