Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/06/14 07:42:44 UTC

[GitHub] [arrow] rok commented on a change in pull request #10507: ARROW-13022: [R] bindings for lubridate's year, isoyear, quarter, month, day, wday, yday, isoweek, minute, and second functions

rok commented on a change in pull request #10507:
URL: https://github.com/apache/arrow/pull/10507#discussion_r649895508



##########
File path: r/R/dplyr-functions.R
##########
@@ -442,3 +442,37 @@ nse_funcs$strptime <- function(x, format = "%Y-%m-%d %H:%M:%S", tz = NULL, unit
 
   Expression$create("strptime", x, options = list(format = format, unit = unit))
 }
+
+nse_funcs$wday <- function(x, label = FALSE, abbr = TRUE, week_start = getOption("lubridate.week.start", 7)) {
+  if (label) {
+    arrow_not_supported("Label argument")
+  }
+  offset <- get_date_offset(week_start)
+  Expression$create("add", Expression$create("day_of_week", x), Expression$scalar(offset))
+}
+
+#' Get date offset
+#' 
+#' Arrow's `day_of_week` kernel counts from 0 (Monday) to 6 (Sunday), whereas
+#' `lubridate::wday` counts from 1 to 7, and allows users to specify which day
+#' of the week is first (Sunday by default).  This function converts the returned

Review comment:
       We could add options to the C++ kernel to enable different behaviors there.

##########
File path: r/R/dplyr-functions.R
##########
@@ -442,3 +442,37 @@ nse_funcs$strptime <- function(x, format = "%Y-%m-%d %H:%M:%S", tz = NULL, unit
 
   Expression$create("strptime", x, options = list(format = format, unit = unit))
 }
+
+nse_funcs$wday <- function(x, label = FALSE, abbr = TRUE, week_start = getOption("lubridate.week.start", 7)) {
+  if (label) {
+    arrow_not_supported("Label argument")
+  }
+  offset <- get_date_offset(week_start)
+  Expression$create("add", Expression$create("day_of_week", x), Expression$scalar(offset))
+}
+
+#' Get date offset
+#' 
+#' Arrow's `day_of_week` kernel counts from 0 (Monday) to 6 (Sunday), whereas
+#' `lubridate::wday` counts from 1 to 7, and allows users to specify which day
+#' of the week is first (Sunday by default).  This function converts the returned

Review comment:
       If we go that way, it would probably be best to open a separate Jira for "TemporalOptions". For now, it's probably best that you proceed with the workaround, and we can loop back to this later.
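
   For reference, a plain-R sketch of the kind of conversion the workaround has to perform, based only on the numbering described in the roxygen comment above (kernel: 0 = Monday to 6 = Sunday; `lubridate::wday`: 1 to 7 with a configurable first day). The helper name `wday_from_zero_based` is hypothetical and is not the PR's `get_date_offset()`; it works on ordinary vectors, not Arrow expressions.

   ```R
   # Hypothetical helper (an assumption for illustration, not code from the PR).
   # Maps a zero-based, Monday-first day index (0 = Monday, ..., 6 = Sunday) to
   # the numbering used by lubridate::wday(), where `week_start`
   # (1 = Monday, ..., 7 = Sunday) names the day that is numbered 1.
   wday_from_zero_based <- function(dow, week_start = 7) {
     iso_day <- dow + 1                 # 1 = Monday, ..., 7 = Sunday
     (iso_day - week_start) %% 7 + 1    # wrap into 1..7 relative to week_start
   }

   wday_from_zero_based(0:6)                  # Sunday-first numbering
   wday_from_zero_based(0:6, week_start = 1)  # Monday-first numbering
   ```

   The modulo keeps the result in 1..7 for any `week_start`, which is the part an option on the C++ side could take over later.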

##########
File path: r/R/expression.R
##########
@@ -28,8 +28,17 @@
   # stringr spellings of those
   "str_length" = "utf8_length",
   "str_to_lower" = "utf8_lower",
-  "str_to_upper" = "utf8_upper"
+  "str_to_upper" = "utf8_upper",
   # str_trim is defined in dplyr.R
+  "year" = "year",
+  "isoyear" = "iso_year",
+  "quarter" = "quarter",
+  "month" = "month",
+  "day" = "day",
+  "yday" = "day_of_year",
+  "isoweek" = "iso_week",
+  "minute" = "minute",
+  "second" = "second"

Review comment:
       What about nanoseconds?
   ```R
   > second(ymd_hms("2011-06-04 12:00:01.123456789"))
   [1] 1.123457
   ```
   
   Arrow would probably return `1.123456789`.

##########
File path: r/R/expression.R
##########
@@ -28,8 +28,17 @@
   # stringr spellings of those
   "str_length" = "utf8_length",
   "str_to_lower" = "utf8_lower",
-  "str_to_upper" = "utf8_upper"
+  "str_to_upper" = "utf8_upper",
   # str_trim is defined in dplyr.R
+  "year" = "year",
+  "isoyear" = "iso_year",
+  "quarter" = "quarter",
+  "month" = "month",
+  "day" = "day",
+  "yday" = "day_of_year",
+  "isoweek" = "iso_week",
+  "minute" = "minute",
+  "second" = "second"

Review comment:
       I think so too. So probably it should be "`second = second + round(subsecond, 6)`" to match that behaviour?
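
   A minimal plain-R sketch of that suggestion, assuming the whole-second and fractional-second components are available separately. The helper name `second_with_fraction` and its arguments are hypothetical and operate on ordinary numerics, not on Arrow expressions.

   ```R
   # Hypothetical helper (an assumption for illustration, not code from the PR).
   # `second` is the whole-second component, `subsecond` the fractional part;
   # the fraction is rounded to `digits` decimal places before being added.
   second_with_fraction <- function(second, subsecond, digits = 6) {
     second + round(subsecond, digits)
   }

   second_with_fraction(1, 0.123456789)
   ```

   Rounding to 6 digits corresponds to microsecond resolution; whether that matches lubridate exactly would need checking against its stored values, since R prints only 7 significant digits by default.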




