Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/05 17:00:51 UTC

[GitHub] [arrow] thisisnic commented on a diff in pull request #12757: ARROW-14944 [R] Implement `lubridate::make_difftime()`

thisisnic commented on code in PR #12757:
URL: https://github.com/apache/arrow/pull/12757#discussion_r843057311


##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -330,3 +349,33 @@ binding_format_datetime <- function(x, format = "", tz = "", usetz = FALSE) {
 
   build_expr("strftime", x, options = list(format = format, locale = Sys.getlocale("LC_TIME")))
 }
+
+duration_from_chunks <- function(chunks) {
+  accepted_chunks <- c("second", "minute", "hour", "day", "week")
+  matched_chunks <- accepted_chunks[pmatch(names(chunks), accepted_chunks, duplicates.ok = TRUE)]
+
+  if (any(is.na(matched_chunks))) {
+    abort(
+      paste0(
+        "Invalid `difftime` parts: ",

Review Comment:
   How about listing the accepted units here before telling the user which of theirs are not accepted? Perhaps something like "named `difftime` units must be one or more of ..."
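
   As a rough sketch of the message shape I have in mind: this is plain base R so it runs on its own, with `stop()` standing in for `abort()` and `paste0()` for `oxford_paste()`. `check_chunk_names()` is a made-up name for illustration, not something in the PR.

   ```r
   # Hypothetical helper, for illustration only (not part of the PR).
   check_chunk_names <- function(chunk_names) {
     accepted_chunks <- c("second", "minute", "hour", "day", "week")
     # pmatch() accepts partial names such as "min"; unmatched names give NA
     matched <- pmatch(chunk_names, accepted_chunks, duplicates.ok = TRUE)
     if (anyNA(matched)) {
       stop(
         "named `difftime` units must be one or more of ",
         paste0("`", accepted_chunks, "`", collapse = ", "),
         "; invalid: ",
         paste0("`", chunk_names[is.na(matched)], "`", collapse = ", "),
         call. = FALSE
       )
     }
     # Return the full unit names that the supplied names resolved to
     accepted_chunks[matched]
   }
   ```

   That way the user sees the valid options first and their own invalid names second, in a single message.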



##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -330,3 +349,33 @@ binding_format_datetime <- function(x, format = "", tz = "", usetz = FALSE) {
 
   build_expr("strftime", x, options = list(format = format, locale = Sys.getlocale("LC_TIME")))
 }
+
+duration_from_chunks <- function(chunks) {
+  accepted_chunks <- c("second", "minute", "hour", "day", "week")
+  matched_chunks <- accepted_chunks[pmatch(names(chunks), accepted_chunks, duplicates.ok = TRUE)]
+
+  if (any(is.na(matched_chunks))) {
+    abort(
+      paste0(
+        "Invalid `difftime` parts: ",
+        oxford_paste(names(chunks[is.na(matched_chunks)]), quote_symbol = "`")
+      )
+    )
+  }
+
+  matched_chunks <- matched_chunks[!is.na(matched_chunks)]
+
+  chunks <- chunks[matched_chunks]
+  chunk_duration <- c(
+    "second" = 1L,
+    "minute" = 60L,
+    "hour" = 3600L,
+    "day" = 86400L,
+    "week" = 604800L
+  )
+  duration <- 0L
+  for (chunk in names(chunks)) {
+    duration <- duration + chunks[[chunk]] * chunk_duration[[chunk]]
+  }
+  duration

Review Comment:
   How about something like:
   ```
     chunks_total <- purrr::imap(chunks, ~.x * chunk_duration[[.y]]) %>%
       purrr::reduce(`+`)
   ```
   and then add that to `duration`.  In fact, since `duration` starts at `0L`, do we still need the `duration` variable at all?
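
   For illustration, here is the same fold in base R, with plain numbers standing in for whatever the binding actually receives. `Map()` and `Reduce()` replace `purrr::imap()` and `purrr::reduce()` only so the snippet has no dependencies, and the input values are made up.

   ```r
   chunk_duration <- c(second = 1L, minute = 60L, hour = 3600L, day = 86400L, week = 604800L)
   # Example input: 1 hour, 2 minutes, 30 seconds (values invented for the sketch)
   chunks <- list(hour = 1, minute = 2, second = 30)

   # Scale each chunk to seconds, then fold the pieces together with `+`
   scaled <- Map(function(value, unit) value * chunk_duration[[unit]], chunks, names(chunks))
   chunks_total <- Reduce(`+`, scaled)
   ```

   If this shape works, `chunks_total` could simply be the return value.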



##########
r/R/dplyr-funcs-datetime.R:
##########
@@ -330,3 +349,33 @@ binding_format_datetime <- function(x, format = "", tz = "", usetz = FALSE) {
 
   build_expr("strftime", x, options = list(format = format, locale = Sys.getlocale("LC_TIME")))
 }
+
+duration_from_chunks <- function(chunks) {

Review Comment:
   Mind adding a comment above this (it doesn't have to be a roxygen header) briefly explaining what it does?  From skimming the `lubridate` functionality and the Arrow docs, I think this function exists because `lubridate` lets us specify durations in several units (e.g. second, minute, hour) whereas Arrow just stores them as seconds. Is that right?
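
   If that reading is right, the comment could be as short as the one in this sketch. The wording is only my guess at the intent, and the body is a simplified plain-R stand-in with a made-up name, not the PR code.

   ```r
   # Collapse named time chunks (second, minute, hour, day, week) into a single
   # number of seconds, e.g. list(minute = 2, second = 30) -> 150.
   # Assumed rationale: the result carries one unit only, so mixed units have
   # to be converted to seconds before they are added up.
   duration_from_chunks_sketch <- function(chunks) {
     chunk_duration <- c(second = 1, minute = 60, hour = 3600, day = 86400, week = 604800)
     # Look up the multiplier for each chunk by name, scale, and total
     sum(unlist(chunks) * chunk_duration[names(chunks)])
   }
   ```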


