Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/10 06:03:36 UTC

[GitHub] [arrow] djnavarro edited a comment on pull request #12154: ARROW-14821: [R] Implement bindings for lubridate's floor_date, ceiling_date, and round_date

djnavarro edited a comment on pull request #12154:
URL: https://github.com/apache/arrow/pull/12154#issuecomment-1063692453


   Maybe it's easiest to illustrate with actual R code. The behaviour I need to emulate for the `change_on_boundary` argument differs depending on whether the data are stored as a datetime (e.g., POSIXct in R, timestamp in Arrow) or as a date (e.g., Date in R, date32 in Arrow):
   
   ```r
   library(lubridate)
   
   # a datetime object at the midnight boundary, and a date object with no time
   midnight_time <- as.POSIXct(strptime("2022-03-10 00:00:00", format = "%F %T"))
   timeless_date <- as.Date("2022-03-10")
   
   # ceiling date applied to the datetime
   ceiling_date(midnight_time, unit = "day", change_on_boundary = NULL)  # returns the 10th
   ceiling_date(midnight_time, unit = "day", change_on_boundary = TRUE)  # returns the 11th
   ceiling_date(midnight_time, unit = "day", change_on_boundary = FALSE) # returns the 10th
   
   # ceiling date applied to the date
   ceiling_date(timeless_date, unit = "day", change_on_boundary = NULL)  # returns the 11th
   ceiling_date(timeless_date, unit = "day", change_on_boundary = TRUE)  # returns the 11th
   ceiling_date(timeless_date, unit = "day", change_on_boundary = FALSE) # returns the 10th
   ```
   
   This pattern only affects times and dates that sit exactly on the boundary with respect to the rounding unit. For any time past midnight on the 10th, `lubridate::ceiling_date()` always returns the 11th for the datetime object, whatever the value of `change_on_boundary`.
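   
   To make the rule concrete, here is a minimal base R sketch of the boundary logic for `unit = "day"`. The helper `ceiling_day_sketch()` is hypothetical (it is not part of lubridate or arrow, and not the code in this PR), and it assumes datetimes are in UTC so that day boundaries fall on multiples of 86400 seconds:
   
   ```r
   # Hypothetical helper: ceiling to the day with an explicit boundary rule.
   # Assumes UTC datetimes; this is an illustration, not lubridate's implementation.
   ceiling_day_sketch <- function(x, change_on_boundary = NULL) {
     is_date <- inherits(x, "Date")
     # Mirror the defaults shown above: NULL acts like TRUE for a Date
     # and like FALSE for a datetime
     if (is.null(change_on_boundary)) {
       change_on_boundary <- is_date
     }
     day <- 86400
     secs <- if (is_date) as.numeric(x) * day else as.numeric(x)
     floored <- floor(secs / day) * day
     # Round up when the value is off the boundary, or when asked to
     bump <- (secs != floored) | change_on_boundary
     out <- floored + ifelse(bump, day, 0)
     if (is_date) {
       as.Date(out / day, origin = "1970-01-01")
     } else {
       as.POSIXct(out, origin = "1970-01-01", tz = "UTC")
     }
   }
   
   midnight <- as.POSIXct("2022-03-10 00:00:00", tz = "UTC")
   past_midnight <- as.POSIXct("2022-03-10 00:00:01", tz = "UTC")
   plain_date <- as.Date("2022-03-10")
   
   ceiling_day_sketch(midnight)                             # the 10th
   ceiling_day_sketch(midnight, change_on_boundary = TRUE)  # the 11th
   ceiling_day_sketch(past_midnight)                        # the 11th
   ceiling_day_sketch(plain_date)                           # the 11th
   ceiling_day_sketch(plain_date, change_on_boundary = FALSE) # the 10th
   ```
   
   The only values whose result depends on `change_on_boundary` are those where the floor equals the input, which is why a Date (always on a day boundary) is affected every time.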
   
   Currently, the `ceil_temporal` C++ function would return the 10th for the analogous cases. ~~If I can work out how to add +1 time unit to the boundary cases then I have no problems exactly mimicking lubridate behaviour. I can get this to work for timestamp objects. This code works, for instance, and adds 1 day to the Arrow timestamp that corresponds to `midnight_time`:~~
   
   ```r
   library(arrow)
   
   # add a 24-hour duration to the Arrow timestamp scalar
   call_function("add_checked", Scalar$create(midnight_time), Scalar$create(as.difftime("24:00:00")))
   ```
   
   ~~When I try the same thing for the date, however, it throws this error:~~
   
   ```
   Error: NotImplemented: Function 'add_checked' has no kernel matching input types (scalar[date32[day]], scalar[duration[s]])
   ```
   
   ~~That seems odd to me and I'm not sure why I'm getting that error because I'm pretty sure you implemented this in [ARROW-14901](https://github.com/apache/arrow/pull/12377). So I must be doing something wrong?~~
   
   EDIT: Never mind, I worked it out. 🤦🏻‍♀️  I can handle the `change_on_boundary` behaviour on the R side! 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
