Posted to jira@arrow.apache.org by "Dragoș Moldovan-Grünfeld (Jira)" <ji...@apache.org> on 2022/04/25 13:27:00 UTC

[jira] [Commented] (ARROW-16316) [R] How to round the timestamps in a mutate statement?

    [ https://issues.apache.org/jira/browse/ARROW-16316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17527494#comment-17527494 ] 

Dragoș Moldovan-Grünfeld commented on ARROW-16316:
--------------------------------------------------

Hi [~kbzsl] and thanks for opening this ticket. I'll try to reply to your questions:
 * you are right, we don't yet have a comprehensive list of the {{lubridate}}, {{dplyr}} and other {{tidyverse}} verbs supported in arrow (other than the [NEWS|https://github.com/apache/arrow/blob/master/r/NEWS.md]). It is definitely something that would be useful to have. I've just opened ARROW-16319 and I hope to cover that in time for release 9.0.0 (the release following the imminent one).
 * support for {{lubridate}} functionality in {{arrow}} is under active development - see ARROW-15163 and ARROW-16155. However, we haven't been able to close the {{floor_date}} ticket yet, and it might not get solved in time for the 8.0.0 release. {{floor_date}} turned out to be quite complex and there are currently 2 pull requests ([PR 12657|https://github.com/apache/arrow/pull/12657] and [PR 12154|https://github.com/apache/arrow/pull/12154]) under review, tackling different aspects of the functionality. Once these PRs get merged, {{floor_date}} will be available in the development version of the package (hopefully soon).
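
Until those PRs are merged, one possible workaround is to pull the data into R with {{collect()}} and apply {{lubridate::floor_date()}} there. This is only a sketch: it assumes the data (or a filtered subset of it) fits in memory, and the small in-memory table below is a hypothetical stand-in for the dataset in the ticket.

```r
library(arrow)
library(dplyr)
library(lubridate)

# Hypothetical stand-in for the dataset in the ticket:
# a single timestamp column in UTC.
new_dataset <- Table$create(
  time = as.POSIXct(
    c("2022-04-25 11:44:33", "2022-04-25 23:59:59", "2022-04-26 00:00:01"),
    tz = "UTC"
  )
)

# Pull the data into R first, then round down with lubridate.
# Assumption: the collected data fits in memory.
rounded <- new_dataset |>
  collect() |>
  mutate(day = floor_date(time, unit = "day"))

# Aggregate at the coarser (daily) granularity.
daily_counts <- rounded |>
  count(day)
```

For larger datasets it may help to {{filter()}} and {{select()}} before {{collect()}}, so that only the rows and columns actually needed are brought into R.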

> [R] How to round the timestamps in a mutate statement?
> ------------------------------------------------------
>
>                 Key: ARROW-16316
>                 URL: https://issues.apache.org/jira/browse/ARROW-16316
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: R
>    Affects Versions: 7.0.0
>            Reporter: Zsolt Kegyes-Brassai
>            Priority: Minor
>
> I was trying to aggregate over time using different granularity. Usually I would use {{lubridate::floor_date()}}, which is currently not supported for parquet datasets.
> Is there a comprehensive list of the currently supported {{lubridate}} (or {{dplyr}}) verbs? Maybe it's just me, but apart from the changelog I haven't found any relevant information.
>  
> Later I found that the {{round_temporal()}} function is exposed to {{R}}.
> But I am struggling to find the right syntax to apply it, inside a mutate statement, to a column of type {{timestamp[us, tz=UTC]}}.
> {code:java}
> new_dataset |>
>   mutate(time = arrow_round_temporal(time))
> #>  Error: Invalid: Attempted to initialize KernelState from null FunctionOptions
> {code}
>  
> Here are some other attempts:
> {code:java}
> library(arrow)
> arrow_now <- Scalar$create(lubridate::now())
> (arrow_now)
> #> Scalar
> #> 2022-04-25 11:44:33.805609
> call_function("round_temporal", arrow_now)
> #> Scalar
> #> 2022-04-25 00:00:00.000000
> call_function("round_temporal", arrow_now, unit = "day")
> #> Error: Argument 2 is of class character but it must be one of "Array", "ChunkedArray", "RecordBatch", "Table", or "Scalar"
> arrow_unit <- Scalar$create("day")
> (arrow_unit)
> #> Scalar
> #> day
> call_function("round_temporal", arrow_now, unit = arrow_unit)
> #> Error: Invalid: Function 'round_temporal' accepts 1 arguments but attempted to look up kernel(s) with 2
> {code}
>  


