Posted to jira@arrow.apache.org by "Rok Mihevc (Jira)" <ji...@apache.org> on 2022/03/24 15:53:00 UTC

[jira] [Commented] (ARROW-15659) [R] strptime should return NA (not error) with format mismatch

    [ https://issues.apache.org/jira/browse/ARROW-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511926#comment-17511926 ] 

Rok Mihevc commented on ARROW-15659:
------------------------------------

ARROW-15665 has just been resolved. We [added an error_is_null|https://github.com/apache/arrow/pull/12464/files#diff-a138b87f0da58d824c72293eab62d9f352718dcb7f41a47ea7e1c3c84bbe27dd] option to the Strptime kernel, so you probably just need to pass it from the R binding and add tests for it now.
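
As a starting point for such a test, here is a rough sketch that calls the kernel directly from R. It assumes the R bindings forward an {{error_is_null}} entry in the options list to the C++ StrptimeOptions, and that {{unit = 1L}} maps to milliseconds; both are assumptions to check against the current bindings, not confirmed behaviour.

{code:r}
library(arrow)

# The string from the issue description, which does not match the format below.
x <- Array$create("2022-02-11")

# Call the compute kernel directly. The option names mirror the C++
# StrptimeOptions fields (assumed to be exposed by the R bindings);
# unit = 1L is assumed to mean milliseconds.
parsed <- call_function(
  "strptime", x,
  options = list(format = "%Y-%m %d", unit = 1L, error_is_null = TRUE)
)

# With error_is_null = TRUE a format mismatch should produce a null,
# which surfaces in R as NA, instead of raising an error.
stopifnot(is.na(as.vector(parsed)))
{code}

If that holds, the dplyr binding could pass the same option, and the test could then compare the result with {{base::strptime()}} on the example from the issue.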

> [R] strptime should return NA (not error) with format mismatch 
> ---------------------------------------------------------------
>
>                 Key: ARROW-15659
>                 URL: https://issues.apache.org/jira/browse/ARROW-15659
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>            Reporter: Dragoș Moldovan-Grünfeld
>            Assignee: Dragoș Moldovan-Grünfeld
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{base::strptime()}} returns {{NA}} when the value passed to the {{format}} argument does not match the string to be parsed. The arrow binding currently errors in the same scenario. 
> {code:r}
> strptime("2022-02-11", format = "%Y-%m-%d")
> #> [1] "2022-02-11 GMT"
> strptime("2022-02-11", format = "%Y %m-%d")
> #> [1] NA
> {code}
> {code:r}
> suppressMessages(library(lubridate))
> suppressMessages(library(arrow))
> suppressMessages(library(dplyr))
> df <- tibble(x = "2022-02-11")
> df %>% 
>   mutate(z = strptime(x, format = "%Y-%m %d"))
> #> # A tibble: 1 × 2
> #>   x          z     
> #>   <chr>      <dttm>
> #> 1 2022-02-11 NA
> df %>% 
>   record_batch() %>% 
>   mutate(z = strptime(x, format = "%Y-%m %d")) %>% 
>   collect()
> #> Error: Invalid: Failed to parse string: '2022-02-11' as a scalar of type timestamp[ms]
> {code}


