Posted to github@arrow.apache.org by "lgaborini (via GitHub)" <gi...@apache.org> on 2023/11/30 13:07:53 UTC
Re: [I] [R] arrow implementation of lubridate::dmy parses invalid date "00001976" as date [arrow]
lgaborini commented on issue #33425:
URL: https://github.com/apache/arrow/issues/33425#issuecomment-1833755499
This still reproduces with arrow 14.0.0.1 on Windows.
I have included more cases: some correctly fail to parse (returning `NA`), while others are silently converted to a valid but wrong date.
``` r
library(arrow)
df <- data.frame(
  d = c(
    "11-00-2022",
    "00-12-2022",
    "11-13-2022",
    "00-13-2022",
    "32-10-2022"
  )
)
# Base/lubridate R
df$d |> lubridate::dmy()
#> Warning: All formats failed to parse. No formats found.
#> [1] NA NA NA NA NA
df$d |> strptime("%d-%m-%Y")
#> [1] NA NA NA NA NA
df$d |> lubridate::parse_date_time("dmY")
#> Warning: All formats failed to parse. No formats found.
#> [1] NA NA NA NA NA
df$d |> lubridate::parse_date_time("dmY", truncated = 0)
#> Warning: All formats failed to parse. No formats found.
#> [1] NA NA NA NA NA
dt <- df |>
  arrow::arrow_table()
dt |> dplyr::collect()
#> # A tibble: 5 × 1
#>   d
#>   <chr>
#> 1 11-00-2022
#> 2 00-12-2022
#> 3 11-13-2022
#> 4 00-13-2022
#> 5 32-10-2022
dt |>
  dplyr::mutate(
    dt_1 = strptime(d, "%d-%m-%Y"),
    dt_2 = dmy(d),
    dt_3 = parse_date_time(d, "%d-%m-%Y", truncated = 0),
    dt_4 = parse_date_time(d, "dmY"),
  ) |>
  dplyr::collect()
#> # A tibble: 5 × 5
#>   d       dt_1                dt_2       dt_3                dt_4
#>   <chr>   <dttm>              <date>     <dttm>              <dttm>
#> 1 11-00-… 2021-12-11 00:00:00 2021-12-11 2021-12-11 00:00:00 2021-12-11 00:00:00
#> 2 00-12-… 2022-12-01 00:00:00 2022-12-01 2022-12-01 00:00:00 2022-12-01 00:00:00
#> 3 11-13-… NA                  NA         NA                  NA
#> 4 00-13-… NA                  NA         NA                  NA
#> 5 32-10-… NA                  NA         NA                  NA
arrow_table(x = '00001976') |>
  dplyr::mutate(y = dmy(x)) |>
  dplyr::collect()
#> # A tibble: 1 × 2
#>   x        y
#>   <chr>    <date>
#> 1 00001976 1975-12-01
```
<sup>Created on 2023-11-30 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>
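
Until this is fixed, one possible stopgap is to blank out strings with a zero or out-of-range day or month in plain R before the data reaches Arrow. The sketch below uses base R only. `has_plausible_dmy()` is a hypothetical helper, not part of arrow or lubridate, and it assumes two-digit day, two-digit month, and four-digit year with an optional `-` separator. It does not check month lengths (for example, it accepts `31-02-2022`).

``` r
# Hypothetical helper, not part of arrow or lubridate.
# TRUE when the day is 01-31 and the month is 01-12; "-" separators are optional.
has_plausible_dmy <- function(x) {
  grepl("^(0[1-9]|[12][0-9]|3[01])-?(0[1-9]|1[0-2])-?[0-9]{4}$", x)
}

d <- c(
  "11-00-2022",
  "00-12-2022",
  "11-13-2022",
  "00-13-2022",
  "32-10-2022",
  "00001976",
  "30-11-2023"
)

ok <- has_plausible_dmy(d)

# Replace implausible inputs with NA so any downstream parser returns NA for them.
d_clean <- ifelse(ok, d, NA_character_)
```

Only the last element passes the check. The cleaned vector can then be put into the data frame in place of the raw strings before calling `arrow_table()`.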