Posted to issues@arrow.apache.org by "Mauricio 'Pachá' Vargas Sepúlveda (Jira)" <ji...@apache.org> on 2021/06/07 16:18:00 UTC
[jira] [Created] (ARROW-12994) [R] stringr tests: 4 hours of difference between arrow and strptime
Mauricio 'Pachá' Vargas Sepúlveda created ARROW-12994:
---------------------------------------------------------
Summary: [R] stringr tests: 4 hours of difference between arrow and strptime
Key: ARROW-12994
URL: https://issues.apache.org/jira/browse/ARROW-12994
Project: Apache Arrow
Issue Type: Task
Components: R
Affects Versions: 4.0.1
Reporter: Mauricio 'Pachá' Vargas Sepúlveda
Here's the problem I detected while triaging tickets.
This was run locally after merging from apache/arrow at commit 8773b9d and rebuilding both the Arrow library and the Arrow R package.
``` r
library(arrow)
#> See arrow_info() for available features
#>
#> Attaching package: 'arrow'
#> The following object is masked from 'package:utils':
#>
#> timestamp
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(testthat)
#>
#> Attaching package: 'testthat'
#> The following object is masked from 'package:dplyr':
#>
#> matches
#> The following object is masked from 'package:arrow':
#>
#> matches
tstring <- tibble(x = c("08-05-2008", NA))
tstamp <- tibble(x = c(strptime("08-05-2008", format = "%m-%d-%Y"), NA))
expect_equal(
  tstring %>%
    Table$create() %>%
    mutate(
      x = strptime(x, format = "%m-%d-%Y")
    ) %>%
    collect(),
  tstamp,
  check.tzone = FALSE
)
#> Error: `%>%`(...) not equal to `tstamp`.
#> Component "x": Mean absolute difference: 14400
```
Removing the expectation and printing both results shows that the two timestamps differ by exactly 4 hours:
``` r
library(arrow)
#> See arrow_info() for available features
#>
#> Attaching package: 'arrow'
#> The following object is masked from 'package:utils':
#>
#> timestamp
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(testthat)
#>
#> Attaching package: 'testthat'
#> The following object is masked from 'package:dplyr':
#>
#> matches
#> The following object is masked from 'package:arrow':
#>
#> matches
tstring <- tibble(x = c("08-05-2008", NA))
tstamp <- tibble(x = c(strptime("08-05-2008", format = "%m-%d-%Y"), NA))
tstring %>%
  Table$create() %>%
  mutate(
    x = strptime(x, format = "%m-%d-%Y")
  ) %>%
  collect()
#> # A tibble: 2 x 1
#>   x
#>   <dttm>
#> 1 2008-08-04 20:00:00
#> 2 NA
tstamp
#> # A tibble: 2 x 1
#>   x
#>   <dttm>
#> 1 2008-08-05 00:00:00
#> 2 NA
```
<sup>Created on 2021-06-07 by the [reprex package](https://reprex.tidyverse.org) (v2.0.0)</sup>
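A possible explanation, not confirmed in this report, is a time zone mismatch: base R's `strptime()` interprets the string in the local time zone, while the Arrow result looks like midnight UTC printed in local time. The base-R-only sketch below reproduces a 14400-second gap under that assumption. The zone `America/New_York` (UTC-4 in August) is a hypothetical stand-in for the reporter's local zone, which the report does not state.

``` r
# Assumption: the local time zone is 4 hours behind UTC on this date.
# America/New_York is only an illustrative choice (EDT in August).
local_tz <- "America/New_York"

# Parse the same string as midnight local time and as midnight UTC.
parsed_local <- as.POSIXct(strptime("08-05-2008", format = "%m-%d-%Y", tz = local_tz))
parsed_utc <- as.POSIXct(strptime("08-05-2008", format = "%m-%d-%Y", tz = "UTC"))

# The two instants are 4 hours apart, matching the reported mean absolute difference.
offset_secs <- as.numeric(difftime(parsed_local, parsed_utc, units = "secs"))
offset_secs
#> [1] 14400

# Midnight UTC shown in the local zone matches the value collected from Arrow above.
utc_shown_locally <- format(parsed_utc, "%Y-%m-%d %H:%M:%S", tz = local_tz)
utc_shown_locally
#> [1] "2008-08-04 20:00:00"
```

If this is the cause, `check.tzone = FALSE` would not help, because it only ignores the time zone attribute and the underlying instants still differ.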