Posted to issues@arrow.apache.org by "Mauricio 'Pachá' Vargas Sepúlveda (Jira)" <ji...@apache.org> on 2021/06/07 16:18:00 UTC

[jira] [Created] (ARROW-12994) [R] stringr tests: 4 hours of difference between arrow and strptime

Mauricio 'Pachá' Vargas Sepúlveda created ARROW-12994:
---------------------------------------------------------

             Summary: [R] stringr tests: 4 hours of difference between arrow and strptime
                 Key: ARROW-12994
                 URL: https://issues.apache.org/jira/browse/ARROW-12994
             Project: Apache Arrow
          Issue Type: Task
          Components: R
    Affects Versions: 4.0.1
            Reporter: Mauricio 'Pachá' Vargas Sepúlveda


Here's a problem I found while triaging tickets.

This was run locally after merging from apache/arrow at commit 8773b9d and rebuilding both the Arrow library and the Arrow R package.

``` r
library(arrow)
#> See arrow_info() for available features
#> 
#> Attaching package: 'arrow'
#> The following object is masked from 'package:utils':
#> 
#>     timestamp
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(testthat)
#> 
#> Attaching package: 'testthat'
#> The following object is masked from 'package:dplyr':
#> 
#>     matches
#> The following object is masked from 'package:arrow':
#> 
#>     matches

tstring <- tibble(x = c("08-05-2008", NA))
tstamp <- tibble(x = c(strptime("08-05-2008", format = "%m-%d-%Y"), NA))

expect_equal(
  tstring %>%
    Table$create() %>%
    mutate(
      x = strptime(x, format = "%m-%d-%Y")
    ) %>%
    collect(),
  tstamp,
  check.tzone = FALSE
)
#> Error: `%>%`(...) not equal to `tstamp`.
#> Component "x": Mean absolute difference: 14400
```

Removing the expectation shows that the two results differ by exactly 4 hours (the 14400 seconds reported above):

``` r
library(arrow)
#> See arrow_info() for available features
#> 
#> Attaching package: 'arrow'
#> The following object is masked from 'package:utils':
#> 
#>     timestamp
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(testthat)
#> 
#> Attaching package: 'testthat'
#> The following object is masked from 'package:dplyr':
#> 
#>     matches
#> The following object is masked from 'package:arrow':
#> 
#>     matches

tstring <- tibble(x = c("08-05-2008", NA))
tstamp <- tibble(x = c(strptime("08-05-2008", format = "%m-%d-%Y"), NA))

tstring %>%
  Table$create() %>%
  mutate(
    x = strptime(x, format = "%m-%d-%Y")
  ) %>%
  collect()
#> # A tibble: 2 x 1
#>   x                  
#>   <dttm>             
#> 1 2008-08-04 20:00:00
#> 2 NA

tstamp
#> # A tibble: 2 x 1
#>   x                  
#>   <dttm>             
#> 1 2008-08-05 00:00:00
#> 2 NA
```

<sup>Created on 2021-06-07 by the [reprex package](https://reprex.tidyverse.org) (v2.0.0)</sup>
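
One possible reading of the 4-hour gap, not confirmed in this report: base `strptime()` interprets the string in the session's local time zone, while the Arrow result looks like midnight UTC printed in local time. The base R sketch below reproduces the same numbers under the assumption that the local zone is at UTC-4 on that date. `America/New_York` is only a stand-in, since the actual zone of the machine is not stated here, and the variable names are illustrative.

``` r
# Assumption: the session's zone is UTC-4 on 2008-08-05;
# "America/New_York" is a stand-in, not the reporter's known zone.
local_tz <- "America/New_York"

# Midnight parsed as local time (base strptime() uses the local zone by default).
parsed_local <- as.POSIXct(strptime("08-05-2008", format = "%m-%d-%Y", tz = local_tz))

# Midnight parsed as UTC (hypothesised to correspond to the Arrow result).
parsed_utc <- as.POSIXct(strptime("08-05-2008", format = "%m-%d-%Y", tz = "UTC"))

# Gap between the two instants, in seconds.
gap_secs <- as.numeric(difftime(parsed_local, parsed_utc, units = "secs"))
gap_secs
#> [1] 14400

# The UTC instant rendered in the assumed local zone.
utc_shown_locally <- format(parsed_utc, tz = local_tz)
utc_shown_locally
#> [1] "2008-08-04 20:00:00"
```

If this reading holds, the mismatch would be a time zone interpretation difference rather than a parsing error, which would also explain why `check.tzone = FALSE` does not help: that argument ignores the `tzone` attribute but still compares the underlying instants.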



