Posted to issues@arrow.apache.org by "Charlie Gao (Jira)" <ji...@apache.org> on 2021/08/09 16:15:00 UTC
[jira] [Created] (ARROW-13588) R empty character attributes not stored
Charlie Gao created ARROW-13588:
-----------------------------------
Summary: R empty character attributes not stored
Key: ARROW-13588
URL: https://issues.apache.org/jira/browse/ARROW-13588
Project: Apache Arrow
Issue Type: Bug
Components: R
Affects Versions: 5.0.0
Environment: Ubuntu 20.04 R 4.1 release
Reporter: Charlie Gao
I have come across an issue in the process of incorporating arrow in a package I develop.
Date-times in the POSIXct format have a 'tzone' attribute that, by default, is set to "" when created: a length-one character vector containing the empty string, not NULL.
However, this attribute is not stored in the Feather file written by arrow. When the file is read back, the original and restored data frames are not identical, as the reprex below shows.
I assume this is not the intended behaviour? My current workaround is to check, after reading the file back, whether the tzone attribute is missing and, if so, to set it to the empty string.
To confirm, this is not an issue when the attribute is non-empty: in that case it is stored correctly.
Thanks.
``` r
dates <- as.POSIXct(c("2020-01-01", "2020-01-02", "2020-01-02"))
attributes(dates)
#> $class
#> [1] "POSIXct" "POSIXt"
#>
#> $tzone
#> [1] ""
values <- c(1:3)
original <- data.frame(dates, values)
original
#>        dates values
#> 1 2020-01-01      1
#> 2 2020-01-02      2
#> 3 2020-01-02      3
tempfile <- tempfile()
arrow::write_feather(original, tempfile)
restored <- arrow::read_feather(tempfile)
identical(original, restored)
#> [1] FALSE
waldo::compare(original, restored)
#> `attr(old$dates, 'tzone')` is a character vector ('')
#> `attr(new$dates, 'tzone')` is absent
unlink(tempfile)
```
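For reference, the workaround described above could look roughly like the sketch below. The helper name `restore_empty_tzone` is hypothetical (it is not part of arrow), and the round trip is simulated here by dropping the attribute by hand, so the snippet runs without writing a file.

```r
# Hypothetical helper (not part of arrow): give every POSIXct column that
# lacks a 'tzone' attribute an empty-string 'tzone', as base R does by default.
restore_empty_tzone <- function(df) {
  for (col in names(df)) {
    x <- df[[col]]
    if (inherits(x, "POSIXct") && is.null(attr(x, "tzone", exact = TRUE))) {
      attr(x, "tzone") <- ""
      df[[col]] <- x
    }
  }
  df
}

# Simulate what the reprex observes after the Feather round trip:
# the 'tzone' attribute is absent on the restored column.
original <- data.frame(
  dates  = as.POSIXct(c("2020-01-01", "2020-01-02")),
  values = 1:2
)
stripped <- original
attr(stripped$dates, "tzone") <- NULL

# Apply the workaround and compare with the original.
fixed <- restore_empty_tzone(stripped)
identical(original, fixed)
```

In practice the helper would be applied to the result of `arrow::read_feather()` rather than to a hand-stripped copy. Columns that already carry a non-empty tzone are left untouched, which matches the observation that those are stored correctly.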