Posted to jira@arrow.apache.org by "Neal Richardson (Jira)" <ji...@apache.org> on 2021/09/28 13:18:00 UTC
[jira] [Commented] (ARROW-13588) [R] Empty character attributes not stored
[ https://issues.apache.org/jira/browse/ARROW-13588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17421380#comment-17421380 ]
Neal Richardson commented on ARROW-13588:
-----------------------------------------
Will resolve in ARROW-12871
> [R] Empty character attributes not stored
> -----------------------------------------
>
> Key: ARROW-13588
> URL: https://issues.apache.org/jira/browse/ARROW-13588
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Affects Versions: 5.0.0
> Environment: Ubuntu 20.04 R 4.1 release
> Reporter: Charlie Gao
> Assignee: Neal Richardson
> Priority: Critical
> Labels: attributes, feather
> Fix For: 6.0.0
>
>
> Date-times in the POSIXct format have a 'tzone' attribute that is set by default to "" when created. This is a length-one character vector holding an empty string, not NULL.
> However, this attribute is not stored in the Arrow feather file. When the file is read back, the original and restored data frames are not identical, as the reprex below shows.
> I assume this is not the intended behaviour. My current workaround is to check, when reading the file back, whether the tzone attribute exists and to set it to the empty string if it does not.
> To confirm, the attribute is stored correctly when it is not empty.
> Thanks.
> {code:java}
> ``` r
> dates <- as.POSIXct(c("2020-01-01", "2020-01-02", "2020-01-02"))
> attributes(dates)
> #> $class
> #> [1] "POSIXct" "POSIXt"
> #>
> #> $tzone
> #> [1] ""
> values <- c(1:3)
> original <- data.frame(dates, values)
> original
> #>        dates values
> #> 1 2020-01-01      1
> #> 2 2020-01-02      2
> #> 3 2020-01-02      3
> tempfile <- tempfile()
> arrow::write_feather(original, tempfile)
> restored <- arrow::read_feather(tempfile)
> identical(original, restored)
> #> [1] FALSE
> waldo::compare(original, restored)
> #> `attr(old$dates, 'tzone')` is a character vector ('')
> #> `attr(new$dates, 'tzone')` is absent
> unlink(tempfile)
> ```
> {code}
>
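The workaround described in the report (re-adding the empty 'tzone' attribute when it is missing after a read) could be sketched roughly as follows. This is only an illustration, not the fix tracked in ARROW-12871: `restore_empty_tzone()` is a hypothetical helper name, not part of arrow, and the round trip is simulated by dropping the attribute by hand so that the snippet runs in base R without arrow installed.

```r
# Hypothetical helper (not part of arrow): re-add the empty 'tzone' attribute
# to any POSIXct column that has come back without one.
restore_empty_tzone <- function(df) {
  for (name in names(df)) {
    col <- df[[name]]
    if (inherits(col, "POSIXct") && is.null(attr(col, "tzone"))) {
      attr(col, "tzone") <- ""
      df[[name]] <- col
    }
  }
  df
}

# Same data as in the reprex.
original <- data.frame(
  dates = as.POSIXct(c("2020-01-01", "2020-01-02", "2020-01-02")),
  values = 1:3
)

# Assumption: stand in for the write_feather()/read_feather() round trip by
# removing the attribute manually, which mimics the reported symptom.
restored <- original
attr(restored$dates, "tzone") <- NULL

before_fix <- identical(original, restored)
after_fix <- identical(original, restore_empty_tzone(restored))
```

In real use the helper would be applied to the result of `arrow::read_feather()`. Columns whose 'tzone' attribute is present and non-empty are left untouched, which matches the report's observation that non-empty values already survive the round trip.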