Posted to issues@arrow.apache.org by "Neal Richardson (Jira)" <ji...@apache.org> on 2019/08/20 19:34:00 UTC

[jira] [Commented] (ARROW-4390) [R] Serialize "labeled" metadata in Feather files, IPC messages

    [ https://issues.apache.org/jira/browse/ARROW-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911685#comment-16911685 ] 

Neal Richardson commented on ARROW-4390:
----------------------------------------

See [https://haven.tidyverse.org/articles/semantics.html] for reference.

The naive approach would be to call {{haven::as_factor()}} on any object of class "labelled" before passing it from R to Arrow. That conversion is lossy, though, and users can already do it themselves outside the {{arrow}} package, so I'm not sure there's value in adding it here.

The best solution might be some kind of extension type that embellishes dictionary elements with additional properties. Short of creating an extension type, we might be able to do something useful with the missing data bitmask to capture the multiple types of missingness. But it's hard for me to reason about the right thing to do here until there are more aggregation methods exposed from Arrow. So I'm putting this one back down.
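
To make the concern concrete, here is a rough, language-neutral sketch in plain Python of what gets lost when a labelled column is reduced to a dictionary of labels plus a one-bit-per-value validity mask. Everything in it is an assumption made up for illustration: the {{Missing}} class, the helper functions, and the sample data are hypothetical, and none of it is the haven or arrow API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Missing:
    """Hypothetical stand-in for a tagged missing value."""
    tag: str  # e.g. "a" = refused, "b" = not applicable


# Hypothetical labelled column: integer codes, value labels, two kinds of missing.
labels = {1: "Yes", 5: "No"}
column = [1, 5, Missing("a"), 1, Missing("b")]


def to_dictionary_with_bitmap(column, labels):
    """Encode as (dictionary of labels, indices, validity flags)."""
    dictionary = list(labels.values())
    index_of = {code: i for i, code in enumerate(labels)}
    indices, validity = [], []
    for item in column:
        if isinstance(item, Missing):
            indices.append(0)       # placeholder slot, masked out below
            validity.append(False)  # one bit: missing, with no room for a tag
        else:
            indices.append(index_of[item])
            validity.append(True)
    return dictionary, indices, validity


def from_dictionary_with_bitmap(dictionary, indices, validity):
    """Decode back to labels, with None wherever the validity flag is False."""
    return [dictionary[i] if ok else None for i, ok in zip(indices, validity)]


dictionary, indices, validity = to_dictionary_with_bitmap(column, labels)
decoded = from_dictionary_with_bitmap(dictionary, indices, validity)
```

After the round trip, the original codes 1 and 5 are gone (only the labels and their positions survive), and the two differently tagged missing values come back as the same null. Keeping either piece of information would need something extra carried alongside the dictionary, which is where an extension type or a richer use of the bitmask would come in.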

> [R] Serialize "labeled" metadata in Feather files, IPC messages
> ---------------------------------------------------------------
>
>                 Key: ARROW-4390
>                 URL: https://issues.apache.org/jira/browse/ARROW-4390
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>            Reporter: Wes McKinney
>            Assignee: Neal Richardson
>            Priority: Major
>             Fix For: 0.15.0
>
>
> see https://github.com/apache/arrow/issues/3480



--
This message was sent by Atlassian Jira
(v8.3.2#803003)