Posted to jira@arrow.apache.org by "Neal Richardson (Jira)" <ji...@apache.org> on 2021/07/17 18:26:00 UTC

[jira] [Resolved] (ARROW-12850) [R] is.nan() evaluates to null on Arrow null values

     [ https://issues.apache.org/jira/browse/ARROW-12850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson resolved ARROW-12850.
-------------------------------------
    Resolution: Fixed

> [R] is.nan() evaluates to null on Arrow null values
> ---------------------------------------------------
>
>                 Key: ARROW-12850
>                 URL: https://issues.apache.org/jira/browse/ARROW-12850
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 4.0.0
>            Reporter: Ian Cook
>            Assignee: Ian Cook
>            Priority: Major
>             Fix For: 5.0.0
>
>
> {code:java}
> > is.nan(NA_real_)
> [1] FALSE 
> > as.vector(is.nan(Scalar$create(NA_real_)))
> [1] NA
> {code}
> There is a discrepancy here between the {{FALSE}} result in R and the {{null}} result in Arrow (which results in {{NA_logical_}} when converted to an R vector).
> I don't think the {{is_nan}} C++ kernel should change here because this is just a quirk of R. For example, NumPy and pandas are consistent with the Arrow C++ behavior:
> {code:java}
> > np.isnan(pd.NA)
> <NA>
> {code}
> We could maybe consider adding a boolean option to the {{is_nan}} C++ kernel to control whether to consider nulls as {{NaN}}.
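
The two behaviors discussed above can be sketched in plain Python. This is only an illustration under stated assumptions, not the Arrow implementation: None stands in for an Arrow null, and both the function and its null_result parameter are hypothetical names that are not part of any Arrow API.

```python
import math

def is_nan(values, null_result=None):
    """Elementwise NaN test; None stands in for an Arrow null.

    null_result is a hypothetical knob (not an Arrow option): the value
    returned for null inputs.
    """
    return [null_result if v is None else math.isnan(v) for v in values]

data = [1.0, float("nan"), None]

# Null propagates, as the issue describes for the Arrow kernel.
arrow_like = is_nan(data)                 # [False, True, None]

# Null maps to False, as R's is.nan(NA_real_) does.
r_like = is_nan(data, null_result=False)  # [False, True, False]
```

With the default, a null input yields a null output; passing null_result=False reproduces the R result shown in the first snippet of the issue.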



--
This message was sent by Atlassian Jira
(v8.3.4#803005)