Posted to jira@arrow.apache.org by "Ian Cook (Jira)" <ji...@apache.org> on 2021/05/21 20:35:00 UTC

[jira] [Created] (ARROW-12850) [R] is.nan() evaluates to null on Arrow null values

Ian Cook created ARROW-12850:
--------------------------------

             Summary: [R] is.nan() evaluates to null on Arrow null values
                 Key: ARROW-12850
                 URL: https://issues.apache.org/jira/browse/ARROW-12850
             Project: Apache Arrow
          Issue Type: Improvement
          Components: R
            Reporter: Ian Cook


{code}
> is.nan(NA_real_)
[1] FALSE

> library(arrow)
> as.vector(is.nan(Scalar$create(NA_real_)))
[1] NA
{code}
There is a discrepancy here between the {{FALSE}} result in R and the {{null}} result in Arrow (which results in {{NA_logical_}} when converted to an R vector).

I don't think the {{is.nan}} C++ kernel should change here, because this is just a quirk of R. For example, NumPy and pandas are consistent with Arrow C++:
{code}
>>> import numpy as np
>>> import pandas as pd
>>> np.isnan(pd.NA)
<NA>
{code}
We could consider adding a boolean option to the {{is.nan}} C++ kernel to control whether to consider nulls as {{NaN}}.
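
To make the two behaviors concrete, here is a minimal pure-Python sketch of what such an option might do. It is not Arrow code: the function and the {{propagate_nulls}} flag are hypothetical names chosen for illustration, nulls are modeled as {{None}}, and turning the flag off is read here as "return false for nulls", which is what R's {{is.nan(NA_real_)}} gives.

```python
import math
from typing import List, Optional


def is_nan(values: List[Optional[float]],
           propagate_nulls: bool = True) -> List[Optional[bool]]:
    """Element-wise NaN check over a list where None stands in for a null.

    propagate_nulls is a hypothetical flag (not an Arrow option):
      True  -> null in, null out (the behavior reported for Arrow above)
      False -> null in, False out (matches R's is.nan(NA_real_))
    """
    result: List[Optional[bool]] = []
    for v in values:
        if v is None:
            # Null input: either stay null or report "not NaN".
            result.append(None if propagate_nulls else False)
        else:
            result.append(math.isnan(v))
    return result


values = [1.0, float("nan"), None]
arrow_like = is_nan(values)                    # null stays null
r_like = is_nan(values, propagate_nulls=False)  # null becomes False
print(arrow_like)
print(r_like)
```

With the flag left at its default the null propagates, as in the Arrow result above. With it turned off the null maps to false, so the R binding could get the base R result without changing the kernel's default.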


