Posted to issues@arrow.apache.org by "Antoine Pitrou (JIRA)" <ji...@apache.org> on 2019/07/29 18:38:00 UTC

[jira] [Comment Edited] (ARROW-6043) [Python] Array equals returns incorrectly if NaNs are in arrays

    [ https://issues.apache.org/jira/browse/ARROW-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895517#comment-16895517 ] 

Antoine Pitrou edited comment on ARROW-6043 at 7/29/19 6:37 PM:
----------------------------------------------------------------

Actually, if you look on the C++ side (see `cpp/src/arrow/compare.h`), the equality comparisons now take an optional `EqualOptions` where you can choose to set `nans_equal` to true.

So I would suggest exposing those options in Python as well, perhaps the same way it is done for CSV (see `_csv.pyx`).
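
To illustrate what such an option would mean, here is a minimal sketch in plain Python. It is not pyarrow code: `values_equal` and `sequences_equal` are hypothetical helpers, `None` stands in for a null slot, and the Python-level `nans_equal` keyword (with an assumed default of false, matching the usual IEEE 754 rule that NaN != NaN) is only an assumption about how the option could look.

```python
import math


def values_equal(a, b, nans_equal=False):
    """Compare two scalars; None stands in for a null slot (hypothetical helper)."""
    # Two nulls compare equal; a null never equals a non-null value.
    if a is None or b is None:
        return a is None and b is None
    # Two NaNs compare equal only when the caller opts in.
    if (isinstance(a, float) and isinstance(b, float)
            and math.isnan(a) and math.isnan(b)):
        return nans_equal
    return a == b


def sequences_equal(left, right, nans_equal=False):
    """Element-wise equality with an opt-in NaN == NaN rule (hypothetical helper)."""
    return len(left) == len(right) and all(
        values_equal(a, b, nans_equal) for a, b in zip(left, right)
    )


data = [0.0, 1.0, float("nan"), None, 4.0]
print(sequences_equal(data, list(data)))                   # False
print(sequences_equal(data, list(data), nans_equal=True))  # True
```

Keeping the default at false would preserve the current result for callers who pass nothing, while those who want NaNs to compare equal could opt in explicitly.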


> [Python] Array equals returns incorrectly if NaNs are in arrays
> ---------------------------------------------------------------
>
>                 Key: ARROW-6043
>                 URL: https://issues.apache.org/jira/browse/ARROW-6043
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.14.1
>            Reporter: Keith Kraus
>            Priority: Major
>             Fix For: 1.0.0
>
>
> {code:python}
> import numpy as np
> import pyarrow as pa
> data = [0, 1, np.nan, None, 4]
> arr1 = pa.array(data)
> arr2 = pa.array(data)
> pa.Array.equals(arr1, arr2)
> {code}
> Unsure if this is expected behavior, but in Arrow 0.12.1 this returned `True`, whereas in 0.14.1 it returns `False`.


