Posted to jira@arrow.apache.org by "David Li (Jira)" <ji...@apache.org> on 2022/03/31 12:00:00 UTC

[jira] [Commented] (ARROW-16081) Incorrect results when reading a buffer of boolean values

    [ https://issues.apache.org/jira/browse/ARROW-16081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515280#comment-17515280 ] 

David Li commented on ARROW-16081:
----------------------------------

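NumPy's bool [is a byte|https://numpy.org/devdocs/user/basics.types.html] while Arrow's bool [is a bit|https://arrow.apache.org/docs/format/Columnar.html#fixed-size-primitive-layout], so the raw NumPy buffer can't be reinterpreted as an Arrow boolean buffer: Arrow reads it as a packed bitmap rather than one value per byte. Converting the data with pa.array instead of wrapping its buffer works: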
{noformat}
>>> import pyarrow as pa
>>> import numpy as np
>>> data = np.array([True, False, True, False], dtype=bool)
>>> arr = pa.array(data)
>>> arr.to_numpy(zero_copy_only=False)
array([ True, False,  True, False])
>>> data
array([ True, False,  True, False])
{noformat}

> Incorrect results when reading a buffer of boolean values
> ---------------------------------------------------------
>
>                 Key: ARROW-16081
>                 URL: https://issues.apache.org/jira/browse/ARROW-16081
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 7.0.0
>         Environment: Ubuntu 20.04, Python 3.8.10, pyarrow==7.0.0
>            Reporter: Jonathan Kenyon
>            Priority: Major
>
> The following reproducer demonstrates that a buffer of boolean values is not correctly recovered when using pyarrow.
> {code:python}
> import pyarrow.parquet as pq
> import pyarrow as pa
> import numpy as np
> if __name__ == "__main__":
>     data = np.array([True, False, True, False], dtype=bool)
>     length = len(data)
>     buf = pa.py_buffer(data)
>     array = pa.Array.from_buffers(pa.bool_(), length, [None, buf])
>     np.testing.assert_array_equal(data, array.to_numpy(zero_copy_only=False))
> {code}


