Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2022/07/18 08:20:00 UTC
[jira] [Commented] (ARROW-17096) [C++] Mode kernel incorrect for boolean inputs
[ https://issues.apache.org/jira/browse/ARROW-17096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17567877#comment-17567877 ]
Joris Van den Bossche commented on ARROW-17096:
-----------------------------------------------
bq. Fiddling with the buffer directly, it looks like pyarrow is treating the buffer as a bitmap (one bit per value), not one byte per value like the C++ compute kernel.
Isn't that then a bug in C++? I thought a BooleanArray is expected to use one bit per value. (I wanted to point to the format docs to prove this, but they don't say it very explicitly: it is only mentioned in a side note in the first paragraph of https://arrow.apache.org/docs/format/Columnar.html#fixed-size-primitive-layout, and Schema.fbs doesn't mention it either.)
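To illustrate the difference between the two interpretations, here is a minimal standard-library sketch. It is not Arrow's implementation: the helper names (pack_bits, read_as_bits, read_as_bytes) are hypothetical, and the zero padding in the byte-wise reader is an assumption made only so that the wrong reader has enough bytes to consume. It packs booleans one bit per value, least-significant bit first, and then reads the same buffer back both ways.

```python
def pack_bits(values):
    """Pack booleans into bytes, one bit per value, least-significant bit first."""
    out = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def read_as_bits(buf, length):
    """Read a bit-packed buffer correctly: value i lives in bit (i % 8) of byte (i // 8)."""
    return [bool((buf[i // 8] >> (i % 8)) & 1) for i in range(length)]


def read_as_bytes(buf, length):
    """Deliberately wrong reader: treats every byte as one boolean value.

    The buffer is zero-padded to `length` bytes (an assumption for this sketch).
    """
    padded = buf.ljust(length, b"\x00")
    return [bool(b) for b in padded[:length]]


values = [True, False, True, True]
buf = pack_bits(values)

print(buf)                    # the four values fit in a single byte
print(read_as_bits(buf, 4))   # round-trips to the original values
print(read_as_bytes(buf, 4))  # only the first "value" is non-zero
```

With a one-bit-per-value buffer, a reader that assumes one byte per value sees different values and therefore different counts, which is the kind of mismatch discussed above.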
> [C++] Mode kernel incorrect for boolean inputs
> ----------------------------------------------
>
> Key: ARROW-17096
> URL: https://issues.apache.org/jira/browse/ARROW-17096
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++, Python
> Affects Versions: 8.0.0
> Reporter: Matthew Roeschke
> Assignee: Yibo Cai
> Priority: Major
>
> {code:java}
> In [1]: import pyarrow.compute as pc
> In [2]: import pyarrow as pa
> In [3]: pa.__version__
> Out[3]: '8.0.0'
> In [4]: pc.mode(pa.array([True, True]))
> # Correct
> Out[4]:
> <pyarrow.lib.StructArray object at 0x1266d5c60>
> -- is_valid: all not null
> -- child 0 type: bool
> [
> true
> ]
> -- child 1 type: int64
> [
> 2
> ]
> # Incorrect
> In [5]: pc.mode(pa.array([True, False]), 2)
> Out[5]:
> <pyarrow.lib.StructArray object at 0x1262110c0>
> -- is_valid: all not null
> -- child 0 type: bool
> [
> false, # should be true
> false
> ]
> -- child 1 type: int64
> [
> 1,
> 1
> ] {code}