Posted to jira@arrow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/10/06 21:07:00 UTC

[jira] [Updated] (ARROW-17956) [C++] RandomArrayGenerator does not properly generate ListArrays with Nulls

     [ https://issues.apache.org/jira/browse/ARROW-17956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-17956:
-----------------------------------
    Labels: pull-request-available  (was: )

> [C++] RandomArrayGenerator does not properly generate ListArrays with Nulls
> ---------------------------------------------------------------------------
>
>                 Key: ARROW-17956
>                 URL: https://issues.apache.org/jira/browse/ARROW-17956
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Tobias Zagorni
>            Assignee: Tobias Zagorni
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> There are multiple problems with the {{OffsetsFromLengthsArray}} method:
>  * The method assumes that the first and last length values in the input are never null. That does not hold for its use in GENERATE_LIST_CASE, where the input is generated randomly according to null_probability: [https://github.com/apache/arrow/blob/ed36fcd218d381bd7420f1b762a28c5feea4665f/cpp/src/arrow/testing/random.cc#L730]
>  * The SetBit call for non-null items is off by one. The index variable represents the index of the next offset, which is based on the current element's length, but the validity bit should still be set for the current element.
>  * I don't see what effect the {{force_empty_nulls}} argument should have. I think the desired effect, that null items also have zero length, is always given by how the method is implemented. Please correct me if I'm wrong.
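
The indexing described in the points above can be sketched in plain C++17 without any Arrow types. This is only an illustration, not the code in random.cc or the proposed fix: ListLayout and OffsetsFromLengths are hypothetical names, and std::optional stands in for a nullable length. It assumes the intended behaviour is that element i owns validity bit i, that offsets[i + 1] is the running sum after element i, and that a null at any position (including first or last) contributes zero length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for the offsets/validity pair of a list array.
// This is not an Arrow type.
struct ListLayout {
  std::vector<int32_t> offsets;  // n + 1 entries, offsets[0] == 0
  std::vector<bool> validity;    // n entries, true means non-null
};

// Hypothetical helper (not Arrow's OffsetsFromLengthsArray): builds offsets
// and validity from nullable lengths. std::nullopt represents a null length.
ListLayout OffsetsFromLengths(
    const std::vector<std::optional<int32_t>>& lengths) {
  ListLayout out;
  out.offsets.reserve(lengths.size() + 1);
  out.validity.reserve(lengths.size());

  int32_t next_offset = 0;
  out.offsets.push_back(next_offset);  // offsets[0]

  for (std::size_t i = 0; i < lengths.size(); ++i) {
    const bool valid = lengths[i].has_value();
    if (valid) {
      assert(*lengths[i] >= 0);
      next_offset += *lengths[i];
    }
    // The validity entry belongs to the current element i, even though the
    // offset written next lives at index i + 1.
    out.validity.push_back(valid);
    // A null element leaves next_offset unchanged, so it has zero length
    // regardless of where it appears in the input.
    out.offsets.push_back(next_offset);
  }
  return out;
}
```

With lengths {null, 2, null, 3, null}, this yields offsets {0, 0, 2, 2, 5, 5} and validity {false, true, false, true, false}. Nulls at the first and last positions need no special case, and every null item ends up with zero length without a separate flag.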



--
This message was sent by Atlassian Jira
(v8.20.10#820010)