Posted to jira@arrow.apache.org by "Tobias Zagorni (Jira)" <ji...@apache.org> on 2022/10/06 20:45:00 UTC
[jira] [Created] (ARROW-17956) [C++] RandomArrayGenerator does not properly generate ListArrays with Nulls
Tobias Zagorni created ARROW-17956:
--------------------------------------
Summary: [C++] RandomArrayGenerator does not properly generate ListArrays with Nulls
Key: ARROW-17956
URL: https://issues.apache.org/jira/browse/ARROW-17956
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Tobias Zagorni
Assignee: Tobias Zagorni
There are multiple problems with the {{OffsetsFromLengthsArray}} method:
* It assumes that the first and last length values in the input are never null. That assumption does not hold where the method is used in GENERATE_LIST_CASE: there the input is generated entirely at random and respects null_probability, so any element, including the first or the last, can be null: [https://github.com/apache/arrow/blob/ed36fcd218d381bd7420f1b762a28c5feea4665f/cpp/src/arrow/testing/random.cc#L730]
* The SetBit call for non-null items is off by one. The index variable holds the index of the next offset, which is derived from the current element's length, but the validity bit must still be set for the current element.
* I don't see what effect the {{force_empty_nulls}} argument is supposed to have. Given how the method is implemented, null items always end up with a length of zero, which I believe is the intended effect. Please correct me if I'm wrong.
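
To illustrate the indexing described above, here is a minimal sketch of the intended behaviour. It is not Arrow's implementation: {{ListLayout}} and {{OffsetsFromLengthsSketch}} are hypothetical names, plain std::vector and std::optional stand in for Arrow buffers and bitmaps, and a null length is assumed to be representable as std::nullopt. For element i it sets validity bit i and writes offset i + 1, and it never advances the offset for a null element, so nulls (including a null first or last element) come out with zero length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-in for a list array's offsets and validity.
struct ListLayout {
  std::vector<int32_t> offsets;  // n + 1 entries for n list elements
  std::vector<bool> validity;    // n entries; true means non-null
};

// Hypothetical helper (not part of Arrow): builds offsets and validity from
// per-element lengths, where std::nullopt marks a null list element.
ListLayout OffsetsFromLengthsSketch(
    const std::vector<std::optional<int32_t>>& lengths) {
  ListLayout out;
  out.offsets.reserve(lengths.size() + 1);
  out.validity.reserve(lengths.size());

  int32_t next_offset = 0;
  out.offsets.push_back(next_offset);  // offsets[0] is always 0

  for (std::size_t i = 0; i < lengths.size(); ++i) {
    const bool is_valid = lengths[i].has_value();
    if (is_valid) {
      assert(*lengths[i] >= 0);
      next_offset += *lengths[i];
    }
    // The validity bit belongs to the current element i ...
    out.validity.push_back(is_valid);
    // ... while the offset written here is offsets[i + 1]. For a null
    // element next_offset is unchanged, so its length is zero.
    out.offsets.push_back(next_offset);
  }
  return out;
}
```

With the lengths (null, 2, null, 3, null) this sketch yields the offsets 0, 0, 2, 2, 5, 5 and the validity bits 0, 1, 0, 1, 0.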