Posted to issues@arrow.apache.org by "Andy Grove (Jira)" <ji...@apache.org> on 2020/04/23 18:57:00 UTC
[jira] [Resolved] (ARROW-8508) [Rust] ListBuilder of FixedSizeListBuilder creates wrong offsets
[ https://issues.apache.org/jira/browse/ARROW-8508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andy Grove resolved ARROW-8508.
-------------------------------
Fix Version/s: 1.0.0
Resolution: Fixed
Issue resolved by pull request 7006
[https://github.com/apache/arrow/pull/7006]
> [Rust] ListBuilder of FixedSizeListBuilder creates wrong offsets
> ----------------------------------------------------------------
>
> Key: ARROW-8508
> URL: https://issues.apache.org/jira/browse/ARROW-8508
> Project: Apache Arrow
> Issue Type: Bug
> Components: Rust
> Affects Versions: 0.16.0
> Reporter: Christian Beilschmidt
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> I created an example of storing multi points with Arrow.
> # A coordinate consists of two floats (Float64Builder)
> # A multi point consists of one or more coordinates (FixedSizeListBuilder)
> # A list of multi points consists of multiple multi points (ListBuilder)
> This is the corresponding code snippet:
> {code:java}
> // Builders: Float64 values -> fixed-size lists of 2 (coordinates) -> lists (multi points)
> let float_builder = arrow::array::Float64Builder::new(0);
> let coordinate_builder = arrow::array::FixedSizeListBuilder::new(float_builder, 2);
> let mut multi_point_builder = arrow::array::ListBuilder::new(coordinate_builder);
> multi_point_builder
>     .values()
>     .values()
>     .append_slice(&[0.0, 0.1])
>     .unwrap();
> multi_point_builder.values().append(true).unwrap();
> multi_point_builder
>     .values()
>     .values()
>     .append_slice(&[1.0, 1.1])
>     .unwrap();
> multi_point_builder.values().append(true).unwrap();
> multi_point_builder.append(true).unwrap(); // first multi point
> multi_point_builder
>     .values()
>     .values()
>     .append_slice(&[2.0, 2.1])
>     .unwrap();
> multi_point_builder.values().append(true).unwrap();
> multi_point_builder
>     .values()
>     .values()
>     .append_slice(&[3.0, 3.1])
>     .unwrap();
> multi_point_builder.values().append(true).unwrap();
> multi_point_builder
>     .values()
>     .values()
>     .append_slice(&[4.0, 4.1])
>     .unwrap();
> multi_point_builder.values().append(true).unwrap();
> multi_point_builder.append(true).unwrap(); // second multi point
> let multi_point = dbg!(multi_point_builder.finish());
> // First multi point: expect 2 coordinates (4 floats)
> let first_multi_point_ref = multi_point.value(0);
> let first_multi_point: &arrow::array::FixedSizeListArray = first_multi_point_ref.as_any().downcast_ref().unwrap();
> let coordinates_ref = first_multi_point.values();
> let coordinates: &arrow::array::Float64Array = coordinates_ref.as_any().downcast_ref().unwrap();
> assert_eq!(coordinates.value_slice(0, 2 * 2), &[0.0, 0.1, 1.0, 1.1]);
> // Second multi point: expect 3 coordinates (6 floats)
> let second_multi_point_ref = multi_point.value(1);
> let second_multi_point: &arrow::array::FixedSizeListArray = second_multi_point_ref.as_any().downcast_ref().unwrap();
> let coordinates_ref = second_multi_point.values();
> let coordinates: &arrow::array::Float64Array = coordinates_ref.as_any().downcast_ref().unwrap();
> assert_eq!(coordinates.value_slice(0, 2 * 3), &[2.0, 2.1, 3.0, 3.1, 4.0, 4.1]);
> {code}
> The second assertion fails and the output is {{[0.0, 0.1, 1.0, 1.1, 2.0, 2.1]}}.
> Moreover, the debug output produced from {{dbg!}} confirms this:
> {noformat}
> [
>   FixedSizeListArray<2>
>   [
>     PrimitiveArray<Float64>
>     [
>       0.0,
>       0.1,
>     ],
>     PrimitiveArray<Float64>
>     [
>       1.0,
>       1.1,
>     ],
>   ],
>   FixedSizeListArray<2>
>   [
>     PrimitiveArray<Float64>
>     [
>       0.0,
>       0.1,
>     ],
>     PrimitiveArray<Float64>
>     [
>       1.0,
>       1.1,
>     ],
>     PrimitiveArray<Float64>
>     [
>       2.0,
>       2.1,
>     ],
>   ],
> ]{noformat}
> The second list should instead contain the coordinates {{[2.0, 2.1]}} through {{[4.0, 4.1]}}.
>
> So either I am using the builder incorrectly or there is a bug in the offsets. I tested with {{0.16}} as well as the current {{master}} from GitHub.
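
The offset arithmetic that the report expects can be sketched in plain Rust, without any Arrow types. This is only an illustration under stated assumptions: the outer list offsets are taken to be counted in coordinates, each coordinate is taken to occupy two consecutive floats in one flat buffer, and the names `COORD_WIDTH` and `multi_point_values` are hypothetical and not part of the arrow crate.

```rust
/// Number of f64 values per coordinate (the fixed-size list width).
/// Hypothetical constant for this sketch; not an arrow crate item.
const COORD_WIDTH: usize = 2;

/// Returns the flat float values that belong to list element `i`.
/// `offsets` are the outer list offsets, counted in coordinates;
/// `values` is the flat buffer holding all floats back to back.
fn multi_point_values<'a>(offsets: &[usize], values: &'a [f64], i: usize) -> &'a [f64] {
    // Scale the coordinate offsets by the fixed-size list width to get float indices.
    let start = offsets[i] * COORD_WIDTH;
    let end = offsets[i + 1] * COORD_WIDTH;
    &values[start..end]
}

fn main() {
    // The ten floats appended in the report, in order.
    let values: [f64; 10] = [0.0, 0.1, 1.0, 1.1, 2.0, 2.1, 3.0, 3.1, 4.0, 4.1];
    // Two multi points: coordinates 0..2 and coordinates 2..5.
    let offsets: [usize; 3] = [0, 2, 5];

    // Expected contents of each multi point.
    assert_eq!(multi_point_values(&offsets, &values, 0), &[0.0, 0.1, 1.0, 1.1]);
    assert_eq!(
        multi_point_values(&offsets, &values, 1),
        &[2.0, 2.1, 3.0, 3.1, 4.0, 4.1]
    );

    // Taking the right length (3 coordinates) but starting at float index 0
    // yields the values the report observed for the second multi point.
    let len = (offsets[2] - offsets[1]) * COORD_WIDTH;
    assert_eq!(&values[..len], &[0.0, 0.1, 1.0, 1.1, 2.0, 2.1]);
}
```

The last assertion shows that the reported output is what one gets when the second element has the correct length but its start offset into the child values is not applied. That matches the symptom described above; it says nothing about where in the library the offset was lost.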