Posted to jira@arrow.apache.org by "Qingyou Meng (Jira)" <ji...@apache.org> on 2021/01/16 02:30:00 UTC
[jira] [Closed] (ARROW-11263) [Rust] problem of Field nullable
[ https://issues.apache.org/jira/browse/ARROW-11263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Qingyou Meng closed ARROW-11263.
--------------------------------
Resolution: Incomplete
> [Rust] problem of Field nullable
> --------------------------------
>
> Key: ARROW-11263
> URL: https://issues.apache.org/jira/browse/ARROW-11263
> Project: Apache Arrow
> Issue Type: Bug
> Reporter: Qingyou Meng
> Priority: Major
>
> Quoting from section *Schema message*
> [https://github.com/apache/arrow/blob/master/docs/source/format/Columnar.rst#schema-message]
> {noformat}
> Whether the field is semantically nullable. While this has no bearing on the
> array's physical layout, many systems distinguish nullable and non-nullable
> fields and we want to allow them to preserve this metadata to enable faithful
> schema round trips.{noformat}
> This can be read as: for a field with nullable set to true, a data processor that encounters null array data in that field CAN either continue or refuse to process it.
> In the current Rust implementation, apart from reading Fields from a schema, we also construct `Field` in DataFusion and via `Field::new` in arrow::array::*StructArray*.
> * in DataFusion, nullable is determined by the DF schema
> * in arrow::array::StructArray::try_from(values: Vec<(&str, ArrayRef)>), nullable is determined by the actual data. This is error-prone if the ArrayRef's null buffer is all 1s (as built by a builder). The following test shows a bug:
> {noformat}
> #[test]
> fn test_struct_bug() {
>     let ints: ArrayRef = Arc::new(Int32Array::from(vec![
>         Some(1),
>         Some(2),
>         Some(3),
>     ]));
>     let array = StructArray::try_from(vec![("f1", ints.clone())])
>         .unwrap()
>         .data();
>     let arrays = vec![array.as_ref()];
>     let mut mutable = MutableArrayData::new(arrays, false, 0);
>     mutable.extend(0, 1, 3);
>     let data = mutable.freeze();
>     let array = StructArray::from(Arc::new(data));
>     let expected = StructArray::try_from(vec![
>         ("f1", ints.slice(1, 2)),
>     ])
>     .unwrap();
>     assert_eq!(array, expected);
> }{noformat}
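
The two ways of deriving the flag described above can be contrasted without the arrow crate. The following is a minimal standalone sketch, not the arrow implementation: `ToyArray`, `nullable_from_bitmap_presence`, and `nullable_from_null_count` are hypothetical names, and the assumption that the flag follows from the mere presence of a null buffer is taken from the description in this issue rather than checked against the source.

```rust
/// Toy stand-in for an Arrow array: values plus an optional validity bitmap.
/// `None` means no bitmap was allocated; `Some` holds one flag per value
/// (true = valid). This is NOT the arrow crate's representation.
#[derive(Debug, Clone, PartialEq)]
struct ToyArray {
    values: Vec<i32>,
    validity: Option<Vec<bool>>,
}

impl ToyArray {
    /// Number of slots marked as null.
    fn null_count(&self) -> usize {
        match &self.validity {
            Some(bits) => bits.iter().filter(|valid| !**valid).count(),
            None => 0,
        }
    }
}

/// Assumed behaviour under discussion: nullable iff a bitmap exists at all.
fn nullable_from_bitmap_presence(array: &ToyArray) -> bool {
    array.validity.is_some()
}

/// Alternative: nullable iff at least one slot is actually null.
fn nullable_from_null_count(array: &ToyArray) -> bool {
    array.null_count() > 0
}

fn main() {
    // Same logical content [1, 2, 3], with and without an all-valid bitmap.
    let without_bitmap = ToyArray { values: vec![1, 2, 3], validity: None };
    let with_bitmap = ToyArray {
        values: vec![1, 2, 3],
        validity: Some(vec![true; 3]),
    };

    // Presence-based derivation disagrees for logically equal data.
    assert!(!nullable_from_bitmap_presence(&without_bitmap));
    assert!(nullable_from_bitmap_presence(&with_bitmap));

    // Count-based derivation agrees.
    assert_eq!(
        nullable_from_null_count(&without_bitmap),
        nullable_from_null_count(&with_bitmap)
    );
}
```

Under these assumptions, two arrays with the same logical content yield different `nullable` flags, which would be enough to make a schema-sensitive equality check fail.
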
> Conclusions:
> * It's questionable to set a Field's nullable flag according to the data.
> * Perhaps builders should set the null buffer back to None when the buffer has all bits set.
> * StructArray::try_from(values: Vec<(&str, ArrayRef)>) incorrectly sets nullable when the null buffer is Some with all bits set.
>
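
The second conclusion, dropping a null buffer whose bits are all set, could look roughly like the sketch below. It is a standalone illustration using only the standard library: `all_valid` and `normalize_validity` are hypothetical helpers, not functions of the arrow crate, and the bitmap is assumed to be bit-packed with least-significant-bit numbering as in the Arrow columnar format.

```rust
/// Returns true if the first `len` bits of `bitmap` are all set.
/// Bits are read LSB-first within each byte. A bitmap too short to
/// cover `len` slots is treated as not all-valid.
fn all_valid(bitmap: &[u8], len: usize) -> bool {
    (0..len).all(|i| {
        bitmap
            .get(i / 8)
            .map_or(false, |byte| (byte & (1u8 << (i % 8))) != 0)
    })
}

/// Hypothetical normalization step: discard a validity bitmap that marks
/// no slot as null, so "no nulls" always ends up represented as `None`.
fn normalize_validity(bitmap: Option<Vec<u8>>, len: usize) -> Option<Vec<u8>> {
    match bitmap {
        Some(bits) if all_valid(&bits, len) => None,
        other => other,
    }
}

fn main() {
    // Three slots, all valid: bits 0..3 set.
    let all_set = Some(vec![0b0000_0111u8]);
    assert_eq!(normalize_validity(all_set, 3), None);

    // Slot 1 is null: the bitmap must be kept unchanged.
    let one_null = Some(vec![0b0000_0101u8]);
    assert_eq!(normalize_validity(one_null.clone(), 3), one_null);

    // No bitmap stays no bitmap.
    assert_eq!(normalize_validity(None, 3), None);
}
```

With such a step applied at build time, a presence-based nullable check and a count-based one would give the same answer, at the cost of one extra scan over the bitmap.
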
--
This message was sent by Atlassian Jira
(v8.3.4#803005)