Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/18 21:28:09 UTC
[GitHub] [arrow-rs] jhorstmann opened a new issue, #1587: test_read_maps fails when force_validate is active
jhorstmann opened a new issue, #1587:
URL: https://github.com/apache/arrow-rs/issues/1587
**Describe the bug**
The validation fails when comparing the `nullable` flag of a MapArray data field.
```
---- arrow::arrow_reader::tests::test_read_maps stdout ----
thread 'arrow::arrow_reader::tests::test_read_maps' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidArgumentError("Child type mismatch for Struct([Field { name: \"key\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: \"value\", data_type: Map(Field { name: \"key_value\", data_type: Struct([Field { name: \"key\", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: \"value\", data_type: Boolean, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, false), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: None }]). Expected Map(Field { name: \"key_value\", data_type: Struct([Field { name: \"key\", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: \"value\", data_type: Boolean, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, false) but child data had Map(Field { name: \"key_value\", data_type: Struct([Field { name: \"key\", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: \"value\", data_type: Boolean, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: None }, false)")', arrow/src/array/data.rs:308:34
stack backtrace:
0: rust_begin_unwind
at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/std/src/panicking.rs:584:5
1: core::panicking::panic_fmt
at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/panicking.rs:143:14
2: core::result::unwrap_failed
at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/result.rs:1749:5
3: core::result::Result<T,E>::unwrap
at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/result.rs:1065:23
4: arrow::array::data::ArrayData::new_unchecked
at /home/jhorstmann/Source/github/apache/arrow-rs/arrow/src/array/data.rs:308:9
5: arrow::array::data::ArrayDataBuilder::build_unchecked
at /home/jhorstmann/Source/github/apache/arrow-rs/arrow/src/array/data.rs:1446:9
6: <parquet::arrow::array_reader::map_array::MapArrayReader as parquet::arrow::array_reader::ArrayReader>::next_batch
at ./src/arrow/array_reader/map_array.rs:106:35
7: <parquet::arrow::array_reader::StructArrayReader as parquet::arrow::array_reader::ArrayReader>::next_batch::{{closure}}
at ./src/arrow/array_reader.rs:719:27
8: core::iter::adapters::map::map_try_fold::{{closure}}
at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/iter/adapters/map.rs:91:28
9: core::iter::traits::iterator::Iterator::try_fold
at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/iter/traits/iterator.rs:2109:21
10: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold
at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/iter/adapters/map.rs:117:9
11: <parquet::arrow::array_reader::StructArrayReader as parquet::arrow::array_reader::ArrayReader>::next_batch
at ./src/arrow/array_reader.rs:716:30
12: <parquet::arrow::arrow_reader::ParquetRecordBatchReader as core::iter::traits::iterator::Iterator>::next
at ./src/arrow/arrow_reader.rs:229:15
13: parquet::arrow::arrow_reader::tests::test_read_maps
at ./src/arrow/arrow_reader.rs:1066:22
14: parquet::arrow::arrow_reader::tests::test_read_maps::{{closure}}
at ./src/arrow/arrow_reader.rs:1056:5
15: core::ops::function::FnOnce::call_once
at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/ops/function.rs:227:5
16: core::ops::function::FnOnce::call_once
at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
```
**To Reproduce**
```
$ RUST_BACKTRACE=1 cargo test --features force_validate -- test_read_map
```
**Additional context**
I haven't looked deeper into this. My first idea was that the validation could use `DataType::equals_datatype` for the comparison, but that also compares the `nullable` flag.
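To make the mismatch concrete, here is a minimal sketch using simplified stand-in types. The `DataType` and `Field` definitions below, and the helpers `same_shape` and `inner_map`, are hypothetical and written only for illustration; they are not the arrow crate's types or API. The sketch shows that the two inner map types from the panic message differ only in the `nullable` flag of `key_value`, so a strict comparison rejects them while a comparison that ignores `nullable` accepts them.

```rust
// Simplified stand-ins for illustration only; NOT the arrow crate's types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Boolean,
    Struct(Vec<Field>),
    // (entries field, keys_sorted)
    Map(Box<Field>, bool),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

fn field(name: &str, data_type: DataType, nullable: bool) -> Field {
    Field { name: name.to_string(), data_type, nullable }
}

/// Hypothetical comparison that ignores `nullable` at every nesting level.
fn same_shape(a: &DataType, b: &DataType) -> bool {
    match (a, b) {
        (DataType::Struct(xs), DataType::Struct(ys)) => {
            xs.len() == ys.len()
                && xs
                    .iter()
                    .zip(ys)
                    .all(|(x, y)| x.name == y.name && same_shape(&x.data_type, &y.data_type))
        }
        (DataType::Map(x, x_sorted), DataType::Map(y, y_sorted)) => {
            x_sorted == y_sorted && x.name == y.name && same_shape(&x.data_type, &y.data_type)
        }
        // Leaf types (and mismatched variants) fall back to strict equality.
        _ => a == b,
    }
}

/// Builds the inner map type from the panic message, with a configurable
/// `nullable` flag on the `key_value` entries field.
fn inner_map(key_value_nullable: bool) -> DataType {
    let entries = DataType::Struct(vec![
        field("key", DataType::Int32, false),
        field("value", DataType::Boolean, false),
    ]);
    DataType::Map(Box::new(field("key_value", entries, key_value_nullable)), false)
}

fn main() {
    let expected = inner_map(false); // what the parent struct's schema declares
    let actual = inner_map(true); // what the child data carried
    println!("strict equality: {}", expected == actual);
    println!("equality ignoring nullable: {}", same_shape(&expected, &actual));
}
```

Whether the validation should be relaxed this way is a separate question; the sketch only isolates which part of the type differs.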
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-rs] tustvold closed issue #1587: test_read_maps fails when force_validate is active
Posted by GitBox <gi...@apache.org>.
tustvold closed issue #1587: test_read_maps fails when force_validate is active
URL: https://github.com/apache/arrow-rs/issues/1587
[GitHub] [arrow-rs] viirya commented on issue #1587: test_read_maps fails when force_validate is active
Posted by GitBox <gi...@apache.org>.
viirya commented on issue #1587:
URL: https://github.com/apache/arrow-rs/issues/1587#issuecomment-1103291817
According to the schema, the inner map's data type has a `key_value` field with `nullable: false`:
Map(Field { name: "key_value", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: "value", data_type: Map(Field { name: "key_value", data_type: Struct([Field { name: "key", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: "value", data_type: Boolean, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }]), **_nullable: false_**, dict_id: 0, dict_is_ordered: false, metadata: None }, false), **_nullable: true_**, dict_id: 0, dict_is_ordered: false, metadata: None }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, false)
However, `visit_map` uses `Type.is_optional` as the nullable flag for the inner map. That reads the optional flag of the field wrapping the map (i.e., the `value` field of the top-level map), which is true.
That is why the child array data's map type differs from the schema in its nullable flag.
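The mechanism described above can be sketched in a few lines. Everything here is a hypothetical simplification: `Node` stands in for a Parquet schema node, `Field` for an Arrow field, and the two helper functions are invented for illustration. None of it is the parquet crate's actual `visit_map` code.

```rust
// Hypothetical, simplified model of a Parquet schema node;
// not the parquet crate's `Type`.
struct Node {
    name: &'static str,
    optional: bool,
}

// Simplified stand-in for an Arrow field; only the parts relevant here.
#[derive(Debug, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

/// Derives the inner map's `key_value` field as the comment describes:
/// the nullable flag is copied from the node that wraps the map.
fn key_value_from_wrapper(wrapper: &Node) -> Field {
    Field { name: "key_value".to_string(), nullable: wrapper.optional }
}

/// The same field as declared in the schema quoted above (`nullable: false`).
fn key_value_from_schema() -> Field {
    Field { name: "key_value".to_string(), nullable: false }
}

fn main() {
    // The outer map's `value` field is optional and holds the inner map.
    let value = Node { name: "value", optional: true };

    let from_reader = key_value_from_wrapper(&value);
    let from_schema = key_value_from_schema();

    println!("wrapping field `{}` is optional: {}", value.name, value.optional);
    println!("reader:  {:?}", from_reader);
    println!("schema:  {:?}", from_schema);
    println!("types match: {}", from_reader == from_schema);
}
```

In this sketch the flag on `key_value` follows the wrapper's optionality rather than the declared schema, which reproduces the disagreement that the validation reports.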