Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/18 21:28:09 UTC

[GitHub] [arrow-rs] jhorstmann opened a new issue, #1587: test_read_maps fails when force_validate is active

jhorstmann opened a new issue, #1587:
URL: https://github.com/apache/arrow-rs/issues/1587

   **Describe the bug**
   
   The validation fails when comparing the `nullable` flag of a MapArray data field.
   
   ```
   ---- arrow::arrow_reader::tests::test_read_maps stdout ----
   thread 'arrow::arrow_reader::tests::test_read_maps' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidArgumentError("Child type mismatch for Struct([Field { name: \"key\", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: \"value\", data_type: Map(Field { name: \"key_value\", data_type: Struct([Field { name: \"key\", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: \"value\", data_type: Boolean, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, false), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: None }]). Expected Map(Field { name: \"key_value\", data_type: Struct([Field { name: \"key\", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: \"value\", data_type: Boolean, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, false) but child data had Map(Field { name: \"key_value\", data_type: Struct([Field { name: \"key\", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: \"value\", data_type: Boolean, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }]), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: None }, false)")', arrow/src/array/data.rs:308:34
   stack backtrace:
      0: rust_begin_unwind
                at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/std/src/panicking.rs:584:5
      1: core::panicking::panic_fmt
                at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/panicking.rs:143:14
      2: core::result::unwrap_failed
                at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/result.rs:1749:5
      3: core::result::Result<T,E>::unwrap
                at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/result.rs:1065:23
      4: arrow::array::data::ArrayData::new_unchecked
                at /home/jhorstmann/Source/github/apache/arrow-rs/arrow/src/array/data.rs:308:9
      5: arrow::array::data::ArrayDataBuilder::build_unchecked
                at /home/jhorstmann/Source/github/apache/arrow-rs/arrow/src/array/data.rs:1446:9
      6: <parquet::arrow::array_reader::map_array::MapArrayReader as parquet::arrow::array_reader::ArrayReader>::next_batch
                at ./src/arrow/array_reader/map_array.rs:106:35
      7: <parquet::arrow::array_reader::StructArrayReader as parquet::arrow::array_reader::ArrayReader>::next_batch::{{closure}}
                at ./src/arrow/array_reader.rs:719:27
      8: core::iter::adapters::map::map_try_fold::{{closure}}
                at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/iter/adapters/map.rs:91:28
      9: core::iter::traits::iterator::Iterator::try_fold
                at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/iter/traits/iterator.rs:2109:21
     10: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold
                at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/iter/adapters/map.rs:117:9
     11: <parquet::arrow::array_reader::StructArrayReader as parquet::arrow::array_reader::ArrayReader>::next_batch
                at ./src/arrow/array_reader.rs:716:30
     12: <parquet::arrow::arrow_reader::ParquetRecordBatchReader as core::iter::traits::iterator::Iterator>::next
                at ./src/arrow/arrow_reader.rs:229:15
     13: parquet::arrow::arrow_reader::tests::test_read_maps
                at ./src/arrow/arrow_reader.rs:1066:22
     14: parquet::arrow::arrow_reader::tests::test_read_maps::{{closure}}
                at ./src/arrow/arrow_reader.rs:1056:5
     15: core::ops::function::FnOnce::call_once
                at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/ops/function.rs:227:5
     16: core::ops::function::FnOnce::call_once
                at /rustc/4ce3749235fc31d15ebd444b038a9877e8c700d7/library/core/src/ops/function.rs:227:5
   note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
   ```
   
   **To Reproduce**
   
   ```
   $ RUST_BACKTRACE=1 cargo test --features force_validate -- test_read_map
   ```
   
   **Additional context**
   I haven't looked deeper into this. My first idea was that the validation could use `DataType::equals_datatype` for the comparison, but that also compares the `nullable` flag.
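
To illustrate why a comparison that includes the `nullable` flag rejects the two map types from the panic message, here is a minimal sketch. The `DataType` and `Field` types below are simplified, hypothetical stand-ins, not the arrow-rs definitions, and `equals_ignoring_nullability` is an invented helper that only shows what a more lenient comparison might look like.

```rust
// Hypothetical, simplified stand-ins for arrow's `DataType` and `Field`.
// These are NOT the arrow-rs definitions.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Boolean,
    Struct(Vec<Field>),
    // Map(entries field, keys_sorted)
    Map(Box<Field>, bool),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

/// Hypothetical helper: structural comparison that ignores `nullable`
/// at every nesting level.
fn equals_ignoring_nullability(a: &DataType, b: &DataType) -> bool {
    match (a, b) {
        (DataType::Struct(fa), DataType::Struct(fb)) => {
            fa.len() == fb.len() && fa.iter().zip(fb).all(|(x, y)| fields_match(x, y))
        }
        (DataType::Map(fa, sa), DataType::Map(fb, sb)) => sa == sb && fields_match(fa, fb),
        _ => a == b,
    }
}

/// Compares name and (recursively) data type, but not the `nullable` flag.
fn fields_match(a: &Field, b: &Field) -> bool {
    a.name == b.name && equals_ignoring_nullability(&a.data_type, &b.data_type)
}

/// Builds the inner map type from the panic message; only the nullability
/// of the `key_value` field varies.
fn inner_map(key_value_nullable: bool) -> DataType {
    let entries = DataType::Struct(vec![
        Field { name: "key".to_string(), data_type: DataType::Int32, nullable: false },
        Field { name: "value".to_string(), data_type: DataType::Boolean, nullable: false },
    ]);
    let key_value = Field {
        name: "key_value".to_string(),
        data_type: entries,
        nullable: key_value_nullable,
    };
    DataType::Map(Box::new(key_value), false)
}

fn main() {
    let expected = inner_map(false); // what the parent struct field declares
    let actual = inner_map(true); // what the child array data carries

    // The derived equality includes `nullable`, so the two types differ.
    assert_ne!(expected, actual);
    // Ignoring nullability, they have the same shape.
    assert!(equals_ignoring_nullability(&expected, &actual));
}
```

Whether ignoring nullability during validation is the right fix is a separate question; the sketch only shows that the two types in the error differ in that single flag.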
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold closed issue #1587: test_read_maps fails when force_validate is active

Posted by GitBox <gi...@apache.org>.
tustvold closed issue #1587: test_read_maps fails when force_validate is active 
URL: https://github.com/apache/arrow-rs/issues/1587




[GitHub] [arrow-rs] viirya commented on issue #1587: test_read_maps fails when force_validate is active

Posted by GitBox <gi...@apache.org>.
viirya commented on issue #1587:
URL: https://github.com/apache/arrow-rs/issues/1587#issuecomment-1103291817

   According to the schema, the inner map's data type has a `key_value` field with `nullable: false`:
   
   Map(Field { name: "key_value", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: "value", data_type: Map(Field { name: "key_value", data_type: Struct([Field { name: "key", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, Field { name: "value", data_type: Boolean, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }]), **_nullable: false_**, dict_id: 0, dict_is_ordered: false, metadata: None }, false), **_nullable: true_**, dict_id: 0, dict_is_ordered: false, metadata: None }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }, false)
   
   But `visit_map` uses `Type.is_optional` as the nullable flag for the inner map, so it takes the optionality of the field that wraps the map (i.e., the `value` field of the top-level map), and that is true.
   
   That is why the map type of the child array data differs from the schema in the nullable flag.
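
The mismatch described above can be sketched with a small model. Everything here is hypothetical: `Repetition`, `MapGroup`, and both helper functions are simplified illustrations rather than the parquet crate's types or the actual `visit_map` code, and the second helper is just one conceivable alternative, not necessarily the fix that was applied.

```rust
// Hypothetical, simplified model of a Parquet map group.
// These are NOT the parquet crate's types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Repetition {
    Required,
    Optional,
    Repeated,
}

struct MapGroup {
    // Repetition of the field that wraps the map
    // (the `value` field of the top-level map in this issue).
    wrapper_repetition: Repetition,
    // Repetition of the `key_value` group inside the map.
    key_value_repetition: Repetition,
}

fn is_optional(r: Repetition) -> bool {
    r == Repetition::Optional
}

/// Mirrors the behaviour described in the comment: the nullability of
/// `key_value` is taken from the wrapping field.
fn key_value_nullable_from_wrapper(map: &MapGroup) -> bool {
    is_optional(map.wrapper_repetition)
}

/// Assumed alternative: derive it from the `key_value` group itself.
fn key_value_nullable_from_entries(map: &MapGroup) -> bool {
    is_optional(map.key_value_repetition)
}

fn main() {
    // The inner map from the issue: wrapped by an optional `value` field.
    let inner = MapGroup {
        wrapper_repetition: Repetition::Optional,
        key_value_repetition: Repetition::Repeated,
    };
    // Taken from the wrapper, `key_value` comes out nullable ...
    assert!(key_value_nullable_from_wrapper(&inner));
    // ... while the schema in the issue says `nullable: false`.
    assert!(!key_value_nullable_from_entries(&inner));

    // With a required wrapper, both derivations agree.
    let required = MapGroup {
        wrapper_repetition: Repetition::Required,
        key_value_repetition: Repetition::Repeated,
    };
    assert_eq!(
        key_value_nullable_from_wrapper(&required),
        key_value_nullable_from_entries(&required)
    );
}
```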

