Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/30 16:02:04 UTC
[GitHub] [arrow-datafusion] tustvold opened a new issue, #2653: `ScalarValue::to_array_of_size` panics computing statistics for nested parquet file
tustvold opened a new issue, #2653:
URL: https://github.com/apache/arrow-datafusion/issues/2653
**Describe the bug**
```
let ctx = SessionContext::new();
let mut options = ParquetReadOptions::default()
    .parquet_pruning(true)
    .to_listing_options(2);
// Enable stats collection (this is what triggers the panic)
options.collect_stat = true;
ctx.register_listing_table("patient", "/home/raphael/Downloads/part-00000-f6337bce-7fcd-4021-9f9d-040413ea83f8-c000.snappy.parquet", options, None).await.unwrap();
let df = ctx.sql("SELECT patient.meta FROM patient LIMIT 10").await.unwrap();
df.show().await.unwrap();
```
Where part-00000-f6337bce-7fcd-4021-9f9d-040413ea83f8-c000.snappy.parquet is the [parquet file](https://github.com/apache/arrow-datafusion/files/8626500/part-00000-f6337bce-7fcd-4021-9f9d-040413ea83f8-c000.snappy.parquet.zip) provided by @kesavkolla in https://github.com/apache/arrow-datafusion/issues/2439
Panics with
```
called `Result::unwrap()` on an `Err` value: ArrowError(ComputeError("concat requires input of at least one array"))
thread 'physical_plan::file_format::parquet::tests::temp' panicked at 'called `Result::unwrap()` on an `Err` value: ArrowError(ComputeError("concat requires input of at least one array"))', datafusion/common/src/scalar.rs:1206:18
stack backtrace:
0: rust_begin_unwind
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:584:5
1: core::panicking::panic_fmt
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/panicking.rs:143:14
2: core::result::unwrap_failed
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/result.rs:1785:5
3: core::result::Result<T,E>::unwrap
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/result.rs:1078:23
4: datafusion_common::scalar::ScalarValue::to_array_of_size
at /home/raphael/repos/external/arrow-datafusion/datafusion/common/src/scalar.rs:1198:22
5: datafusion_common::scalar::ScalarValue::to_array_of_size::{{closure}}
at /home/raphael/repos/external/arrow-datafusion/datafusion/common/src/scalar.rs:1253:45
6: core::iter::adapters::map::map_fold::{{closure}}
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/iter/adapters/map.rs:84:28
7: core::iter::traits::iterator::Iterator::fold
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/iter/traits/iterator.rs:2362:21
8: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::fold
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/iter/adapters/map.rs:124:9
9: core::iter::traits::iterator::Iterator::for_each
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/iter/traits/iterator.rs:779:9
10: <alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/vec/spec_extend.rs:40:17
11: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/vec/spec_from_iter_nested.rs:62:9
12: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/vec/spec_from_iter.rs:33:9
13: <alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/vec/mod.rs:2554:9
14: core::iter::traits::iterator::Iterator::collect
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/iter/traits/iterator.rs:1784:9
15: datafusion_common::scalar::ScalarValue::to_array_of_size
at /home/raphael/repos/external/arrow-datafusion/datafusion/common/src/scalar.rs:1248:48
16: datafusion_common::scalar::ScalarValue::to_array
at /home/raphael/repos/external/arrow-datafusion/datafusion/common/src/scalar.rs:658:9
17: datafusion::datasource::get_statistics_with_limit::{{closure}}
at ./src/datasource/mod.rs:75:56
18: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
19: datafusion::datasource::listing::table::ListingTable::list_files_for_scan::{{closure}}
at ./src/datasource/listing/table.rs:394:67
20: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
21: <datafusion::datasource::listing::table::ListingTable as datafusion::datasource::datasource::TableProvider>::scan::{{closure}}
at ./src/datasource/listing/table.rs:310:53
22: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
23: <core::pin::Pin<P> as core::future::future::Future>::poll
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/future.rs:124:9
24: datafusion::physical_plan::planner::DefaultPhysicalPlanner::create_initial_plan::{{closure}}
at ./src/physical_plan/planner.rs:392:64
25: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
26: <core::pin::Pin<P> as core::future::future::Future>::poll
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/future.rs:124:9
27: datafusion::physical_plan::planner::DefaultPhysicalPlanner::create_initial_plan::{{closure}}
at ./src/physical_plan/planner.rs:623:84
28: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
29: <core::pin::Pin<P> as core::future::future::Future>::poll
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/future.rs:124:9
30: datafusion::physical_plan::planner::DefaultPhysicalPlanner::create_initial_plan::{{closure}}
```
Setting `options.collect_stat = false` eliminates the panic.
**Expected behavior**
The above should not panic
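
To illustrate the failure mode, here is a minimal std-only sketch. It is not DataFusion code: `concat`, `repeat_unchecked` and `repeat_checked` are hypothetical stand-ins, and it assumes (as one plausible reading of the backtrace) that the panic comes from calling `unwrap()` on a concat-style kernel that rejects empty input.

```rust
// Hypothetical stand-in for an Arrow-style concat kernel: it rejects empty input,
// returning the same message that appears in the panic above.
fn concat(arrays: &[Vec<i32>]) -> Result<Vec<i32>, String> {
    if arrays.is_empty() {
        return Err("concat requires input of at least one array".to_string());
    }
    Ok(arrays.iter().flatten().copied().collect())
}

// Panicking variant: repeats the chunks `size` times and unwraps the result,
// mirroring the `Result::unwrap` frame in the backtrace.
fn repeat_unchecked(chunks: &[Vec<i32>], size: usize) -> Vec<i32> {
    let repeated: Vec<Vec<i32>> = (0..size).flat_map(|_| chunks.iter().cloned()).collect();
    concat(&repeated).unwrap()
}

// Non-panicking variant: treats "nothing to concatenate" as an empty array
// and hands any other error back to the caller instead of unwrapping.
fn repeat_checked(chunks: &[Vec<i32>], size: usize) -> Result<Vec<i32>, String> {
    let repeated: Vec<Vec<i32>> = (0..size).flat_map(|_| chunks.iter().cloned()).collect();
    if repeated.is_empty() {
        return Ok(Vec::new());
    }
    concat(&repeated)
}
```

In this model, `repeat_unchecked(&[], 3)` panics while `repeat_checked(&[], 3)` returns an empty vector. Whether an empty array or a propagated error is the right behaviour for the real `to_array_of_size` is a design question this sketch does not settle.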
**Additional context**
Follow-on to https://github.com/apache/arrow-datafusion/issues/2453, which was fixed by https://github.com/apache/arrow-datafusion/pull/2631
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] tustvold commented on issue #2653: `ScalarValue::to_array_of_size` panics computing statistics for nested parquet file
Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #2653:
URL: https://github.com/apache/arrow-datafusion/issues/2653#issuecomment-1145789429
Huzzah, can confirm :tada:
[GitHub] [arrow-datafusion] AssHero commented on issue #2653: `ScalarValue::to_array_of_size` panics computing statistics for nested parquet file
Posted by GitBox <gi...@apache.org>.
AssHero commented on issue #2653:
URL: https://github.com/apache/arrow-datafusion/issues/2653#issuecomment-1145623700
I think pull request #2671 already fixes this.
[GitHub] [arrow-datafusion] tustvold closed issue #2653: `ScalarValue::to_array_of_size` panics computing statistics for nested parquet file
Posted by GitBox <gi...@apache.org>.
tustvold closed issue #2653: `ScalarValue::to_array_of_size` panics computing statistics for nested parquet file
URL: https://github.com/apache/arrow-datafusion/issues/2653