Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/05 14:59:03 UTC
[GitHub] [arrow-datafusion] Cheappie opened a new issue, #2161: Query execution fails with index out of bounds err
Cheappie opened a new issue, #2161:
URL: https://github.com/apache/arrow-datafusion/issues/2161
**Describe the bug**
I get an index-out-of-bounds panic when Parquet pruning is enabled.
file: metadata.rs:212:10
struct: RowGroupMetaData,
accessed field: columns
error: thread 'tokio-runtime-worker' panicked at 'index out of bounds: the len is 1 but the index is 1'
**To Reproduce**
Create two Parquet files with different fields in their schemas. I put 4 numbers into each file.
```
file: sample1.parquet
message schema {
REQUIRED INT32 a;
}
file: sample2.parquet
message schema {
REQUIRED INT32 b;
}
```
code:
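```
// Imports assumed for the DataFusion version current at the time of this issue.
use std::sync::Arc;

use datafusion::datasource::file_format::parquet::{ParquetFormat, DEFAULT_PARQUET_EXTENSION};
use datafusion::datasource::listing::ListingOptions;
use datafusion::error::Result;
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> Result<()> {
    // create local execution context
    let mut ctx = ExecutionContext::new();

    // Configure listing options; pruning is what triggers the panic
    let file_format = ParquetFormat::default().with_enable_pruning(true);
    let listing_options = ListingOptions {
        file_extension: DEFAULT_PARQUET_EXTENSION.to_owned(),
        format: Arc::new(file_format),
        table_partition_cols: vec![],
        collect_stat: false,
        target_partitions: 1,
    };

    // Register the directory holding sample1.parquet and sample2.parquet
    ctx.register_listing_table(
        "FANCY_TABLE",
        "file:///absolute-path/table/",
        listing_options,
        None,
    )
    .await
    .unwrap();

    // The predicate references a column from each file
    let df = ctx
        .sql("SELECT * FROM FANCY_TABLE where a > 2 or b > 2")
        .await?;
    df.show().await?;

    Ok(())
}
```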
**Expected behavior**
Query executes without any issues.
When pruning is disabled, the query runs fine and returns a result.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] Cheappie commented on issue #2161: Query execution fails with index out of bounds err
Posted by GitBox <gi...@apache.org>.
Cheappie commented on issue #2161:
URL: https://github.com/apache/arrow-datafusion/issues/2161#issuecomment-1092521108
Thank you @thinkharderdev for resolving that issue
[GitHub] [arrow-datafusion] alamb closed issue #2161: Query execution fails with index out of bounds err
Posted by GitBox <gi...@apache.org>.
alamb closed issue #2161: Query execution fails with index out of bounds err
URL: https://github.com/apache/arrow-datafusion/issues/2161
[GitHub] [arrow-datafusion] thinkharderdev commented on issue #2161: Query execution fails with index out of bounds err
Posted by GitBox <gi...@apache.org>.
thinkharderdev commented on issue #2161:
URL: https://github.com/apache/arrow-datafusion/issues/2161#issuecomment-1089123509
This is related to the schema merging. I discovered this issue as well today and am working on a fix.
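For readers following along, here is a minimal sketch of how a merged schema could lead to the reported panic. It is an illustration under assumptions, not the actual DataFusion or parquet code: `RowGroup` and `column_index` are hypothetical stand-ins, and the real fix may look different. The idea is that the merged table schema has two columns (`a`, `b`) while each file's row group holds only one, so indexing the row group with a position taken from the merged schema goes out of bounds.

```rust
// Hypothetical stand-in for illustration; not a DataFusion or parquet type.
struct RowGroup {
    /// Column names physically present in this file's row group.
    columns: Vec<&'static str>,
}

/// Resolve a column by name in the file's own metadata instead of reusing
/// its position in the merged table schema.
fn column_index(row_group: &RowGroup, name: &str) -> Option<usize> {
    row_group.columns.iter().position(|c| *c == name)
}

fn main() {
    // Merged table schema across sample1.parquet and sample2.parquet.
    let merged_schema = ["a", "b"];
    // sample1.parquet only holds column `a`.
    let sample1 = RowGroup { columns: vec!["a"] };

    // The position of `b` in the merged schema is 1 ...
    let merged_idx = merged_schema.iter().position(|c| *c == "b").unwrap();
    // ... but `sample1.columns[merged_idx]` would panic: the len is 1, the index is 1.
    assert!(sample1.columns.get(merged_idx).is_none());

    // Looking the column up by name reports it as absent instead of panicking.
    assert_eq!(column_index(&sample1, "b"), None);
    assert_eq!(column_index(&sample1, "a"), Some(0));
}
```

In this sketch a missing column simply yields `None`, which a pruning step could treat as "no statistics available" and keep the row group rather than fail.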
[GitHub] [arrow-datafusion] alamb commented on issue #2161: Query execution fails with index out of bounds err
Posted by GitBox <gi...@apache.org>.
alamb commented on issue #2161:
URL: https://github.com/apache/arrow-datafusion/issues/2161#issuecomment-1089315625
Thanks @thinkharderdev