Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/05 14:59:03 UTC

[GitHub] [arrow-datafusion] Cheappie opened a new issue, #2161: Query execution fails with index out of bounds err

Cheappie opened a new issue, #2161:
URL: https://github.com/apache/arrow-datafusion/issues/2161

   **Describe the bug**
   I get an index-out-of-bounds panic when Parquet pruning is enabled.
   
   file: metadata.rs:212:10
   struct: RowGroupMetaData
   accessed field: columns
   error: thread 'tokio-runtime-worker' panicked at 'index out of bounds: the len is 1 but the index is 1'
   
   **To Reproduce**
   Create two Parquet files whose schemas contain different fields. I put 4 numbers into each file.
   
   ```
   file: sample1.parquet
   message schema {
       REQUIRED INT32 a;
   }
   
   file: sample2.parquet
   message schema {
       REQUIRED INT32 b;
   }
   ```
   
   code:
   ```
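   // Import paths as in the DataFusion releases that still expose `ExecutionContext`.
   use std::sync::Arc;

   use datafusion::datasource::file_format::parquet::{ParquetFormat, DEFAULT_PARQUET_EXTENSION};
   use datafusion::datasource::listing::ListingOptions;
   use datafusion::error::Result;
   use datafusion::prelude::*;

   #[tokio::main]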
   async fn main() -> Result<()> {
       // create local execution context
       let mut ctx = ExecutionContext::new();
   
       // Configure listing options
       let file_format = ParquetFormat::default().with_enable_pruning(true);
       let listing_options = ListingOptions {
           file_extension: DEFAULT_PARQUET_EXTENSION.to_owned(),
           format: Arc::new(file_format),
           table_partition_cols: vec![],
           collect_stat: false,
           target_partitions: 1,
       };
   
       ctx.register_listing_table(
           "FANCY_TABLE",
           "file:///absolute-path/table/",
           listing_options,
           None,
       ).await.unwrap();
   
       let df = ctx
           .sql("SELECT * FROM FANCY_TABLE where a > 2 or b > 2")
           .await?;
   
       df.show().await?;
   
       Ok(())
   }
   ```
   
   **Expected behavior**
   Query executes without any issues.
   
   When pruning is disabled, the query runs fine and I receive a result.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Cheappie commented on issue #2161: Query execution fails with index out of bounds err

Posted by GitBox <gi...@apache.org>.
Cheappie commented on issue #2161:
URL: https://github.com/apache/arrow-datafusion/issues/2161#issuecomment-1092521108

   Thank you @thinkharderdev for resolving that issue




[GitHub] [arrow-datafusion] alamb closed issue #2161: Query execution fails with index out of bounds err

Posted by GitBox <gi...@apache.org>.
alamb closed issue #2161: Query execution fails with index out of bounds err
URL: https://github.com/apache/arrow-datafusion/issues/2161




[GitHub] [arrow-datafusion] thinkharderdev commented on issue #2161: Query execution fails with index out of bounds err

Posted by GitBox <gi...@apache.org>.
thinkharderdev commented on issue #2161:
URL: https://github.com/apache/arrow-datafusion/issues/2161#issuecomment-1089123509

   This is related to the schema merging. I discovered this issue as well today and am working on a fix.
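
   To illustrate how schema merging can lead to this kind of panic, here is a minimal sketch. It assumes that pruning resolves a predicate column by its position in the merged table schema and then uses that position to index a single file's column metadata. `FileColumns`, `by_merged_index`, and `by_name` are hypothetical stand-ins for illustration only, not DataFusion or parquet APIs, and the actual fix may look different.

   ```rust
   /// Hypothetical stand-in for one Parquet file's row-group metadata:
   /// only the names of the column chunks physically stored in that file.
   struct FileColumns {
       names: Vec<&'static str>,
   }

   /// Resolve a column by its position in the merged table schema.
   /// `get` returns None where direct indexing (`names[idx]`) would panic.
   fn by_merged_index(file: &FileColumns, merged_idx: usize) -> Option<&'static str> {
       file.names.get(merged_idx).copied()
   }

   /// Resolve a column by name instead: a column missing from this file
   /// yields None, and a column stored at a different position is still found.
   fn by_name(file: &FileColumns, merged_schema: &[&str], merged_idx: usize) -> Option<usize> {
       let wanted = merged_schema.get(merged_idx)?;
       file.names.iter().position(|n| n == wanted)
   }
   ```

   With a merged schema of `[a, b]`, a file that stores only `a` has a single column, so indexing it directly at position 1 (the merged position of `b`) fails with "the len is 1 but the index is 1". Looking the column up by name reports it as absent instead, and also avoids silently reading the wrong column in a file that stores only `b` at position 0.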




[GitHub] [arrow-datafusion] alamb commented on issue #2161: Query execution fails with index out of bounds err

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #2161:
URL: https://github.com/apache/arrow-datafusion/issues/2161#issuecomment-1089315625

   Thanks @thinkharderdev 

