Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/27 13:39:47 UTC

[GitHub] [arrow-datafusion] dev870 commented on issue #1484: Is it possible to query multiple parquet files ?

dev870 commented on issue #1484:
URL: https://github.com/apache/arrow-datafusion/issues/1484#issuecomment-1001575085


   Thanks a lot, @Igosuki. Based on your example, I tried the code below, but I am getting the row count from only the first Parquet file inside `/path/to/datafusion-parquet/data/`. I was expecting to get results from all the files in that path.
   
   I am struggling to understand what I am missing here.
   
   ```rust
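   // Imports assumed for the DataFusion release current at the time of posting;
   // module paths may differ in other versions.
   use std::sync::Arc;
   
   use datafusion::datasource::file_format::parquet::ParquetFormat;
   use datafusion::datasource::listing::ListingOptions;
   use datafusion::error::Result;
   use datafusion::prelude::*;
   
   /// This example demonstrates executing a simple query against an Arrow data source (Parquet) and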
   /// fetching results
   #[tokio::main]
   async fn main() -> Result<()> {
       // create local execution context
       let mut ctx = ExecutionContext::new();
       let file_format = ParquetFormat::default().with_enable_pruning(true);
   
       let listing_options = ListingOptions {
           file_extension: ".parquet".to_owned(),
           format: Arc::new(file_format),
           table_partition_cols: vec![],
           collect_stat: true,
           target_partitions: 1,
       };
   
       ctx.register_listing_table(
           "my_table",
           &format!("file://{}", "/path/to/datafusion-parquet/data/"),
           listing_options,
           None,
       ).await.unwrap();
       
       // execute the query
       let df = ctx.sql("SELECT * FROM my_table").await?;
   
       // print the results
       let batches = df.collect().await?;
       print!("{}", batches[0].num_rows());
       Ok(())
   }
   ```
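   
   One thing that may be worth checking here (an assumption on my part, not something confirmed in this thread): `collect()` returns a vector of record batches, and `batches[0].num_rows()` reports only the first one. If rows from the other files land in later batches, they would not show up in the printed count. A minimal sketch of summing across all batches, using a hypothetical stand-in `Batch` type and made-up row counts so that it runs without DataFusion:
   
   ```rust
   // Hypothetical stand-in for a record batch. The real type returned by
   // `collect()` is Arrow's `RecordBatch`, which also exposes `num_rows()`.
   struct Batch {
       rows: usize,
   }
   
   impl Batch {
       fn num_rows(&self) -> usize {
           self.rows
       }
   }
   
   /// Sum the row counts of every batch instead of reading only the first one.
   fn total_rows(batches: &[Batch]) -> usize {
       batches.iter().map(|b| b.num_rows()).sum()
   }
   
   fn main() {
       // Pretend the query produced three batches (sizes are made up).
       let batches = vec![
           Batch { rows: 100 },
           Batch { rows: 250 },
           Batch { rows: 50 },
       ];
   
       // What the snippet above prints: the first batch only.
       println!("first batch only: {}", batches[0].num_rows());
   
       // Row count across everything that was collected.
       println!("all batches: {}", total_rows(&batches));
   }
   ```
   
   In the DataFusion snippet, the equivalent change would be to replace `batches[0].num_rows()` with `batches.iter().map(|b| b.num_rows()).sum::<usize>()`.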

