Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/21 02:52:54 UTC

[GitHub] [arrow-datafusion] yjshen commented on issue #2293: Single File Per ParquetExec, AvroExec, etc...

yjshen commented on issue #2293:
URL: https://github.com/apache/arrow-datafusion/issues/2293#issuecomment-1104652888

   Do you mean `PartitionedFile` for File?
   ```rust
   pub struct PartitionedFile {
       /// Path for the file (e.g. URL, filesystem path, etc)
       pub file_meta: FileMeta,
       /// Values of partition columns to be appended to each row
       pub partition_values: Vec<ScalarValue>,
       /// An optional file range for a more fine-grained parallel execution
       pub range: Option<FileRange>,
   }
   ```
   
   I'm okay with the change. However, since we translate Spark physical plans directly into DataFusion physical plans, would it be possible to do the `Original ParquetExec -> UnionExec(SchemaAdapterExec(New ParquetExec))` rewrite in `ParquetExec`'s `try_new` method, or somewhere related in the physical plan?
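
To make the proposed rewrite concrete, here is a minimal sketch of turning one multi-file scan into a union of adapted single-file scans. It uses toy stand-in types rather than DataFusion's real API: the `Plan` enum, the simplified `PartitionedFile` with only a `path` field, and the `split_per_file` function are all hypothetical names introduced for illustration, and `SchemaAdapterExec` is modeled only as a wrapper node.

```rust
// Toy stand-ins for illustration only; these are NOT DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
struct PartitionedFile {
    // Simplified: the real struct carries file metadata, partition values and a range.
    path: String,
}

#[derive(Debug, PartialEq)]
enum Plan {
    /// A scan over one or more files (the "original" multi-file form).
    ParquetExec { files: Vec<PartitionedFile> },
    /// Hypothetical node that maps one file's schema onto the table schema.
    SchemaAdapterExec { input: Box<Plan> },
    /// Concatenates the output of its children.
    UnionExec { inputs: Vec<Plan> },
}

/// Rewrite a multi-file scan into `UnionExec(SchemaAdapterExec(ParquetExec))`,
/// with one single-file `ParquetExec` per input file.
/// Single-file scans and all other nodes are returned unchanged.
fn split_per_file(plan: Plan) -> Plan {
    match plan {
        Plan::ParquetExec { files } if files.len() > 1 => Plan::UnionExec {
            inputs: files
                .into_iter()
                .map(|file| Plan::SchemaAdapterExec {
                    input: Box::new(Plan::ParquetExec { files: vec![file] }),
                })
                .collect(),
        },
        other => other,
    }
}
```

In this sketch the rewrite is a free function over the plan, so a caller that builds a multi-file scan (for example, a Spark plan translator) would not need to know about the single-file restriction. Whether the same step belongs inside a constructor such as `try_new` is the open question raised above.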

