Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/20 16:25:33 UTC

[GitHub] [arrow-datafusion] tustvold opened a new issue, #2291: Project Hive Partition Columns With ProjectionExec

tustvold opened a new issue, #2291:
URL: https://github.com/apache/arrow-datafusion/issues/2291

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   Part of #2079
   
   Currently, the values of Hive partition columns are projected within each of the file-format-specific physical plans. As described in #2079, this has a number of drawbacks.
   
   **Describe the solution you'd like**
   
   Rather than handling partition projection within the file scan operator, I would like to propose modifying `ListingTable` to add a `ProjectionExec` within `TableProvider::scan`, instead of relying on `FileFormat::create_physical_plan` to do this. This `ProjectionExec` would be created with a set of literal expressions corresponding to the partition values.
   
   Therefore instead of `TableProvider::scan` generating something like
   
   ```
   AvroExec:
   ```
   
   It would generate
   
   ```
   ProjectionExec: ...
       AvroExec: ...
   ```
   
   Note: this will depend on #2289 
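   
   To make the shape of the proposal concrete, here is a minimal, self-contained sketch of the idea. It is a toy model and does not use the DataFusion API: `Value`, `Exec`, `FileScan`, `LiteralProjection`, `partition_values` and `scan` are all hypothetical names, and it assumes a single file whose directory components look like `key=value`. The real `ProjectionExec` works on record batches and physical expressions rather than rows.
   
   ```rust
   // Toy model of the proposal. None of these types are DataFusion's; they are
   // hypothetical stand-ins used only to illustrate the plan shape.
   
   /// A scalar value; stands in for a literal expression.
   #[derive(Debug, Clone, PartialEq)]
   enum Value {
       Int(i64),
       Utf8(String),
   }
   
   type Row = Vec<Value>;
   
   /// Minimal stand-in for a physical operator.
   trait Exec {
       fn execute(&self) -> Vec<Row>;
   }
   
   /// Stand-in for a format-specific scan (e.g. `AvroExec`).
   /// It only knows about the columns stored in the file.
   struct FileScan {
       rows: Vec<Row>,
   }
   
   impl Exec for FileScan {
       fn execute(&self) -> Vec<Row> {
           self.rows.clone()
       }
   }
   
   /// Stand-in for `ProjectionExec`: passes the input columns through and
   /// appends one literal per partition column to every row.
   struct LiteralProjection<E: Exec> {
       input: E,
       literals: Vec<Value>,
   }
   
   impl<E: Exec> Exec for LiteralProjection<E> {
       fn execute(&self) -> Vec<Row> {
           self.input
               .execute()
               .into_iter()
               .map(|mut row| {
                   row.extend(self.literals.iter().cloned());
                   row
               })
               .collect()
       }
   }
   
   /// Extract `key=value` pairs from the directory part of a path.
   /// The final segment is treated as the file name and ignored.
   fn partition_values(path: &str) -> Vec<(String, String)> {
       let dir = path.rsplit_once('/').map(|(d, _)| d).unwrap_or("");
       dir.split('/')
           .filter_map(|segment| segment.split_once('='))
           .map(|(k, v)| (k.to_string(), v.to_string()))
           .collect()
   }
   
   /// What a catalog-level scan might do under this proposal: build the plain
   /// file scan, then wrap it in a projection carrying the partition literals.
   fn scan(path: &str, file_rows: Vec<Row>) -> impl Exec {
       let literals = partition_values(path)
           .into_iter()
           .map(|(_, v)| Value::Utf8(v))
           .collect();
       LiteralProjection {
           input: FileScan { rows: file_rows },
           literals,
       }
   }
   ```
   
   In this sketch the file scan never sees the partition columns; they are attached one level up, which mirrors the intent of keeping the catalog detail out of the format-specific operators.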
   
   **Describe alternatives you've considered**
   
   The logic could instead be moved to `FileFormat::create_physical_plan` implementations, but I think it would be better to keep what is a catalog detail close to the catalog implementation.
   
   FYI @matthewmturner @yjshen @rdettai  
   

