Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/07 09:50:33 UTC

[GitHub] [arrow-datafusion] tustvold commented on pull request #1990: Make it possible to only scan part of a parquet file in a partition

tustvold commented on PR #1990:
URL: https://github.com/apache/arrow-datafusion/pull/1990#issuecomment-1091446532

   Makes sense to me. Regardless of what happens with scheduling, having a mechanism to cheaply subdivide the input streams directly, as opposed to streaming the output through a repartitioning operator, seems like a useful feature to have.
   
   My expectation is that scheduling will help with over-provisioned parallelism in the plan, but we will still need mechanisms to express that parallelism in the first place 👍
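
   To make "subdividing the input directly" concrete, here is a rough sketch of one common way to do it: split the file into contiguous byte ranges, and let each partition read only the row groups whose midpoint falls inside its range. This is an illustration under stated assumptions, not code from the PR; `ByteRange`, `split_file`, and `owns_row_group` are hypothetical names, and row groups are modelled as plain (offset, length) pairs rather than real parquet metadata.

```rust
/// Hypothetical half-open byte range `[start, end)` within a single file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ByteRange {
    pub start: u64,
    pub end: u64,
}

/// Split a file of `file_len` bytes into `n` contiguous ranges that
/// together cover `[0, file_len)` with no gaps and no overlap.
pub fn split_file(file_len: u64, n: u64) -> Vec<ByteRange> {
    assert!(n > 0, "need at least one partition");
    // Compute boundaries in u128 so `file_len * i` cannot overflow.
    let bound = |i: u64| (file_len as u128 * i as u128 / n as u128) as u64;
    (0..n)
        .map(|i| ByteRange {
            start: bound(i),
            end: bound(i + 1),
        })
        .collect()
}

/// Assumed ownership rule: a row group starting at `offset` and spanning
/// `len` bytes is read by the range that contains its midpoint. Because the
/// ranges are disjoint, at most one range can own a given row group.
pub fn owns_row_group(range: ByteRange, offset: u64, len: u64) -> bool {
    let mid = offset + len / 2;
    range.start <= mid && mid < range.end
}
```

   With a rule like this, no data flows through a repartitioning operator: each partition decides from the file metadata alone which row groups to read. A range that contains no midpoint simply reads nothing, so the achievable parallelism is still bounded by the number of row groups in the file.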


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org