Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/04/26 13:25:44 UTC

[GitHub] [arrow-datafusion] alamb opened a new issue #137: Allow ParquetExec to parallelize work based on row groups

alamb opened a new issue #137:
URL: https://github.com/apache/arrow-datafusion/issues/137


   *Note*: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-11056
   
   ParquetExec currently parallelizes work by passing individual files to threads. It would be nice to be able to do this in a finer-grained way by assigning row groups and/or column chunks instead. This will be especially important in distributed systems built on DataFusion.
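
To illustrate the difference between the two granularities, here is a minimal, library-free sketch. It is not DataFusion's API: the names `plan_by_file`, `plan_by_row_group`, and `ScanUnit` are hypothetical, and the row-group counts are assumed to be already known (in practice they would come from each file's Parquet footer metadata). It only shows how work units could be spread across partitions.

```python
from typing import Dict, List, Tuple

# Hypothetical unit of work: (file path, row group index).
ScanUnit = Tuple[str, int]


def plan_by_file(
    row_groups_per_file: Dict[str, int], num_partitions: int
) -> List[List[ScanUnit]]:
    """File-level assignment: all row groups of a file stay in one partition."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    partitions: List[List[ScanUnit]] = [[] for _ in range(num_partitions)]
    for i, (path, count) in enumerate(row_groups_per_file.items()):
        partitions[i % num_partitions].extend((path, rg) for rg in range(count))
    return partitions


def plan_by_row_group(
    row_groups_per_file: Dict[str, int], num_partitions: int
) -> List[List[ScanUnit]]:
    """Row-group-level assignment: each row group is dealt out round-robin."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # Flatten every file into its individual row groups first.
    units: List[ScanUnit] = [
        (path, rg)
        for path, count in row_groups_per_file.items()
        for rg in range(count)
    ]
    partitions: List[List[ScanUnit]] = [[] for _ in range(num_partitions)]
    for i, unit in enumerate(units):
        partitions[i % num_partitions].append(unit)
    return partitions
```

With one file of six row groups, two files of one row group each, and four partitions, the file-level plan leaves one partition idle and another holding six row groups, while the row-group-level plan gives every partition two. Row groups can differ in size, so equal counts do not guarantee equal bytes; a real planner would likely weigh units by row count or byte size.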


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] Ted-Jiang removed a comment on issue #137: Allow ParquetExec to parallelize work based on row groups

Posted by GitBox <gi...@apache.org>.
Ted-Jiang removed a comment on issue #137:
URL: https://github.com/apache/arrow-datafusion/issues/137#issuecomment-1002455543


   @houqp plz assign this to me 





[GitHub] [arrow-datafusion] Ted-Jiang commented on issue #137: Allow ParquetExec to parallelize work based on row groups

Posted by GitBox <gi...@apache.org>.
Ted-Jiang commented on issue #137:
URL: https://github.com/apache/arrow-datafusion/issues/137#issuecomment-1002455543


   @houqp plz assign this to me 

