Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/05 13:38:57 UTC

[GitHub] [arrow-rs] zeevm commented on pull request #1154: Add `async` arrow parquet reader

zeevm commented on pull request #1154:
URL: https://github.com/apache/arrow-rs/pull/1154#issuecomment-1030627167


   I see a few issues with this.
   
   First, the notion that the column chunk is the basic I/O unit for Parquet is somewhat outdated since the introduction of the page index.
   
   Second, a major premise of Parquet is "read only what you need", where what you need is usually dictated by a query engine. Continuously downloading data in the background that the client may not even want or need doesn't seem right, especially since the cost is complicating all existing clients with the added `Send` constraint.
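
To make the `Send` concern concrete, here is a minimal sketch. The trait `ChunkSource` and the types below are hypothetical and are not the arrow-rs API; they only illustrate how a `Send` bound on a reader abstraction propagates to every implementor.

```rust
use std::thread;

// Hypothetical reader abstraction (an assumption for illustration, not arrow-rs).
// The `Send` supertrait is what lets a background task take ownership of the source.
trait ChunkSource: Send {
    fn read_range(&self, offset: usize, len: usize) -> Vec<u8>;
}

// Owns plain bytes, so the compiler derives `Send` automatically.
struct InMemory {
    data: Vec<u8>,
}

impl ChunkSource for InMemory {
    fn read_range(&self, offset: usize, len: usize) -> Vec<u8> {
        self.data[offset..offset + len].to_vec()
    }
}

// A client that shares its buffer through `std::rc::Rc` is not `Send`,
// so the compiler rejects an impl like the one sketched here:
//
//     struct Shared { data: std::rc::Rc<Vec<u8>> }
//     impl ChunkSource for Shared { /* ... */ }
//
// Such a client would have to move to `Arc` (or restructure) to keep working.

// Reads a byte range on another thread; this only compiles because
// `Box<dyn ChunkSource>` is `Send` through the supertrait bound.
fn read_in_background(src: Box<dyn ChunkSource>, offset: usize, len: usize) -> Vec<u8> {
    thread::spawn(move || src.read_range(offset, len))
        .join()
        .expect("background read panicked")
}
```

Calling `read_in_background(Box::new(InMemory { data: vec![10, 20, 30, 40, 50] }), 1, 3)` returns the three bytes starting at offset 1. The point of the sketch is the commented-out `Shared` type: single-threaded clients pay for the bound even if they never use background fetching.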


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org