Posted to github@arrow.apache.org by "westonpace (via GitHub)" <gi...@apache.org> on 2023/04/20 13:45:48 UTC

[GitHub] [arrow] westonpace commented on issue #35000: [C++] How to make Scanner read parquet files faster?

westonpace commented on issue #35000:
URL: https://github.com/apache/arrow/issues/35000#issuecomment-1516358571

   That is a rather small batch size. I don't know how much profiling or attention we've given to sizes that small. You might try something larger, like 32k rows. I would expect the scanner to be slightly slower than using ParquetFileReader directly, but not 2x slower. If you don't use any multithreading, not even I/O readahead, then what advantages are you hoping to gain by using the scanner instead of ParquetFileReader?
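
One plausible reason a small batch size could hurt (an assumption, not something measured in this thread) is that any fixed per-batch cost is paid once per batch, so the number of batches matters. The sketch below only counts batches; the row count and the "small" batch size are hypothetical figures chosen for illustration, and `NumBatches` is a made-up helper, not an Arrow API. In Arrow C++ the batch size itself is set with `ScannerBuilder::BatchSize`.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Arrow): number of batches needed to cover
// `total_rows` rows when each batch holds at most `batch_size` rows.
// This is a ceiling division; both arguments are assumed to be positive.
constexpr std::int64_t NumBatches(std::int64_t total_rows,
                                  std::int64_t batch_size) {
  return (total_rows + batch_size - 1) / batch_size;
}

// Illustrative figures only; neither the row count nor the small batch size
// comes from the original issue.
constexpr std::int64_t kTotalRows = 10000000;
constexpr std::int64_t kSmallBatch = 1024;       // assumed "small" batch size
constexpr std::int64_t kLargeBatch = 32 * 1024;  // the "32k" suggested above
```

With these assumed figures, `NumBatches(kTotalRows, kSmallBatch)` is 9766 and `NumBatches(kTotalRows, kLargeBatch)` is 306, so any per-batch overhead would be paid roughly 32 times less often at the larger size. Whether that accounts for the slowdown reported here would need profiling.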

