Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/07/03 22:16:03 UTC

[GitHub] [arrow] wjones1 commented on pull request #6979: ARROW-7800 [Python] implement iter_batches() method for ParquetFile and ParquetReader

wjones1 commented on pull request #6979:
URL: https://github.com/apache/arrow/pull/6979#issuecomment-653687403


   Looking at the code, I no longer think this `batch_size` parameter would actually affect those other read methods.
   
   There are a few different "batch_size" parameters floating around `reader.cc`, but there's only one reference to the one stored in `reader_properties_` (`ArrowReaderProperties`):
   https://github.com/apache/arrow/blob/edf24290046d95967d620104c5238f30ff032b6d/cpp/src/parquet/arrow/reader.cc#L797-L799
   
   As far as I can tell, that's exclusively on the code path for the RecordBatchReader, and not the other readers. So I don't think we need to add the parameter to those other methods.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org