Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/08/27 12:03:01 UTC

[GitHub] [arrow] jorisvandenbossche commented on pull request #6979: ARROW-7800 [Python] implement iter_batches() method for ParquetFile and ParquetReader

jorisvandenbossche commented on pull request #6979:
URL: https://github.com/apache/arrow/pull/6979#issuecomment-681904949


   @wjones1 I tested this branch locally, and can actually reproduce the errors. 
   
   I don't see the behaviour you described above (https://github.com/apache/arrow/pull/6979#issuecomment-659143356), where batch_size is now a maximum number of rows (so that the last batch of a single row group can be smaller). 
   
   When debugging the test with a batch_size of 300 and a chunk_size of 1000, the 4th batch does not have 100 rows (rows 900-1000); it also has 300 rows (rows 900-1200).
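
   To make the two behaviours concrete, here is a minimal sketch in plain Python. It is not the pyarrow implementation: the two helper functions are hypothetical and only compute row ranges, and the total of 2000 rows is an assumption chosen for illustration.

```python
def batches_capped_per_row_group(total_rows, row_group_size, batch_size):
    """Row ranges when a batch never crosses a row group boundary
    (batch_size acts as a maximum, as described in the earlier comment)."""
    ranges = []
    for group_start in range(0, total_rows, row_group_size):
        group_end = min(group_start + row_group_size, total_rows)
        for start in range(group_start, group_end, batch_size):
            ranges.append((start, min(start + batch_size, group_end)))
    return ranges


def batches_spanning_row_groups(total_rows, batch_size):
    """Row ranges when batches are filled across row group boundaries
    (what I observe on this branch)."""
    return [
        (start, min(start + batch_size, total_rows))
        for start in range(0, total_rows, batch_size)
    ]


# Assumed sizes: 2000 rows in total, row groups of 1000, batch_size of 300.
capped = batches_capped_per_row_group(2000, 1000, 300)
spanning = batches_spanning_row_groups(2000, 300)

print(capped[3])    # 4th batch if capped per row group: (900, 1000)
print(spanning[3])  # 4th batch if spanning row groups: (900, 1200)
```

   The first function gives the 100-row 4th batch the test expects; the second gives the 300-row batch I actually see.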

