Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/07/28 03:37:05 UTC

[GitHub] [arrow-rs] Ted-Jiang commented on issue #2197: ArrayReader::skip_records API

Ted-Jiang commented on issue #2197:
URL: https://github.com/apache/arrow-rs/issues/2197#issuecomment-1197617137

   Yes, I agree this needs improvement before we make the API public.
   
   > Much like RecordReader we need to separate read_records from consuming the resulting data, i.e. replace ArrayReader::next_batch with ArrayReader::read_records and ArrayReader::consume_batch.
   
   I think you mean that we can call `read_records` multiple times until there are enough values in the buffer, and then call `consume_batch`. This would avoid small batches (currently, if `selection_len` is less than `batch_size`, a batch with only `selection_len` rows is returned).
   
   How about putting this combining logic in `impl Iterator for ParquetRecordBatchReader`? If we call `read_records` multiple times, the number of calls should depend on the `selections`, so why not add a loop in the Iterator that feeds enough rows into the result batch 🤔
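
   A rough sketch of the loop I have in mind, using toy stand-ins rather than the real arrow-rs types. The method names `read_records`, `skip_records` and `consume_batch` follow the ones discussed in this issue; `RowSelection`, `ToyArrayReader`, `ToyBatchReader` and `collect_batches` are hypothetical names for illustration only, and the real reader would work on Arrow arrays and return `Result`s instead of plain `Vec<i32>`.

```rust
use std::collections::VecDeque;

/// One run of rows to select or skip (hypothetical, for illustration).
#[derive(Clone, Copy)]
struct RowSelection {
    row_count: usize,
    skip: bool,
}

/// Toy column reader that separates reading rows from consuming them.
struct ToyArrayReader {
    data: Vec<i32>,
    pos: usize,
    buffer: Vec<i32>,
}

impl ToyArrayReader {
    /// Buffer up to `n` rows; returns how many rows were actually read.
    fn read_records(&mut self, n: usize) -> usize {
        let end = (self.pos + n).min(self.data.len());
        let read = end - self.pos;
        self.buffer.extend_from_slice(&self.data[self.pos..end]);
        self.pos = end;
        read
    }

    /// Advance past up to `n` rows without buffering them.
    fn skip_records(&mut self, n: usize) -> usize {
        let end = (self.pos + n).min(self.data.len());
        let skipped = end - self.pos;
        self.pos = end;
        skipped
    }

    /// Hand out everything buffered so far and reset the buffer.
    fn consume_batch(&mut self) -> Vec<i32> {
        std::mem::take(&mut self.buffer)
    }
}

/// Toy batch reader: the Iterator impl owns the "combine" loop.
struct ToyBatchReader {
    reader: ToyArrayReader,
    selections: VecDeque<RowSelection>,
    batch_size: usize,
}

impl Iterator for ToyBatchReader {
    type Item = Vec<i32>;

    fn next(&mut self) -> Option<Vec<i32>> {
        let mut buffered = 0;
        // Keep feeding selections until the batch is full or nothing is left.
        while buffered < self.batch_size {
            let mut sel = match self.selections.pop_front() {
                Some(sel) => sel,
                None => break,
            };
            if sel.skip {
                self.reader.skip_records(sel.row_count);
                continue;
            }
            let want = sel.row_count.min(self.batch_size - buffered);
            let read = self.reader.read_records(want);
            buffered += read;
            if read < want {
                // Underlying data ran out before the selection did.
                self.selections.clear();
                break;
            }
            if sel.row_count > read {
                // Batch is full; keep the rest of this selection for later.
                sel.row_count -= read;
                self.selections.push_front(sel);
            }
        }
        if buffered == 0 {
            None
        } else {
            Some(self.reader.consume_batch())
        }
    }
}

/// Helper that drives the iterator to completion.
fn collect_batches(
    data: Vec<i32>,
    selections: Vec<RowSelection>,
    batch_size: usize,
) -> Vec<Vec<i32>> {
    let reader = ToyArrayReader { data, pos: 0, buffer: Vec::new() };
    ToyBatchReader { reader, selections: selections.into(), batch_size }.collect()
}
```

   With rows `0..10`, a `batch_size` of 4, and the selections "select 2, skip 3, select 2, skip 1, select 2", this yields `[0, 1, 5, 6]` and then `[8, 9]`, instead of three batches of 2 rows each.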


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org