Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/15 15:02:57 UTC

[GitHub] [arrow] wjones127 commented on pull request #12627: ARROW-15860: [Python] Document RecordBatchReader

wjones127 commented on pull request #12627:
URL: https://github.com/apache/arrow/pull/12627#issuecomment-1068090917


   > I think it would be good to update the docstring of RecordBatchReader a bit, if we are going to include it in the online docs (well, updating it is useful anyway).
   
   Yeah, I likely need to provide an example of iterating over it. Is there anything else you want to see?
   
   > we should indeed document it how it is exposed publicly, so pyarrow.ipc. But isn't that actually a bit a strange place to expose this? As purely the RecordBatchReader has not really anything to do with IPC? (its subclasses RecordBatchStream/FileReader do, but now we use the base class directly as well, it might make sense exposing this in the top-level namespace instead?)
   
   Yes, my understanding is that `RecordBatchReader` is a type returned by a variety of functions in PyArrow. It's currently returned from `pyarrow.dataset.Scanner.to_reader()`, various Flight readers, and some IPC methods. There are also functions like `write_dataset()` that take a `RecordBatchReader` as an argument.
   
   I wanted to document it so that other libraries know how to consume it.
   
   Would it be possible to move it out of the lib module?

