You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/07/07 20:07:15 UTC

[GitHub] [arrow] paddyhoran opened a new pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

paddyhoran opened a new pull request #7666:
URL: https://github.com/apache/arrow/pull/7666


   Most libraries that use `Arrow` are likely to want to define types that produce `RecordBatch`'s.  This PR defines the `RecordBatchReader` and `SendableRecordBatchReader` traits and updates other crates (in particular, DataFusion) to use these traits.  In short, I think these traits should be defined in the main arrow crate.
   
   It was mentioned on the JIRA that we could just use a single trait and allow users to implement `Send`/`Sync` as necessary.  The fact the the `schema` method returns a reference counted type makes this hard to do (`Rc` vs `Arc`) so I settled on two different traits.  This seemed like the easiest to do in advance of 1.0, maybe we can improve on this.
   
   As `next` returns `Arrow`'s error type I added a new variant to allow wrapping external errors once they implement the standard `Error` trait and `Send`, `Sync`.  I remove the `PartialEq` and `Clone` bounds on the DataFusion error type as we were only using these in testing (the tests are updates) and it's best to keep the requirements on external implementers as low a possible.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] paddyhoran commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
paddyhoran commented on pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#issuecomment-655100102


   cc @houqp


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] paddyhoran commented on a change in pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
paddyhoran commented on a change in pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#discussion_r453044078



##########
File path: rust/datafusion/src/execution/physical_plan/common.rs
##########
@@ -47,12 +47,12 @@ impl RecordBatchIterator {
     }
 }
 
-impl BatchIterator for RecordBatchIterator {
+impl SendableRecordBatchReader for RecordBatchIterator {
     fn schema(&self) -> Arc<Schema> {

Review comment:
       I'm +0 on this.  I personally like seeing the explicit type signature (once it's not very verbose) but I have no problem with this change.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] houqp commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
houqp commented on pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#issuecomment-657125605


   linter error will be addressed by https://github.com/apache/arrow/pull/7710


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] andygrove commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
andygrove commented on pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#issuecomment-657338473


   Thanks @paddyhoran ... sorry I didn't find the time to review, but I tested this earlier today and it looks good.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] paddyhoran commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
paddyhoran commented on pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#issuecomment-656306027


   cc @maxburke @mcassels you might have opinions on this.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sunchao commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#issuecomment-657268017


   Merged. @andygrove feel free to leave comments in case you had time for a full review on this. 
   
   I tagged this as 1.0.0 but not sure whether it can make it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#issuecomment-655105633


   https://issues.apache.org/jira/browse/ARROW-8559


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] houqp commented on a change in pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
houqp commented on a change in pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#discussion_r451273265



##########
File path: rust/arrow/src/record_batch.rs
##########
@@ -216,15 +216,28 @@ impl Into<StructArray> for RecordBatch {
     }
 }
 
-/// Definition of record batch reader.
+/// Trait for types that can read `RecordBatch`'s in a single-threaded context.
 pub trait RecordBatchReader {
-    /// Returns schemas of this record batch reader.
-    /// Implementation of this trait should guarantee that all record batches returned
-    /// by this reader should have same schema as returned from this method.
+    /// Returns the schema of this `RecordBatchReader`.
+    ///
+    /// Implementation of this trait should guarantee that all `RecordBatch`'s returned by this
+    /// reader should have the same schema as returned from this method.
     fn schema(&mut self) -> SchemaRef;
 
-    /// Returns next record batch.
-    fn next_batch(&mut self) -> Result<Option<RecordBatch>>;
+    /// Reads the next `RecordBatch`.
+    fn next(&mut self) -> Result<Option<RecordBatch>>;
+}
+
+/// Trait for types that can read `RecordBatch`'s in a multi-threaded context.
+pub trait SendableRecordBatchReader: Send + Sync {

Review comment:
       Nitpick here because naming is hard and i don't have a good answer to this either ;P But the sendable prefix could be changed something more accurate since this trait also requires sync.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sunchao commented on a change in pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
sunchao commented on a change in pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#discussion_r452988944



##########
File path: rust/datafusion/src/execution/physical_plan/common.rs
##########
@@ -47,12 +47,12 @@ impl RecordBatchIterator {
     }
 }
 
-impl BatchIterator for RecordBatchIterator {
+impl SendableRecordBatchReader for RecordBatchIterator {
     fn schema(&self) -> Arc<Schema> {

Review comment:
       I suggest we replace this with `SchemaRef` to hide underlying impl details.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] paddyhoran commented on a change in pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
paddyhoran commented on a change in pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#discussion_r451719340



##########
File path: rust/datafusion/src/datasource/datasource.rs
##########
@@ -20,13 +20,13 @@
 use std::sync::{Arc, Mutex};
 
 use arrow::datatypes::Schema;
+use arrow::record_batch::SendableRecordBatchReader;
 
 use crate::error::Result;
-use crate::execution::physical_plan::BatchIterator;
 
-/// Returned by implementors of `Table#scan`, this `BatchIterator` is wrapped with
+/// Returned by implementors of `Table#scan`, this `SendableRecordBatchReader` is wrapped with
 /// an `Arc` and `Mutex` so that it can be shared across threads as it is used.
-pub type ScanResult = Arc<Mutex<dyn BatchIterator>>;
+pub type ScanResult = Arc<Mutex<dyn SendableRecordBatchReader>>;

Review comment:
       Yea I was wondering the same thing, but it's a little out of scope for this PR.  At least the new name kinda highlights this, we can address as a follow up PR.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] paddyhoran commented on a change in pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
paddyhoran commented on a change in pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#discussion_r451717658



##########
File path: rust/arrow/src/record_batch.rs
##########
@@ -216,15 +216,28 @@ impl Into<StructArray> for RecordBatch {
     }
 }
 
-/// Definition of record batch reader.
+/// Trait for types that can read `RecordBatch`'s in a single-threaded context.
 pub trait RecordBatchReader {
-    /// Returns schemas of this record batch reader.
-    /// Implementation of this trait should guarantee that all record batches returned
-    /// by this reader should have same schema as returned from this method.
+    /// Returns the schema of this `RecordBatchReader`.
+    ///
+    /// Implementation of this trait should guarantee that all `RecordBatch`'s returned by this
+    /// reader should have the same schema as returned from this method.
     fn schema(&mut self) -> SchemaRef;
 
-    /// Returns next record batch.
-    fn next_batch(&mut self) -> Result<Option<RecordBatch>>;
+    /// Reads the next `RecordBatch`.
+    fn next(&mut self) -> Result<Option<RecordBatch>>;
+}
+
+/// Trait for types that can read `RecordBatch`'s in a multi-threaded context.
+pub trait SendableRecordBatchReader: Send + Sync {

Review comment:
       I know I wasn't sure what to go with here, maybe `SendSyncRecordBatchReader`?  It's a bit verbose but probably the most accurate?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] paddyhoran commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
paddyhoran commented on pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#issuecomment-656856701


   > I think the `schema` method is already returning a `Arc<Schema>`, where is the `Rc` being used? IMHO if it is not too difficult to do we should consolidate on a single trait now instead of having to change it later again.
   
   It did occur to me that we could just have a single trait that uses `Arc` and not bother with the `Rc` version.  I'm not sure what the performance of `Arc` is versus `Rc`.  Obviously, there has to be some difference but is this important in the context of Arrow?  I'm up for converting to one trait and using `Arc`.  If users request it then we can add the `Rc` version later.  What do others think?
   
   The `Rc` version is used in a few places but again I don't think it's that important to use `Rc<Schema>` over `Arc<Schema>`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] paddyhoran commented on a change in pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
paddyhoran commented on a change in pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#discussion_r452371425



##########
File path: rust/arrow/src/record_batch.rs
##########
@@ -216,15 +216,28 @@ impl Into<StructArray> for RecordBatch {
     }
 }
 
-/// Definition of record batch reader.
+/// Trait for types that can read `RecordBatch`'s in a single-threaded context.
 pub trait RecordBatchReader {
-    /// Returns schemas of this record batch reader.
-    /// Implementation of this trait should guarantee that all record batches returned
-    /// by this reader should have same schema as returned from this method.
+    /// Returns the schema of this `RecordBatchReader`.
+    ///
+    /// Implementation of this trait should guarantee that all `RecordBatch`'s returned by this
+    /// reader should have the same schema as returned from this method.
     fn schema(&mut self) -> SchemaRef;
 
-    /// Returns next record batch.
-    fn next_batch(&mut self) -> Result<Option<RecordBatch>>;
+    /// Reads the next `RecordBatch`.
+    fn next(&mut self) -> Result<Option<RecordBatch>>;
+}
+
+/// Trait for types that can read `RecordBatch`'s in a multi-threaded context.
+pub trait SendableRecordBatchReader: Send + Sync {

Review comment:
       I'll go with `SendSync` as it's the most explicit.  We can then see if there is further feedback from users.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] houqp commented on a change in pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
houqp commented on a change in pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#discussion_r451278084



##########
File path: rust/arrow/src/record_batch.rs
##########
@@ -216,15 +216,28 @@ impl Into<StructArray> for RecordBatch {
     }
 }
 
-/// Definition of record batch reader.
+/// Trait for types that can read `RecordBatch`'s in a single-threaded context.
 pub trait RecordBatchReader {
-    /// Returns schemas of this record batch reader.
-    /// Implementation of this trait should guarantee that all record batches returned
-    /// by this reader should have same schema as returned from this method.
+    /// Returns the schema of this `RecordBatchReader`.
+    ///
+    /// Implementation of this trait should guarantee that all `RecordBatch`'s returned by this
+    /// reader should have the same schema as returned from this method.
     fn schema(&mut self) -> SchemaRef;
 
-    /// Returns next record batch.
-    fn next_batch(&mut self) -> Result<Option<RecordBatch>>;
+    /// Reads the next `RecordBatch`.
+    fn next(&mut self) -> Result<Option<RecordBatch>>;

Review comment:
       If we are to introduce breaking change in this patch set, I think it's better to use `next_batch` or other method name instead of `next` everywhere so it's easier for users to implement Iterator trait for the same type if they want.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] paddyhoran commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
paddyhoran commented on pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#issuecomment-657109850


   Ok, I think we are all in agreement. Thanks.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sunchao commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#issuecomment-656889046


   > The Rc version is used in a few places but again I don't think it's that important to use Rc<Schema> over Arc<Schema>.
   
   Agreed. I don't see any performance implication for this, as `schema()` shouldn't appear on hot paths.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] andygrove commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
andygrove commented on pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#issuecomment-655677147


   I've only skimmed through the changes but LGTM overall. I will try and find time this weekend for a full review. Thanks @paddyhoran I think this is a great improvement.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sunchao closed pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
sunchao closed pull request #7666:
URL: https://github.com/apache/arrow/pull/7666


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] paddyhoran commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
paddyhoran commented on pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#issuecomment-657110158


   I adopted `SchemaRef` throughout the codebase also.  I figure if we are going to use `SchemaRef` in some places we might as well adopt it universally.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] paddyhoran commented on a change in pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
paddyhoran commented on a change in pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#discussion_r451719804



##########
File path: rust/arrow/src/record_batch.rs
##########
@@ -216,15 +216,28 @@ impl Into<StructArray> for RecordBatch {
     }
 }
 
-/// Definition of record batch reader.
+/// Trait for types that can read `RecordBatch`'s in a single-threaded context.
 pub trait RecordBatchReader {
-    /// Returns schemas of this record batch reader.
-    /// Implementation of this trait should guarantee that all record batches returned
-    /// by this reader should have same schema as returned from this method.
+    /// Returns the schema of this `RecordBatchReader`.
+    ///
+    /// Implementation of this trait should guarantee that all `RecordBatch`'s returned by this
+    /// reader should have the same schema as returned from this method.
     fn schema(&mut self) -> SchemaRef;
 
-    /// Returns next record batch.
-    fn next_batch(&mut self) -> Result<Option<RecordBatch>>;
+    /// Reads the next `RecordBatch`.
+    fn next(&mut self) -> Result<Option<RecordBatch>>;

Review comment:
       Yea, that's probably a good point, I'll change to `next_batch`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] houqp commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
houqp commented on pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#issuecomment-656943172


   +1 on unifying to Arc for the initial release given it's not on the hot path. It's better to start simple and optimize later. Arc's atomic reference counting tend to mess with compiler optimization and CPU cacheline, so it definitely has extra runtime overhead. But it shouldn't matter for schema. If get schema turned out to be on the hot path, then we have a bigger problem to fix :P I can't think of any reason why that should be the case.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] houqp commented on a change in pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
houqp commented on a change in pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#discussion_r451273911



##########
File path: rust/datafusion/src/datasource/datasource.rs
##########
@@ -20,13 +20,13 @@
 use std::sync::{Arc, Mutex};
 
 use arrow::datatypes::Schema;
+use arrow::record_batch::SendableRecordBatchReader;
 
 use crate::error::Result;
-use crate::execution::physical_plan::BatchIterator;
 
-/// Returned by implementors of `Table#scan`, this `BatchIterator` is wrapped with
+/// Returned by implementors of `Table#scan`, this `SendableRecordBatchReader` is wrapped with
 /// an `Arc` and `Mutex` so that it can be shared across threads as it is used.
-pub type ScanResult = Arc<Mutex<dyn BatchIterator>>;
+pub type ScanResult = Arc<Mutex<dyn SendableRecordBatchReader>>;

Review comment:
       Do we still need to wrap it with Arc and Mutex when `SendableRecordBatchReader` already implements Send and Sync? It looks like `ScanResult` can just be an alias to `SendableRecordBatchReader` or `Arc<Mutex<dyn RecordBatchReader>>`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sunchao commented on pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#issuecomment-656802087


   Thanks @paddyhoran .
   
   > It was mentioned on the JIRA that we could just use a single trait and allow users to implement Send/Sync as necessary. The fact the the schema method returns a reference counted type makes this hard to do (Rc vs Arc) so I settled on two different traits. This seemed like the easiest to do in advance of 1.0, maybe we can improve on this.
   
   I think the `schema` method is already returning a `Arc<Schema>`, where is the `Rc` being used? IMHO if it is not too difficult to do we should consolidate on a single trait now instead of having to change it later again.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] houqp commented on a change in pull request #7666: ARROW-8559: [Rust] Consolidate Record Batch reader traits in main arrow crate

Posted by GitBox <gi...@apache.org>.
houqp commented on a change in pull request #7666:
URL: https://github.com/apache/arrow/pull/7666#discussion_r451955910



##########
File path: rust/arrow/src/record_batch.rs
##########
@@ -216,15 +216,28 @@ impl Into<StructArray> for RecordBatch {
     }
 }
 
-/// Definition of record batch reader.
+/// Trait for types that can read `RecordBatch`'s in a single-threaded context.
 pub trait RecordBatchReader {
-    /// Returns schemas of this record batch reader.
-    /// Implementation of this trait should guarantee that all record batches returned
-    /// by this reader should have same schema as returned from this method.
+    /// Returns the schema of this `RecordBatchReader`.
+    ///
+    /// Implementation of this trait should guarantee that all `RecordBatch`'s returned by this
+    /// reader should have the same schema as returned from this method.
     fn schema(&mut self) -> SchemaRef;
 
-    /// Returns next record batch.
-    fn next_batch(&mut self) -> Result<Option<RecordBatch>>;
+    /// Reads the next `RecordBatch`.
+    fn next(&mut self) -> Result<Option<RecordBatch>>;
+}
+
+/// Trait for types that can read `RecordBatch`'s in a multi-threaded context.
+pub trait SendableRecordBatchReader: Send + Sync {

Review comment:
       I like `SendSync`, same length as Sendable, but more accurate. `Threaded` is another potential candidate as well. I wish there is a shorter word that conveys the same meaning as `Concurrent` :D




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org