You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "alexandreyc (via GitHub)" <gi...@apache.org> on 2023/05/16 17:28:11 UTC

[GitHub] [arrow-rs] alexandreyc opened a new pull request, #4228: [Alternative] Add finish method to RecordBatchWriter trait

alexandreyc opened a new pull request, #4228:
URL: https://github.com/apache/arrow-rs/pull/4228

   # Which issue does this PR close?
   
   This PR does not close any particular issue. It is an alternative implementation of #4221 
   
   # Rationale for this change
    
   In #4221, the `finish` method takes `&mut self` whereas in this PR it takes `self`.
   
   We are forced to use fully qualified method calls for `finish` where we want to call `finish` on the object itself because `RecordBatchWriter` is in scope (because we `use arrow_array::*`). What's your opinion on importing only needed items from `arrow_array` to avoid this?
   
   # What changes are included in this PR?
   
   Add `finish` method to `RecordBatchWriter` and implement it for CSV, JSON, IPC and Parquet writers.
   
   # Are there any user-facing changes?
   
   According to my analysis no breaking change is introduced in this PR.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on a diff in pull request #4228: [Alternative] Add finish method to RecordBatchWriter trait

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on code in PR #4228:
URL: https://github.com/apache/arrow-rs/pull/4228#discussion_r1195486370


##########
arrow-array/src/record_batch.rs:
##########
@@ -47,6 +47,9 @@ pub trait RecordBatchReader: Iterator<Item = Result<RecordBatch, ArrowError>> {
 pub trait RecordBatchWriter {
     /// Write a single batch to the writer.
     fn write(&mut self, batch: &RecordBatch) -> Result<(), ArrowError>;
+
+    /// Write footer or termination data, then mark the writer as done.
+    fn finish(self) -> Result<(), ArrowError>;

Review Comment:
   ```suggestion
       fn close(self) -> Result<(), ArrowError>;
   ```
   
   Perhaps, this might avoid the name collisions?



##########
parquet/src/arrow/arrow_writer/mod.rs:
##########
@@ -250,6 +250,10 @@ impl<W: Write> RecordBatchWriter for ArrowWriter<W> {
     fn write(&mut self, batch: &RecordBatch) -> Result<(), ArrowError> {
         self.write(batch).map_err(|e| e.into())
     }
+
+    fn finish(self) -> std::result::Result<(), ArrowError> {
+        self.close().map(|_| ()).map_err(ArrowError::from)

Review Comment:
   ```suggestion
           self.close()?;
           Ok(())
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] alexandreyc commented on pull request #4228: Add finish method to RecordBatchWriter trait

Posted by "alexandreyc (via GitHub)" <gi...@apache.org>.
alexandreyc commented on PR #4228:
URL: https://github.com/apache/arrow-rs/pull/4228#issuecomment-1550181253

   Should be good now! Thank you for your review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on pull request #4228: Add close method to RecordBatchWriter trait

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on PR #4228:
URL: https://github.com/apache/arrow-rs/pull/4228#issuecomment-1550195503

   For those following along, RecordBatchWriter hasn't yet been released so not marking as a breaking change


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold merged pull request #4228: Add close method to RecordBatchWriter trait

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold merged PR #4228:
URL: https://github.com/apache/arrow-rs/pull/4228


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org