Posted to github@arrow.apache.org by "metesynnada (via GitHub)" <gi...@apache.org> on 2023/02/23 11:35:56 UTC

[GitHub] [arrow-rs] metesynnada commented on issue #3740: Support for Async CSV Writer

metesynnada commented on issue #3740:
URL: https://github.com/apache/arrow-rs/issues/3740#issuecomment-1441609389

   > I wonder if this could be achieved by simply writing a batch to an in-memory Vec using the current "blocking" writer, and then flushing the output to an async output. This would be more flexible, and likely significantly faster than an approach that integrates async at a lower level.
   
   I think this requires creating a new "blocking" writer for every record batch, since the writer owns the in-memory `Vec`. The API for reaching the internal buffer is `self.writer.into_inner()`, which uses `mem::replace(self, None)`.
   
   I couldn't think of a way to keep ownership of the buffer while writing with the usual `Writer`. Do you have any idea how I could code that?
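
   One possible way to keep a handle on the buffer is sketched below using only the standard library. It assumes the blocking writer accepts any `W: Write`. `SharedBuffer` and `write_batches` are hypothetical names, `BufWriter` is only a stand-in for the blocking CSV writer, and string rows stand in for record batches. The writer owns one clone of the buffer while the caller keeps another, so the bytes can be drained after each batch without calling `into_inner()`.

```rust
use std::io::{self, BufWriter, Write};
use std::sync::{Arc, Mutex};

/// Hypothetical cloneable in-memory sink: every clone appends to the same Vec<u8>.
#[derive(Clone, Default)]
struct SharedBuffer {
    inner: Arc<Mutex<Vec<u8>>>,
}

impl SharedBuffer {
    /// Take the bytes written so far, leaving an empty Vec behind.
    fn take(&self) -> Vec<u8> {
        let mut guard = self.inner.lock().unwrap();
        std::mem::take(&mut *guard)
    }
}

impl Write for SharedBuffer {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        self.inner.lock().unwrap().extend_from_slice(data);
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        // Nothing to do: the bytes already live in the shared Vec.
        Ok(())
    }
}

/// Hypothetical driver. Each inner Vec stands in for one record batch.
/// Returns one chunk of serialized bytes per batch.
fn write_batches(batches: &[Vec<&str>]) -> io::Result<Vec<Vec<u8>>> {
    let buffer = SharedBuffer::default();
    // Assumption: BufWriter plays the role of the blocking CSV writer.
    // It is created once and owns one handle to the shared buffer.
    let mut writer = BufWriter::new(buffer.clone());
    let mut chunks = Vec::new();

    for batch in batches {
        // CPU-bound serialization stays synchronous.
        for row in batch {
            writeln!(writer, "{row}")?;
        }
        // Push anything the writer buffered internally into the shared Vec.
        writer.flush()?;
        // Drain through the second handle. In async code this chunk would be
        // handed to the async output instead of being collected.
        chunks.push(buffer.take());
    }
    Ok(chunks)
}
```

   `Arc<Mutex<..>>` is used here so that the handle can be moved across tasks; in strictly single-threaded code `Rc<RefCell<..>>` would serve the same purpose. Whether a particular writer flushes everything it holds when `flush()` is called needs to be checked for that writer.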
   
   By the way, I verified the performance degradation. I agree with you that CPU-bound computations like serialization shouldn't be async, since there is no gain. I am trying to make only the IO-bound operation (flush) async, as you suggested.
   
   

