Posted to github@arrow.apache.org by "Dandandan (via GitHub)" <gi...@apache.org> on 2023/04/19 11:45:07 UTC

[GitHub] [arrow-datafusion] Dandandan commented on issue #6050: Avoid RecordBatch clones during MemoryExec and MemoryStream construction

Dandandan commented on issue #6050:
URL: https://github.com/apache/arrow-datafusion/issues/6050#issuecomment-1514588857

   It doesn't really clone the data, right? It clones the `RecordBatch`, which shouldn't be that expensive: the schema and every column sit behind an `Arc`, so cloning the `Vec` of columns is probably the most expensive part when there are many columns.
   ```rust
   pub struct RecordBatch {
       schema: SchemaRef,
       columns: Vec<Arc<dyn Array>>,
   
       /// The number of rows in this RecordBatch
       ///
       /// This is stored separately from the columns to handle the case of no columns
       row_count: usize,
   }
   ```
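
To make the cost concrete, here is a minimal, std-only sketch of what such a clone does. `Batch`, `Int64Column`, `make_batch`, and `shares_buffers` are hypothetical stand-ins that only mirror the field layout quoted above; they are not the arrow-rs types. Under that assumption, cloning allocates one new `Vec` of pointers and bumps one reference count per column, while the column values themselves stay shared.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an Arrow array (not the arrow-rs trait).
trait Array {
    fn len(&self) -> usize;
}

/// Hypothetical column type that owns its values on the heap.
struct Int64Column(Vec<i64>);

impl Array for Int64Column {
    fn len(&self) -> usize {
        self.0.len()
    }
}

/// Simplified stand-in mirroring the field layout of the struct quoted above.
#[derive(Clone)]
struct Batch {
    schema: Arc<Vec<String>>,
    columns: Vec<Arc<dyn Array>>,
    row_count: usize,
}

/// Builds a batch with `num_columns` columns of `num_rows` zeros each.
fn make_batch(num_columns: usize, num_rows: usize) -> Batch {
    let names: Vec<String> = (0..num_columns).map(|i| format!("c{i}")).collect();
    let columns: Vec<Arc<dyn Array>> = (0..num_columns)
        .map(|_| Arc::new(Int64Column(vec![0; num_rows])) as Arc<dyn Array>)
        .collect();
    Batch {
        schema: Arc::new(names),
        columns,
        row_count: num_rows,
    }
}

/// Returns true if every column in `a` points at the same allocation as the
/// corresponding column in `b`, i.e. no column data was copied.
fn shares_buffers(a: &Batch, b: &Batch) -> bool {
    a.columns.len() == b.columns.len()
        && a.columns
            .iter()
            .zip(&b.columns)
            .all(|(x, y)| Arc::ptr_eq(x, y))
}
```

With these definitions, `let copy = batch.clone();` followed by `shares_buffers(&batch, &copy)` lets you check that the clone reuses the original column allocations. The work done by the clone grows with the number of columns, not with the number of rows.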

