You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/28 11:49:18 UTC

[GitHub] [arrow] alamb opened a new pull request #9347: ARROW-11414: [Rust] Reduce copies in Schema::try_merge

alamb opened a new pull request #9347:
URL: https://github.com/apache/arrow/pull/9347


   I was looking at this code yesterday while using it in IOx -- https://github.com/influxdata/influxdb_iox/pull/703
   
   ## Rationale:
   Even though `Schema::try_merge` requires a slice of `Schema`s (not schema refs) ownership of its inputs,  it copies all of its fields. This is inefficient ideal in the common case where most of the fields in the merged `Schema` will be the same
   
   ## Changes:
   This PR proposes to change the implementation so that `try_merge` takes something  (like a `Vec`) that can iterate over the Schemas passed in and consume them, avoiding at least one copy per unique field. I intend no algorithmic changes, only performance improvement.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #9347: ARROW-11414: [Rust] Reduce copies in Schema::try_merge

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9347:
URL: https://github.com/apache/arrow/pull/9347#issuecomment-769000869


   https://issues.apache.org/jira/browse/ARROW-11414


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #9347: ARROW-11414: [Rust] Reduce copies in Schema::try_merge

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9347:
URL: https://github.com/apache/arrow/pull/9347#issuecomment-771918785


   >  i.e. if a function takes reference but does internal clones, then it's better to let caller manage the ownership instead.
   
   Yes I like this approach a lot


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on a change in pull request #9347: ARROW-11414: [Rust] Reduce copies in Schema::try_merge

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #9347:
URL: https://github.com/apache/arrow/pull/9347#discussion_r566032725



##########
File path: rust/arrow/src/datatypes.rs
##########
@@ -1836,7 +1836,7 @@ impl Schema {
     /// ```
     /// use arrow::datatypes::*;
     ///
-    /// let merged = Schema::try_merge(&vec![
+    /// let merged = Schema::try_merge(vec![

Review comment:
       this example shows a pretty good example of how I think the usability (as well as performance) of this API is improved by this PR

##########
File path: rust/arrow/src/datatypes.rs
##########
@@ -1836,7 +1836,7 @@ impl Schema {
     /// ```
     /// use arrow::datatypes::*;
     ///
-    /// let merged = Schema::try_merge(&vec![
+    /// let merged = Schema::try_merge(vec![

Review comment:
       The change in this example shows a pretty good example of how I think the usability (as well as performance) of this API is improved by this PR




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #9347: ARROW-11414: [Rust] Reduce copies in Schema::try_merge

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9347:
URL: https://github.com/apache/arrow/pull/9347#issuecomment-771918785


   >  i.e. if a function takes reference but does internal clones, then it's better to let caller manage the ownership instead.
   
   Yes I like this approach a lot


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb closed pull request #9347: ARROW-11414: [Rust] Reduce copies in Schema::try_merge

Posted by GitBox <gi...@apache.org>.
alamb closed pull request #9347:
URL: https://github.com/apache/arrow/pull/9347


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org