You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/02 09:49:46 UTC

[GitHub] [arrow] jorgecarleitao opened a new pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

jorgecarleitao opened a new pull request #9076:
URL: https://github.com/apache/arrow/pull/9076


   This PR refactors `MutableBuffer::extend_from_slice` to remove the need to use `to_byte_slice` on every call, thereby removing its level of indirection, that does not allow the compiler to optimize out some code.
   
   This is the second performance improvement originally presented in #8796 and, together with #9027 , brings the performance of "MutableBuffer" to the same level as `Vec<u8>`, in particular to building buffers on the fly.
   
   Basically, when converting to a byte slice `&[u8]`, the compiler lost the type size information, and thus needs to perform extra checks and can't just optimize out the code.
   
   This PR adopts the same API as `Vec<T>::extend_from_slice`, but since our buffers are in `u8` (i.e. a la `Vec<u8>`), I made the signature
   
   ```
   pub fn extend_from_slice<T: ToByteSlice>(&mut self, items: &[T])
   pub fn push<T: ToByteSlice>(&mut self, item: &T)
   ```
   
   i.e. it consumes something that can be converted to a byte slice, but internally makes the conversion to bytes (as `to_byte_slice` was doing).
   
   Credits for the root cause analysis that lead to this PR go to @Dandandan, [originally fielded here](https://github.com/apache/arrow/pull/9016#discussion_r549110164).
   
   > [...] current conversion to a byte slice may add some overhead? - @Dandandan


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
Dandandan commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753458832


   Great! Besides being a speed up, being able to `push`.
   
   I did some quick benchmarking of your branch on DataFusion - I see a speed up of ~5% on q1 / q12. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb edited a comment on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
alamb edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-758263983


   FWIW given the "touch most of the codepaths" property of this PR,  it might be good to wait to merge this PR until after 3.0 is shipped (in the next few days) in case it introduces any subtle new bugs we would have longer to shake them out


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550932922



##########
File path: rust/arrow/src/buffer.rs
##########
@@ -863,44 +866,49 @@ impl MutableBuffer {
 
     /// View buffer as typed slice.
     pub fn typed_data_mut<T: ArrowNativeType>(&mut self) -> &mut [T] {
-        assert_eq!(self.len() % mem::size_of::<T>(), 0);
-        assert!(memory::is_ptr_aligned::<T>(self.data.cast()));
-        // JUSTIFICATION
-        //  Benefit
-        //      Many of the buffers represent specific types, and consumers of `Buffer` often need to re-interpret them.
-        //  Soundness
-        //      * The pointer is non-null by construction
-        //      * alignment asserted above
         unsafe {
-            std::slice::from_raw_parts_mut(
-                self.as_ptr() as *mut T,
-                self.len() / mem::size_of::<T>(),
-            )
+            let (prefix, offsets, suffix) = self.as_slice_mut().align_to_mut::<T>();
+            assert!(prefix.is_empty() && suffix.is_empty());
+            offsets
         }
     }
 
-    /// Extends the buffer from a byte slice, incrementing its capacity if needed.
+    /// Extends the buffer from a slice, increasing its capacity if needed.
     #[inline]
-    pub fn extend_from_slice(&mut self, bytes: &[u8]) {
-        let new_len = self.len + bytes.len();
+    pub fn extend_from_slice<T: ToByteSlice>(&mut self, items: &[T]) {
+        let additional = items.len() * std::mem::size_of::<T>();
+        let new_len = self.len + additional;
         if new_len > self.capacity {
-            self.reserve(new_len);
+            self.reserve(additional);
         }
         unsafe {
-            let dst = NonNull::new_unchecked(self.data.as_ptr().add(self.len));
-            let src = NonNull::new_unchecked(bytes.as_ptr() as *mut u8);
-            memory::memcpy(dst, src, bytes.len());
+            let dst = self.data.as_ptr().add(self.len) as *mut T;
+            let src = items.as_ptr() as *const T;
+            std::ptr::copy_nonoverlapping(src, dst, items.len())
         }
         self.len = new_len;
     }
 
-    /// Extends the buffer by `len` with all bytes equal to `0u8`, incrementing its capacity if needed.
-    pub fn extend(&mut self, len: usize) {
-        let remaining_capacity = self.capacity - self.len;
-        if len > remaining_capacity {
-            self.reserve(self.len + len);
+    /// Extends the buffer with a new item, increasing its capacity if needed.
+    #[inline]
+    pub fn push<T: ToByteSlice>(&mut self, item: T) {
+        let additional = std::mem::size_of::<T>();
+        let new_len = self.len + additional;
+        if new_len > self.capacity {
+            self.reserve(additional);
+        }
+        unsafe {
+            let dst = self.data.as_ptr().add(self.len) as *mut T;
+            let src = (&item) as *const T;
+            std::ptr::copy_nonoverlapping(src, dst, 1)

Review comment:
       Is this the same as `ptr::write(dst, item)`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-763032677


   🎉 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
   > Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (1548011) into [master](https://codecov.io/gh/apache/arrow/commit/9fe375434ca4f90ea7cbafcbadd0e0250270105d?el=desc) (9fe3754) will **decrease** coverage by `0.05%`.
   > The diff coverage is `94.48%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9076      +/-   ##
   ==========================================
   - Coverage   82.61%   82.56%   -0.06%     
   ==========================================
     Files         203      203              
     Lines       49942    49872      -70     
   ==========================================
   - Hits        41259    41175      -84     
   - Misses       8683     8697      +14     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
   | [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
   | [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `75.02% <ø> (ø)` | |
   | [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
   | [rust/arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL21vZC5ycw==) | `88.29% <83.33%> (ø)` | |
   | [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
   | [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `96.75% <91.52%> (-1.46%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.54% <100.00%> (ø)` | |
   | [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
   | ... and [24 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [9fe3754...1548011](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-763033846


   Well, it broke master... 🤣 @Dandandan already has a fix https://github.com/apache/arrow/pull/9269 💯 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550910490



##########
File path: rust/arrow/src/buffer.rs
##########
@@ -852,17 +870,34 @@ impl MutableBuffer {
         }
     }
 
-    /// Extends the buffer from a byte slice, incrementing its capacity if needed.
+    /// Extends the buffer from a slice, increasing its capacity if needed.
+    #[inline]
+    pub fn extend_from_slice<T: ToByteSlice>(&mut self, items: &[T]) {
+        let additional = items.len() * std::mem::size_of::<T>();
+        let new_len = self.len + additional;
+        if new_len > self.capacity {
+            self.reserve(additional);
+        }
+        unsafe {
+            let dst = self.data.as_ptr().add(self.len) as *mut T;
+            let src = items.as_ptr() as *const T;
+            std::ptr::copy_nonoverlapping(src, dst, items.len())
+        }
+        self.len = new_len;
+    }
+
+    /// Extends the buffer with a new item, increasing its capacity if needed.
     #[inline]
-    pub fn extend_from_slice(&mut self, bytes: &[u8]) {
-        let additional = bytes.len();
-        if self.len + additional > self.capacity() {
+    pub fn push<T: ToByteSlice>(&mut self, item: &T) {
+        let additional = std::mem::size_of::<T>();
+        let new_len = self.len + additional;
+        if new_len > self.capacity {

Review comment:
       I though about that also, but it requires more investigations. We would need to make `MutableBuffer` a generic over `T` and that has implications in our ability to build generic structures.
   
   For example, `builders::FieldData` relies on a generic `MutableBuffer`, and `MutableArrayData` also.
   
   Note that growing and freezing a `MutableBuffer` is now faster than `Vec<u32>` to store `u32`. We do still have a performance problem on which growing a `MutableBuffer` is faster than pre-reserving it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
   > Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (fb52aa2) into [master](https://codecov.io/gh/apache/arrow/commit/4b7cdcb9220b6d94b251aef32c21ef9b4097ecfa?el=desc) (4b7cdcb) will **decrease** coverage by `0.05%`.
   > The diff coverage is `94.48%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9076      +/-   ##
   ==========================================
   - Coverage   82.61%   82.56%   -0.06%     
   ==========================================
     Files         203      203              
     Lines       50140    50070      -70     
   ==========================================
   - Hits        41422    41338      -84     
   - Misses       8718     8732      +14     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
   | [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
   | [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `75.02% <ø> (ø)` | |
   | [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
   | [rust/arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL21vZC5ycw==) | `88.29% <83.33%> (ø)` | |
   | [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
   | [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `96.29% <91.52%> (-1.44%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.54% <100.00%> (ø)` | |
   | [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
   | ... and [24 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [4b7cdcb...fb52aa2](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753465093


   @Dandandan , could you run this again? I rebased it against the PR with the one that removes the zero alloc. I see a 20% improv on the buffer_create bench, but the datafusion is the real deal :)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-758263983


   FWIW it might be good to wait to merge this PR until after 3.0 is shipped in case it introduces any subtle new bugs we would have longer to shake the out


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
   > Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (b9cd8fb) into [master](https://codecov.io/gh/apache/arrow/commit/4b7cdcb9220b6d94b251aef32c21ef9b4097ecfa?el=desc) (4b7cdcb) will **decrease** coverage by `0.06%`.
   > The diff coverage is `94.35%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9076      +/-   ##
   ==========================================
   - Coverage   82.61%   82.55%   -0.07%     
   ==========================================
     Files         203      204       +1     
     Lines       50140    50099      -41     
   ==========================================
   - Hits        41422    41357      -65     
   - Misses       8718     8742      +24     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
   | [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
   | [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `75.02% <ø> (ø)` | |
   | [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
   | [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
   | [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `95.98% <89.09%> (-1.74%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.54% <100.00%> (ø)` | |
   | [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_string.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfc3RyaW5nLnJz) | `89.94% <100.00%> (-0.22%)` | :arrow_down: |
   | ... and [33 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [4b7cdcb...b9cd8fb](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550881085



##########
File path: rust/arrow/src/buffer.rs
##########
@@ -44,18 +44,35 @@ use crate::util::bit_util::ceil;
 #[cfg(any(feature = "simd", feature = "avx512"))]
 use std::borrow::BorrowMut;
 
-/// Buffer is a contiguous memory region of fixed size and is aligned at a 64-byte
-/// boundary. Buffer is immutable.
+/// Buffer represents a contiguous memory region that can be shared with other buffers and across
+/// thread boundaries.
 #[derive(Clone, PartialEq, Debug)]
 pub struct Buffer {
-    /// Reference-counted pointer to the internal byte buffer.
+    /// the internal byte buffer.
     data: Arc<Bytes>,
 
     /// The offset into the buffer.
     offset: usize,
 }
 
 impl Buffer {
+    /// Initializes a [Buffer] from a slice of items.
+    pub fn from_slice_ref<U: ArrowNativeType, T: AsRef<[U]>>(items: &T) -> Self {
+        // allocate aligned memory buffer
+        let slice = items.as_ref();
+        let len = slice.len() * std::mem::size_of::<U>();
+        let capacity = bit_util::round_upto_multiple_of_64(len);
+        let buffer = memory::allocate_aligned(capacity);
+        unsafe {

Review comment:
       Maybe document a justification for the unsafe here? :) 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
Dandandan commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753468173


   This PR:
   ```
   Query 1 iteration 0 took 608.4 ms
   Query 1 iteration 1 took 602.8 ms
   Query 1 iteration 2 took 601.4 ms
   Query 1 iteration 3 took 604.6 ms
   Query 1 iteration 4 took 604.0 ms
   Query 1 iteration 5 took 603.6 ms
   Query 1 iteration 6 took 605.5 ms
   Query 1 iteration 7 took 610.7 ms
   Query 1 iteration 8 took 607.5 ms
   Query 1 iteration 9 took 607.1 ms
   Query 1 avg time: 605.55 ms
   ```
   
   Master:
   ```
   Query 1 iteration 0 took 642.9 ms
   Query 1 iteration 1 took 640.5 ms
   Query 1 iteration 2 took 638.1 ms
   Query 1 iteration 3 took 639.8 ms
   Query 1 iteration 4 took 641.9 ms
   Query 1 iteration 5 took 646.4 ms
   Query 1 iteration 6 took 638.6 ms
   Query 1 iteration 7 took 640.6 ms
   Query 1 iteration 8 took 653.0 ms
   Query 1 iteration 9 took 648.6 ms
   Query 1 avg time: 643.04 ms
   ```
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao closed pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao closed pull request #9076:
URL: https://github.com/apache/arrow/pull/9076


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
   > Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (823f15d) into [master](https://codecov.io/gh/apache/arrow/commit/4b7cdcb9220b6d94b251aef32c21ef9b4097ecfa?el=desc) (4b7cdcb) will **decrease** coverage by `0.05%`.
   > The diff coverage is `94.48%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9076      +/-   ##
   ==========================================
   - Coverage   82.61%   82.55%   -0.06%     
   ==========================================
     Files         203      203              
     Lines       50140    50070      -70     
   ==========================================
   - Hits        41422    41337      -85     
   - Misses       8718     8733      +15     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
   | [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
   | [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `75.02% <ø> (ø)` | |
   | [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
   | [rust/arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL21vZC5ycw==) | `88.29% <83.33%> (ø)` | |
   | [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
   | [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `96.29% <91.52%> (-1.44%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.54% <100.00%> (ø)` | |
   | [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
   | ... and [25 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [4b7cdcb...823f15d](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io commented on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
   > Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (eeb55d6) into [master](https://codecov.io/gh/apache/arrow/commit/cd22be6efedbf9832b5ea875ca59bb42de7b6c28?el=desc) (cd22be6) will **decrease** coverage by `0.00%`.
   > The diff coverage is `96.05%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9076      +/-   ##
   ==========================================
   - Coverage   82.55%   82.55%   -0.01%     
   ==========================================
     Files         203      203              
     Lines       50043    50030      -13     
   ==========================================
   - Hits        41315    41303      -12     
   + Misses       8728     8727       -1     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.66% <50.00%> (+2.79%)` | :arrow_up: |
   | [rust/arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL21vZC5ycw==) | `88.29% <83.33%> (ø)` | |
   | [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `83.81% <88.23%> (-0.19%)` | :arrow_down: |
   | [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `98.01% <92.30%> (-0.20%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.61% <100.00%> (ø)` | |
   | [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `93.04% <100.00%> (-0.06%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (+0.64%)` | :arrow_up: |
   | [rust/arrow/src/array/array\_string.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfc3RyaW5nLnJz) | `90.27% <100.00%> (ø)` | |
   | [rust/arrow/src/array/array\_union.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfdW5pb24ucnM=) | `89.53% <100.00%> (-0.05%)` | :arrow_down: |
   | [rust/arrow/src/array/data.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvZGF0YS5ycw==) | `97.31% <100.00%> (ø)` | |
   | ... and [17 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [cd22be6...eeb55d6](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550881141



##########
File path: rust/arrow/src/buffer.rs
##########
@@ -44,18 +44,35 @@ use crate::util::bit_util::ceil;
 #[cfg(any(feature = "simd", feature = "avx512"))]
 use std::borrow::BorrowMut;
 
-/// Buffer is a contiguous memory region of fixed size and is aligned at a 64-byte
-/// boundary. Buffer is immutable.
+/// Buffer represents a contiguous memory region that can be shared with other buffers and across
+/// thread boundaries.
 #[derive(Clone, PartialEq, Debug)]
 pub struct Buffer {
-    /// Reference-counted pointer to the internal byte buffer.
+    /// the internal byte buffer.
     data: Arc<Bytes>,
 
     /// The offset into the buffer.
     offset: usize,
 }
 
 impl Buffer {
+    /// Initializes a [Buffer] from a slice of items.
+    pub fn from_slice_ref<U: ArrowNativeType, T: AsRef<[U]>>(items: &T) -> Self {
+        // allocate aligned memory buffer
+        let slice = items.as_ref();
+        let len = slice.len() * std::mem::size_of::<U>();
+        let capacity = bit_util::round_upto_multiple_of_64(len);
+        let buffer = memory::allocate_aligned(capacity);
+        unsafe {

Review comment:
       I think it was there in the previous version of the code




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-757492622


   Ok, I have the benchmarks of this PR, which IMO is ready to review. FYI @alamb @nevi-me 
   
   ```bash
   critcmp master-08cccd68802c9ddc3ca0a5d4bad6e4ba382d74b4 perf_buffer-2dca3c85920c951d92992d343b69a5384178c6fd -t 10
   ```
   
   
   ```
   group                                         master-08cccd68802c9ddc3ca0a5d4bad6e4ba382d74b4    perf_buffer-2dca3c85920c951d92992d343b69a5384178c6fd
   -----                                         -----------------------------------------------    ----------------------------------------------------
   array_slice 512                               1.17   453.4±37.97ns        ? B/sec                1.00    387.0±8.80ns        ? B/sec
   array_string_from_vec 128                     1.28      4.2±0.22µs        ? B/sec                1.00      3.3±0.09µs        ? B/sec
   array_string_from_vec 256                     1.45      5.9±0.14µs        ? B/sec                1.00      4.1±0.12µs        ? B/sec
   array_string_from_vec 512                     1.66      9.6±0.28µs        ? B/sec                1.00      5.8±0.18µs        ? B/sec
   buffer_bit_ops and                            1.23  715.9±278.78ns        ? B/sec                1.00   581.4±16.32ns        ? B/sec
   cast date64 to date32 512                     1.11      7.6±0.40µs        ? B/sec                1.00      6.9±0.32µs        ? B/sec
   cast float64 to float32 512                   1.21      3.2±0.10µs        ? B/sec                1.00      2.7±0.06µs        ? B/sec
   cast float64 to uint64 512                    1.00      3.9±0.41µs        ? B/sec                1.11      4.4±0.12µs        ? B/sec
   cast int32 to float32 512                     1.12      3.2±0.11µs        ? B/sec                1.00      2.8±0.08µs        ? B/sec
   cast int32 to int64 512                       1.19      3.2±0.08µs        ? B/sec                1.00      2.7±0.07µs        ? B/sec
   cast int32 to uint32 512                      1.42      3.9±0.10µs        ? B/sec                1.00      2.8±0.15µs        ? B/sec
   cast int64 to int32 512                       1.96      5.1±0.27µs        ? B/sec                1.00      2.6±0.07µs        ? B/sec
   cast time32s to time64us 512                  1.11      5.4±0.23µs        ? B/sec                1.00      4.9±0.15µs        ? B/sec
   concat i32 1024                               1.00      4.2±0.09µs        ? B/sec                1.12      4.7±0.17µs        ? B/sec
   concat str nulls 1024                         1.00     21.3±0.74µs        ? B/sec                1.14     24.3±0.73µs        ? B/sec
   eq scalar Float32                             1.26     76.4±5.24µs        ? B/sec                1.00     60.6±3.56µs        ? B/sec
   filter context f32 high selectivity           1.00    309.7±1.96µs        ? B/sec                1.12   346.4±10.64µs        ? B/sec
   filter context u8 high selectivity            1.27      4.8±0.11µs        ? B/sec                1.00      3.8±0.14µs        ? B/sec
   filter context u8 w NULLs high selectivity    1.00    299.2±2.82µs        ? B/sec                1.12    336.0±7.21µs        ? B/sec
   filter u8 high selectivity                    1.11     12.2±0.18µs        ? B/sec                1.00     11.0±0.13µs        ? B/sec
   from_slice prepared                           1.00   834.5±26.98µs        ? B/sec                1.15   963.4±36.11µs        ? B/sec
   gt scalar Float32                             1.13     42.9±1.52µs        ? B/sec                1.00     37.9±1.83µs        ? B/sec
   length                                        1.58     57.8±0.58µs        ? B/sec                1.00     36.5±9.01µs        ? B/sec
   like_utf8 scalar complex                      1.00  1064.0±91.20µs        ? B/sec                1.17  1241.6±320.92µs        ? B/sec
   like_utf8 scalar ends with                    1.14    317.0±8.58µs        ? B/sec                1.00    277.5±5.80µs        ? B/sec
   like_utf8 scalar equals                       2.43    142.6±6.62µs        ? B/sec                1.00     58.6±2.81µs        ? B/sec
   like_utf8 scalar starts with                  1.41   294.5±16.24µs        ? B/sec                1.00    208.9±7.83µs        ? B/sec
   min string 512                                1.16      6.2±0.14µs        ? B/sec                1.00      5.3±0.12µs        ? B/sec
   mutable                                       1.19   587.7±15.96µs        ? B/sec                1.00   492.5±17.55µs        ? B/sec
   mutable prepared                              1.17   640.6±26.75µs        ? B/sec                1.00   549.0±21.62µs        ? B/sec
   mutable str nulls 1024                        1.00      4.0±0.06ms        ? B/sec                1.34      5.3±0.05ms        ? B/sec
   neq scalar Float32                            1.23     76.4±3.41µs        ? B/sec                1.00     61.9±6.27µs        ? B/sec
   nlike_utf8 scalar starts with                 1.00   321.1±12.95µs        ? B/sec                1.27    407.8±9.19µs        ? B/sec
   struct_array_from_vec 1024                    1.53     20.8±0.55µs        ? B/sec                1.00     13.6±0.66µs        ? B/sec
   struct_array_from_vec 128                     1.16      7.2±0.21µs        ? B/sec                1.00      6.2±0.23µs        ? B/sec
   struct_array_from_vec 256                     1.27      9.1±0.29µs        ? B/sec                1.00      7.1±0.20µs        ? B/sec
   struct_array_from_vec 512                     1.40     13.0±0.27µs        ? B/sec                1.00      9.3±0.32µs        ? B/sec
   take i32 1024                                 1.26      2.4±0.96µs        ? B/sec                1.00  1883.9±35.60ns        ? B/sec
   ```
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-761773033


   I apologize for the delay in merging Rust PRs -- the 3.0 release is being finalized now and are planning to minimize entropy by postponing merging  changes not critical for the release until the process was complete. I hope the process is complete in the next few days. There is more [discussion](https://lists.apache.org/list.html?dev@arrow.apache.org) in the mailing list 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
   > Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (ecd2769) into [master](https://codecov.io/gh/apache/arrow/commit/fdf5e88a67f33c0a76673a32938274f063c9cb93?el=desc) (fdf5e88) will **decrease** coverage by `0.05%`.
   > The diff coverage is `94.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9076      +/-   ##
   ==========================================
   - Coverage   82.57%   82.52%   -0.06%     
   ==========================================
     Files         204      204              
     Lines       50327    50247      -80     
   ==========================================
   - Hits        41560    41467      -93     
   - Misses       8767     8780      +13     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/arrow/src/array/equal/utils.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvZXF1YWwvdXRpbHMucnM=) | `74.50% <0.00%> (+0.72%)` | :arrow_up: |
   | [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
   | [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
   | [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `77.66% <ø> (ø)` | |
   | [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
   | [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
   | [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `95.95% <88.67%> (-1.77%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.98% <100.00%> (ø)` | |
   | [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
   | ... and [24 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [fdf5e88...ecd2769](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
   > Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (c8cd289) into [master](https://codecov.io/gh/apache/arrow/commit/4b7cdcb9220b6d94b251aef32c21ef9b4097ecfa?el=desc) (4b7cdcb) will **decrease** coverage by `0.04%`.
   > The diff coverage is `94.48%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9076      +/-   ##
   ==========================================
   - Coverage   82.61%   82.56%   -0.05%     
   ==========================================
     Files         203      203              
     Lines       50140    50070      -70     
   ==========================================
   - Hits        41422    41340      -82     
   - Misses       8718     8730      +12     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
   | [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
   | [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `75.02% <ø> (ø)` | |
   | [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
   | [rust/arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL21vZC5ycw==) | `88.29% <83.33%> (ø)` | |
   | [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
   | [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `96.79% <91.52%> (-0.94%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.54% <100.00%> (ø)` | |
   | [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
   | ... and [24 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [4b7cdcb...c8cd289](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on a change in pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r555376474



##########
File path: rust/arrow/src/compute/kernels/comparison.rs
##########
@@ -184,23 +180,33 @@ fn is_like_pattern(c: char) -> bool {
 
 pub fn like_utf8_scalar(left: &StringArray, right: &str) -> Result<BooleanArray> {
     let null_bit_buffer = left.data().null_buffer().cloned();
-    let mut result = BooleanBufferBuilder::new(left.len());
+    let bytes = bit_util::ceil(left.len(), 8);

Review comment:
       this looks like it was an additional performance improvement?
   

##########
File path: rust/arrow/src/array/transform/utils.rs
##########
@@ -15,16 +15,14 @@
 // specific language governing permissions and limitations
 // under the License.
 
-use crate::{
-    array::OffsetSizeTrait, buffer::MutableBuffer, datatypes::ToByteSlice, util::bit_util,
-};
+use crate::{array::OffsetSizeTrait, buffer::MutableBuffer, util::bit_util};
 
 /// extends the `buffer` to be able to hold `len` bits, setting all bits of the new size to zero.
 #[inline]
-pub(super) fn reserve_for_bits(buffer: &mut MutableBuffer, len: usize) {
+pub(super) fn resize_for_bits(buffer: &mut MutableBuffer, len: usize) {

Review comment:
       this is just a rename to make the names match more closely what they do, right?

##########
File path: rust/arrow/src/buffer.rs
##########
@@ -758,39 +793,51 @@ impl MutableBuffer {
         }
     }
 
-    /// Ensures that this buffer has at least `capacity` slots in this buffer. This will
-    /// also ensure the new capacity will be a multiple of 64 bytes.
-    ///
-    /// Returns the new capacity for this buffer.
-    pub fn reserve(&mut self, capacity: usize) -> usize {
-        if capacity > self.capacity {
-            let new_capacity = bit_util::round_upto_multiple_of_64(capacity);
-            let new_capacity = cmp::max(new_capacity, self.capacity * 2);
+    /// Ensures that this buffer has at least `self.len + additional` bytes. This re-allocates iff
+    /// `self.len + additional > capacity`.
+    /// # Example
+    /// ```
+    /// # use arrow::buffer::{Buffer, MutableBuffer};
+    /// let mut buffer = MutableBuffer::new(0);
+    /// buffer.reserve(253); // allocates for the first time
+    /// (0..253u8).for_each(|i| buffer.push(i)); // no reallocation
+    /// let buffer: Buffer = buffer.into();
+    /// assert_eq!(buffer.len(), 253);
+    /// ```
+    // For performance reasons, this must be inlined so that the `if` is executed inside the caller, and not as an extra call that just
+    // exits.
+    #[inline(always)]

Review comment:
       I would be that the use of `inline(always)` is what causes this PR to slow down some kernels (as the extra code either disables some additional optimization or blows some cache level)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
   > Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (f63e28e) into [master](https://codecov.io/gh/apache/arrow/commit/4b7cdcb9220b6d94b251aef32c21ef9b4097ecfa?el=desc) (4b7cdcb) will **decrease** coverage by `0.06%`.
   > The diff coverage is `94.30%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9076      +/-   ##
   ==========================================
   - Coverage   82.61%   82.54%   -0.07%     
   ==========================================
     Files         203      204       +1     
     Lines       50140    50096      -44     
   ==========================================
   - Hits        41422    41353      -69     
   - Misses       8718     8743      +25     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
   | [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
   | [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `75.02% <ø> (ø)` | |
   | [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
   | [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
   | [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `95.95% <88.67%> (-1.77%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.54% <100.00%> (ø)` | |
   | [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
   | [rust/arrow/src/array/array\_string.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfc3RyaW5nLnJz) | `89.94% <100.00%> (-0.22%)` | :arrow_down: |
   | ... and [34 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [4b7cdcb...f63e28e](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753453873


   <!--
     Licensed to the Apache Software Foundation (ASF) under one
     or more contributor license agreements.  See the NOTICE file
     distributed with this work for additional information
     regarding copyright ownership.  The ASF licenses this file
     to you under the Apache License, Version 2.0 (the
     "License"); you may not use this file except in compliance
     with the License.  You may obtain a copy of the License at
   
       http://www.apache.org/licenses/LICENSE-2.0
   
     Unless required by applicable law or agreed to in writing,
     software distributed under the License is distributed on an
     "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
     KIND, either express or implied.  See the License for the
     specific language governing permissions and limitations
     under the License.
   -->
   
   Thanks for opening a pull request!
   
   Could you open an issue for this pull request on JIRA?
   https://issues.apache.org/jira/browse/ARROW
   
   Then could you also rename pull request title in the following format?
   
       ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
   
   See also:
   
     * [Other pull requests](https://github.com/apache/arrow/pulls/)
     * [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb edited a comment on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
alamb edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-758263983


   FWIW given the "touch most of the codepaths" property of this PR,  it might be good to wait to merge this PR until after 3.0 is shipped in case it introduces any subtle new bugs we would have longer to shake the out


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan edited a comment on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
Dandandan edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753458832


   Great! Besides being a speed up, being able to `push` something of a certain type is nice.
   
   I did some quick benchmarking of your branch on DataFusion - I see a speed up of ~5% on q1 / q12. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550909022



##########
File path: rust/arrow/src/buffer.rs
##########
@@ -852,17 +870,34 @@ impl MutableBuffer {
         }
     }
 
-    /// Extends the buffer from a byte slice, incrementing its capacity if needed.
+    /// Extends the buffer from a slice, increasing its capacity if needed.
+    #[inline]
+    pub fn extend_from_slice<T: ToByteSlice>(&mut self, items: &[T]) {
+        let additional = items.len() * std::mem::size_of::<T>();
+        let new_len = self.len + additional;
+        if new_len > self.capacity {
+            self.reserve(additional);
+        }
+        unsafe {
+            let dst = self.data.as_ptr().add(self.len) as *mut T;
+            let src = items.as_ptr() as *const T;
+            std::ptr::copy_nonoverlapping(src, dst, items.len())
+        }
+        self.len = new_len;
+    }
+
+    /// Extends the buffer with a new item, increasing its capacity if needed.
     #[inline]
-    pub fn extend_from_slice(&mut self, bytes: &[u8]) {
-        let additional = bytes.len();
-        if self.len + additional > self.capacity() {
+    pub fn push<T: ToByteSlice>(&mut self, item: &T) {

Review comment:
       Could this be `item: T` too?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550870170



##########
File path: rust/arrow/src/buffer.rs
##########
@@ -56,6 +57,23 @@ pub struct Buffer {
 }
 
 impl Buffer {
+    /// Initializes a [Buffer] from a slice of items.
+    pub fn from_slice_ref<U: ArrowNativeType, T: AsRef<[U]>>(items: &T) -> Self {

Review comment:
       This was necessary because it is not possible to have this generic on `impl`, see https://users.rust-lang.org/t/how-to-implement-from-asref-u-for-generic-u/53533/5




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550881402



##########
File path: rust/arrow/src/compute/kernels/comparison.rs
##########
@@ -48,9 +48,8 @@ macro_rules! compare_op {
             combine_option_bitmap($left.data_ref(), $right.data_ref(), $left.len())?;
 
         let byte_capacity = bit_util::ceil($left.len(), 8);
-        let actual_capacity = bit_util::round_upto_multiple_of_64(byte_capacity);

Review comment:
       :+1: was just looking at this, was added without any need by me




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] maxburke commented on a change in pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
maxburke commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r560497145



##########
File path: rust/arrow/src/lib.rs
##########
@@ -139,15 +139,15 @@ mod arch;
 pub mod array;
 pub mod bitmap;
 pub mod buffer;
-pub mod bytes;
+mod bytes;
 pub mod compute;
 pub mod csv;
 pub mod datatypes;
 pub mod error;
 pub mod ffi;
 pub mod ipc;
 pub mod json;
-pub mod memory;
+mod memory;

Review comment:
       @jorgecarleitao 
   
   This change has broken a bunch of our code downstream as we build buffers directly. Can you make the memory module public again?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-763050303


   The fix is merged, so hopefully master is back 🟢 ✅ 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan edited a comment on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
Dandandan edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753458832


   Great! Besides being a speed up, being able to `push` something of a certain type.
   
   I did some quick benchmarking of your branch on DataFusion - I see a speed up of ~5% on q1 / q12. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-758262159


   I would also like to note that this is EPIC work @jorgecarleitao  -- nicely done.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550909002



##########
File path: rust/arrow/src/buffer.rs
##########
@@ -852,17 +870,34 @@ impl MutableBuffer {
         }
     }
 
-    /// Extends the buffer from a byte slice, incrementing its capacity if needed.
+    /// Extends the buffer from a slice, increasing its capacity if needed.
+    #[inline]
+    pub fn extend_from_slice<T: ToByteSlice>(&mut self, items: &[T]) {
+        let additional = items.len() * std::mem::size_of::<T>();
+        let new_len = self.len + additional;
+        if new_len > self.capacity {
+            self.reserve(additional);
+        }
+        unsafe {
+            let dst = self.data.as_ptr().add(self.len) as *mut T;
+            let src = items.as_ptr() as *const T;
+            std::ptr::copy_nonoverlapping(src, dst, items.len())
+        }
+        self.len = new_len;
+    }
+
+    /// Extends the buffer with a new item, increasing its capacity if needed.
     #[inline]
-    pub fn extend_from_slice(&mut self, bytes: &[u8]) {
-        let additional = bytes.len();
-        if self.len + additional > self.capacity() {
+    pub fn push<T: ToByteSlice>(&mut self, item: &T) {
+        let additional = std::mem::size_of::<T>();
+        let new_len = self.len + additional;
+        if new_len > self.capacity {

Review comment:
       I am wondering if we in a followup can make `len` / `capacity` similar as `Vec` to be in number of items instead of number of bytes?
   
   This makes some implementations easier, e.g. looking at Vec https://doc.rust-lang.org/src/alloc/vec.rs.html#1206

##########
File path: rust/arrow/src/buffer.rs
##########
@@ -852,17 +870,34 @@ impl MutableBuffer {
         }
     }
 
-    /// Extends the buffer from a byte slice, incrementing its capacity if needed.
+    /// Extends the buffer from a slice, increasing its capacity if needed.
+    #[inline]
+    pub fn extend_from_slice<T: ToByteSlice>(&mut self, items: &[T]) {
+        let additional = items.len() * std::mem::size_of::<T>();
+        let new_len = self.len + additional;
+        if new_len > self.capacity {
+            self.reserve(additional);
+        }
+        unsafe {
+            let dst = self.data.as_ptr().add(self.len) as *mut T;
+            let src = items.as_ptr() as *const T;
+            std::ptr::copy_nonoverlapping(src, dst, items.len())
+        }
+        self.len = new_len;
+    }
+
+    /// Extends the buffer with a new item, increasing its capacity if needed.
     #[inline]
-    pub fn extend_from_slice(&mut self, bytes: &[u8]) {
-        let additional = bytes.len();
-        if self.len + additional > self.capacity() {
+    pub fn push<T: ToByteSlice>(&mut self, item: &T) {

Review comment:
       Could this be `T` too?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753522240


   https://issues.apache.org/jira/browse/ARROW-11108


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org