You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/11 22:21:17 UTC

[GitHub] [arrow] alamb commented on a change in pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.

alamb commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r555376474



##########
File path: rust/arrow/src/compute/kernels/comparison.rs
##########
@@ -184,23 +180,33 @@ fn is_like_pattern(c: char) -> bool {
 
 pub fn like_utf8_scalar(left: &StringArray, right: &str) -> Result<BooleanArray> {
     let null_bit_buffer = left.data().null_buffer().cloned();
-    let mut result = BooleanBufferBuilder::new(left.len());
+    let bytes = bit_util::ceil(left.len(), 8);

Review comment:
       this looks like it was an additional performance improvement?
   

##########
File path: rust/arrow/src/array/transform/utils.rs
##########
@@ -15,16 +15,14 @@
 // specific language governing permissions and limitations
 // under the License.
 
-use crate::{
-    array::OffsetSizeTrait, buffer::MutableBuffer, datatypes::ToByteSlice, util::bit_util,
-};
+use crate::{array::OffsetSizeTrait, buffer::MutableBuffer, util::bit_util};
 
 /// extends the `buffer` to be able to hold `len` bits, setting all bits of the new size to zero.
 #[inline]
-pub(super) fn reserve_for_bits(buffer: &mut MutableBuffer, len: usize) {
+pub(super) fn resize_for_bits(buffer: &mut MutableBuffer, len: usize) {

Review comment:
       this is just a rename to make the names match more closely what they do, right?

##########
File path: rust/arrow/src/buffer.rs
##########
@@ -758,39 +793,51 @@ impl MutableBuffer {
         }
     }
 
-    /// Ensures that this buffer has at least `capacity` slots in this buffer. This will
-    /// also ensure the new capacity will be a multiple of 64 bytes.
-    ///
-    /// Returns the new capacity for this buffer.
-    pub fn reserve(&mut self, capacity: usize) -> usize {
-        if capacity > self.capacity {
-            let new_capacity = bit_util::round_upto_multiple_of_64(capacity);
-            let new_capacity = cmp::max(new_capacity, self.capacity * 2);
+    /// Ensures that this buffer has at least `self.len + additional` bytes. This re-allocates iff
+    /// `self.len + additional > capacity`.
+    /// # Example
+    /// ```
+    /// # use arrow::buffer::{Buffer, MutableBuffer};
+    /// let mut buffer = MutableBuffer::new(0);
+    /// buffer.reserve(253); // allocates for the first time
+    /// (0..253u8).for_each(|i| buffer.push(i)); // no reallocation
+    /// let buffer: Buffer = buffer.into();
+    /// assert_eq!(buffer.len(), 253);
+    /// ```
+    // For performance reasons, this must be inlined so that the `if` is executed inside the caller, and not as an extra call that just
+    // exits.
+    #[inline(always)]

Review comment:
       I would be that the use of `inline(always)` is what causes this PR to slow down some kernels (as the extra code either disables some additional optimization or blows some cache level)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org