Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/20 17:03:28 UTC

[GitHub] [arrow] mbrubeck commented on a change in pull request #9235: ARROW-11291: [Rust] Add extend to MutableBuffer (-25% for arithmetic, -97% for length)

mbrubeck commented on a change in pull request #9235:
URL: https://github.com/apache/arrow/pull/9235#discussion_r561128399



##########
File path: rust/arrow/src/buffer.rs
##########
@@ -963,11 +968,131 @@ impl MutableBuffer {
 
     /// Extends the buffer by `additional` bytes equal to `0u8`, incrementing its capacity if needed.
     #[inline]
-    pub fn extend(&mut self, additional: usize) {
+    pub fn extend_zeros(&mut self, additional: usize) {
         self.resize(self.len + additional, 0);
     }
 }
 
+/// # Safety
+/// `ptr` must be allocated for `old_capacity`.
+#[inline]
+unsafe fn reallocate(
+    ptr: NonNull<u8>,
+    old_capacity: usize,
+    new_capacity: usize,
+) -> (NonNull<u8>, usize) {
+    let new_capacity = bit_util::round_upto_multiple_of_64(new_capacity);
+    let new_capacity = std::cmp::max(new_capacity, old_capacity * 2);
+    let ptr = memory::reallocate(ptr, old_capacity, new_capacity);
+    (ptr, new_capacity)
+}
+
+impl<A: ArrowNativeType> Extend<A> for MutableBuffer {
+    #[inline]
+    fn extend<T: IntoIterator<Item = A>>(&mut self, iter: T) {
+        let iterator = iter.into_iter();
+        self.extend_from_iter(iterator)
+    }
+}
+
+impl MutableBuffer {
+    #[inline]
+    fn extend_from_iter<T: ArrowNativeType, I: Iterator<Item = T>>(
+        &mut self,
+        mut iterator: I,
+    ) {
+        let size = std::mem::size_of::<T>();
+
+        // this is necessary because of https://github.com/rust-lang/rust/issues/32155
+        let (mut ptr, mut capacity, mut len) = (self.data, self.capacity, self.len);
+        let mut dst = unsafe { ptr.as_ptr().add(len) as *mut T };
+
+        while let Some(item) = iterator.next() {
+            if len + size >= capacity {
+                let (lower, _) = iterator.size_hint();
+                let additional = (lower + 1) * size;
+                let (new_ptr, new_capacity) =
+                    unsafe { reallocate(ptr, capacity, len + additional) };

Review comment:
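       This is unsound if unwinding is enabled: `Iterator::next` may panic on a later iteration of the loop, after `reallocate` has already replaced the allocation. `self` is then dropped while its pointer and capacity fields still hold the stale, pre-reallocation values, so its destructor may free a pointer that is no longer valid. This is why a drop guard like `SetLenOnDrop` is needed to write the updated values back to `self` even when unwinding.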
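
       To illustrate the drop-guard pattern, here is a minimal sketch, not the code from this PR. `ToyBuffer` and its methods are hypothetical stand-ins, and a `Vec<u8>` replaces the raw allocation so that the example stays in safe Rust. The guard is modeled loosely on the `SetLenOnDrop` helper mentioned above. With a raw pointer, the guard would presumably also need to write back the pointer and capacity, not only the length.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical stand-in for a buffer that tracks its own length.
struct ToyBuffer {
    data: Vec<u8>, // backing storage (assumption: a Vec instead of a raw pointer)
    len: usize,    // number of bytes written so far
}

/// Drop guard: keeps a local copy of the length and writes it back
/// when dropped, which also happens while a panic is unwinding.
struct SetLenOnDrop<'a> {
    len: &'a mut usize,
    local_len: usize,
}

impl<'a> SetLenOnDrop<'a> {
    fn new(len: &'a mut usize) -> Self {
        let local_len = *len;
        SetLenOnDrop { len, local_len }
    }
}

impl Drop for SetLenOnDrop<'_> {
    fn drop(&mut self) {
        *self.len = self.local_len;
    }
}

impl ToyBuffer {
    fn with_capacity(capacity: usize) -> Self {
        ToyBuffer { data: vec![0; capacity], len: 0 }
    }

    fn extend_from_iter<I: Iterator<Item = u8>>(&mut self, iterator: I) {
        // `data` and `len` are disjoint fields, so both can be borrowed mutably.
        let data = &mut self.data;
        let mut guard = SetLenOnDrop::new(&mut self.len);
        // `iterator.next()` may panic on any iteration of this loop.
        for item in iterator {
            if guard.local_len == data.len() {
                // Grow the storage: at least 64 bytes, otherwise double.
                let new_size = std::cmp::max(64, data.len() * 2);
                data.resize(new_size, 0);
            }
            data[guard.local_len] = item;
            guard.local_len += 1;
        }
        // `guard` is dropped here, or during unwinding, and restores `self.len`.
    }
}

/// Extends a buffer from an iterator that panics on its fourth item and
/// returns the length the buffer reports afterwards.
fn written_before_panic() -> usize {
    let mut buffer = ToyBuffer::with_capacity(4);
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        let iter = (0u8..10).map(|i| {
            if i == 3 {
                panic!("iterator failed");
            }
            i
        });
        buffer.extend_from_iter(iter);
    }));
    assert!(result.is_err());
    buffer.len
}
```

       In this sketch, `written_before_panic()` returns the number of items written before the iterator panicked, because the guard's destructor runs during unwinding. Without the guard, `len` would still hold the value it had before the loop started.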
