Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/11/04 04:49:18 UTC

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8541: ARROW-10402: [Rust] Refactor array equality

jorgecarleitao commented on a change in pull request #8541:
URL: https://github.com/apache/arrow/pull/8541#discussion_r517095720



##########
File path: rust/arrow/src/array/data.rs
##########
@@ -207,6 +210,56 @@ impl ArrayData {
 
         size
     }
+
+    /// Creates a zero-copy slice of itself. This creates a new [ArrayData]
+    /// with a different offset, len and a shifted null bitmap.
+    ///
+    /// # Panics
+    ///
+    /// Panics if `offset + length > self.len()`.
+    pub fn slice(&self, offset: usize, length: usize) -> ArrayData {
+        assert!((offset + length) <= self.len());
+
+        let mut new_data = self.clone();
+
+        new_data.len = length;
+        new_data.offset = offset + self.offset;
+
+        new_data.null_count =
+            count_nulls(new_data.null_buffer(), new_data.offset, new_data.len);
+
+        new_data
+    }
+
+    /// Returns the buffer `buffer` as a slice of type `T`. When the expected buffer is bit-packed,
+    /// the slice is not offset.
+    #[inline]
+    pub(super) fn buffer<T>(&self, buffer: usize) -> &[T] {
+        let values = unsafe { self.buffers[buffer].data().align_to::<T>() };
+        if values.0.len() != 0 || values.2.len() != 0 {
+            panic!("The buffer is not byte-aligned with its interpretation")
+        };
+        if self.data_type.width(buffer).unwrap() == 1 {
+            // bitpacked buffers are offset differently and it is the consumers that need to worry about.
+            &values.1
+        } else {
+            &values.1[self.offset..]
+        }
+    }
+
+    /// Returns the buffer `buffer` as a slice of type `T`. When the expected buffer is bit-packed,
+    /// the slice is not offset.
+    #[inline]
+    pub fn buffer_data(&self, buffer: usize) -> &[u8] {
+        let values = self.buffers[buffer].data();
+        let width = self.data_type.width(buffer).unwrap();
+        if width == 1 {
+            // bitpacked buffers are offset differently and it is the consumers that need to worry about.
+            &values
+        } else {
+            &values[self.offset * (width / 8)..]
+        }
+    }
 }
 
 impl PartialEq for ArrayData {

Review comment:
       One idea would be to use the equality code from this PR, but sprinkle it with `debug_assert_eq!`, so that in debug builds it is clearer where the difference is. An alternative is to use `assert_eq!`, so that the two arrays are shown side by side (this only works for small arrays, though).
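
       To illustrate the suggestion, here is a minimal sketch. `ToyData`, `equal` and `equal_verbose` are hypothetical names made up for this example; they are not the `ArrayData` type or the equality functions from this PR, and the real code would compare buffers and child data rather than a `Vec<i32>`.

```rust
/// Hypothetical, simplified stand-in for array data (not Arrow's `ArrayData`).
#[derive(Debug, Clone, PartialEq)]
pub struct ToyData {
    pub len: usize,
    pub null_count: usize,
    pub values: Vec<i32>,
}

/// Plain field-by-field equality: reports only `true` or `false`.
pub fn equal(lhs: &ToyData, rhs: &ToyData) -> bool {
    lhs.len == rhs.len && lhs.null_count == rhs.null_count && lhs.values == rhs.values
}

/// Same comparison, sprinkled with `debug_assert_eq!`.
/// With debug assertions enabled, a mismatch panics at the first differing
/// field and prints both values. With them disabled (the default for release
/// builds), the checks are skipped and only the boolean is returned.
pub fn equal_verbose(lhs: &ToyData, rhs: &ToyData) -> bool {
    debug_assert_eq!(lhs.len, rhs.len, "len differs");
    debug_assert_eq!(lhs.null_count, rhs.null_count, "null_count differs");
    debug_assert_eq!(lhs.values, rhs.values, "values differ");
    equal(lhs, rhs)
}
```

       Note that a function like `equal_verbose` would only make sense in tests or debugging sessions, since it panics on any inequality when debug assertions are on. The `assert_eq!` alternative needs no extra function: because the type derives `Debug` and `PartialEq`, `assert_eq!(a, b)` prints both values on failure, which stays readable only for small arrays.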



