You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "viirya (via GitHub)" <gi...@apache.org> on 2023/02/22 20:41:51 UTC

[GitHub] [arrow-rs] viirya commented on a diff in pull request #3743: Use Typed Buffers in Arrays (#1811) (#1176)

viirya commented on code in PR #3743:
URL: https://github.com/apache/arrow-rs/pull/3743#discussion_r1114904368


##########
arrow-array/src/array/list_array.rs:
##########
@@ -243,14 +224,20 @@ impl<OffsetSize: OffsetSizeTrait> GenericListArray<OffsetSize> {
 
         let values = make_array(values);
         // Handle case of empty offsets
-        let offsets = match data.is_empty() && data.buffers()[0].is_empty() {
-            true => empty_offsets::<OffsetSize>().as_ptr() as *const _,
-            false => data.buffers()[0].as_ptr(),
+        let value_offsets = match data.is_empty() && data.buffers()[0].is_empty() {
+            true => OffsetBuffer::new_empty(),
+            false => {
+                let buffer = ScalarBuffer::new(
+                    data.buffers()[0].clone(),
+                    data.offset(),
+                    data.len() + 1,
+                );
+                // Safety:
+                // ArrayData is valid
+                unsafe { OffsetBuffer::new_unchecked(buffer) }
+            }

Review Comment:
   This seems occurring more than once. Maybe we can have a function for it?



##########
arrow-buffer/src/buffer/scalar.rs:
##########
@@ -48,39 +46,50 @@ impl<T: ArrowNativeType> ScalarBuffer<T> {
     /// * `bytes` is not large enough for the requested slice
     pub fn new(buffer: Buffer, offset: usize, len: usize) -> Self {
         let size = std::mem::size_of::<T>();
-        let offset_len = offset.checked_add(len).expect("length overflow");
-        let start_bytes = offset.checked_mul(size).expect("start bytes overflow");
-        let end_bytes = offset_len.checked_mul(size).expect("end bytes overflow");
-
-        let bytes = &buffer.as_slice()[start_bytes..end_bytes];
-
-        // SAFETY: all byte sequences correspond to a valid instance of T
-        let (prefix, offsets, suffix) = unsafe { bytes.align_to::<T>() };
-        assert!(
-            prefix.is_empty() && suffix.is_empty(),
-            "buffer is not aligned to {size} byte boundary"
-        );
-
-        let ptr = offsets.as_ptr();
-        Self { buffer, ptr, len }
+        let byte_offset = offset.checked_mul(size).expect("offset overflow");
+        let byte_len = len.checked_mul(size).expect("length overflow");
+        buffer.slice_with_length(byte_offset, byte_len).into()

Review Comment:
   No need to check alignment anymore?



##########
arrow-buffer/src/buffer/offset.rs:
##########
@@ -0,0 +1,58 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::buffer::ScalarBuffer;
+use crate::{ArrowNativeType, MutableBuffer};
+use std::ops::Deref;
+
+/// A non-empty buffer of monotonically increasing, positive integers
+#[derive(Debug, Clone)]
+pub struct OffsetBuffer<O: ArrowNativeType>(ScalarBuffer<O>);

Review Comment:
   ```suggestion
   pub struct OffsetBuffer<O: OffsetSizeTrait>(ScalarBuffer<O>);
   ```



##########
arrow-array/src/array/list_array.rs:
##########
@@ -118,15 +107,7 @@ impl<OffsetSize: OffsetSizeTrait> GenericListArray<OffsetSize> {
     /// Returns the offset values in the offsets buffer
     #[inline]
     pub fn value_offsets(&self) -> &[OffsetSize] {
-        // Soundness
-        //     pointer alignment & location is ensured by RawPtrBox
-        //     buffer bounds/offset is ensured by the ArrayData instance.
-        unsafe {
-            std::slice::from_raw_parts(
-                self.value_offsets.as_ptr().add(self.data.offset()),
-                self.len() + 1,
-            )
-        }

Review Comment:
   Oh that's good. 👍 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org