You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/06/08 13:04:11 UTC

[GitHub] [arrow-rs] alamb commented on a diff in pull request #4380: Add NullBuffer and BooleanBuffer From conversions

alamb commented on code in PR #4380:
URL: https://github.com/apache/arrow-rs/pull/4380#discussion_r1223007188


##########
arrow-array/src/array/primitive_array.rs:
##########
@@ -299,6 +299,17 @@ impl<T: ArrowPrimitiveType> PrimitiveArray<T> {
     /// # Panics
     ///
     /// Panics if [`Self::try_new`] returns an error
+    ///
+    /// ```

Review Comment:
   Giving a little context about when to use this constructor I think might help
   
   ```suggestion
       ///
       /// # Example
       ///
       /// Creating a [`PrimitiveArray`] directly from a [`ScalarBuffer`] and [`NullBuffer`]  ls generally
       /// the most performant approach and avoids any additional allocations.
       ///
       /// ```
   ```



##########
arrow-array/src/array/primitive_array.rs:
##########
@@ -299,6 +299,17 @@ impl<T: ArrowPrimitiveType> PrimitiveArray<T> {
     /// # Panics
     ///
     /// Panics if [`Self::try_new`] returns an error
+    ///
+    /// ```
+    /// # use arrow_array::Int32Array;
+    /// # use arrow_array::types::Int32Type;
+    /// # use arrow_buffer::NullBuffer;
+    /// // [1, 2, 3, 4]
+    /// let array = Int32Array::new(vec![1, 2, 3, 4].into(), None);
+    /// // [1, null, 3, 4]
+    /// let nulls = NullBuffer::from(vec![true, false, true, true]);

Review Comment:
   I agree -- it should be called the `ValidBuffer` or else have defined `1` to mean null. As you say 🤷 



##########
arrow-array/src/array/primitive_array.rs:
##########
@@ -299,6 +299,17 @@ impl<T: ArrowPrimitiveType> PrimitiveArray<T> {
     /// # Panics
     ///
     /// Panics if [`Self::try_new`] returns an error
+    ///
+    /// ```
+    /// # use arrow_array::Int32Array;
+    /// # use arrow_array::types::Int32Type;
+    /// # use arrow_buffer::NullBuffer;
+    /// // [1, 2, 3, 4]
+    /// let array = Int32Array::new(vec![1, 2, 3, 4].into(), None);
+    /// // [1, null, 3, 4]
+    /// let nulls = NullBuffer::from(vec![true, false, true, true]);

Review Comment:
   BTW could this be 
   
   ```suggestion
       /// let nulls = NullBuffer::from(&[true, false, true, true]);
   ```
   
   And potentially avoid the allocation? I see you added that `From` impl 🤔 



##########
arrow-buffer/src/buffer/boolean.rs:
##########
@@ -265,6 +265,30 @@ impl<'a> IntoIterator for &'a BooleanBuffer {
     }
 }
 
+impl From<&[bool]> for BooleanBuffer {
+    fn from(value: &[bool]) -> Self {
+        let mut builder = BooleanBufferBuilder::new(value.len());
+        builder.append_slice(value);
+        builder.finish()
+    }
+}
+
+impl From<Vec<bool>> for BooleanBuffer {
+    fn from(value: Vec<bool>) -> Self {
+        value.as_slice().into()
+    }
+}
+
+impl FromIterator<bool> for BooleanBuffer {
+    fn from_iter<T: IntoIterator<Item = bool>>(iter: T) -> Self {
+        let iter = iter.into_iter();
+        let (hint, _) = iter.size_hint();

Review Comment:
   given the hint is used for allocation I wonder if we should bound it (like avoid crash due to allocation error with usize::MAX)
   
   The docs say it should return `0` but isn't required
   
   https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.size_hint
   
   Maybe this is just a theoretical concern. If the same pattern exists elsewhere in the code I think this is fine oo



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org