Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/06/30 13:28:31 UTC

[GitHub] [arrow-rs] alamb commented on a diff in pull request #4470: Append Row to Rows (#4466)

alamb commented on code in PR #4470:
URL: https://github.com/apache/arrow-rs/pull/4470#discussion_r1247865163


##########
arrow-row/src/lib.rs:
##########
@@ -832,14 +874,25 @@ struct RowConfig {
 #[derive(Debug)]
 pub struct Rows {
     /// Underlying row bytes
-    buffer: Box<[u8]>,
+    buffer: Vec<u8>,
     /// Row `i` has data `&buffer[offsets[i]..offsets[i+1]]`
-    offsets: Box<[usize]>,
+    offsets: Vec<usize>,
     /// The config for these rows
     config: RowConfig,
 }
 
 impl Rows {
+    /// Append a [`Row`] to this [`Rows`]
+    pub fn push(&mut self, row: Row<'_>) {
+        assert!(
+            Arc::ptr_eq(&row.config.fields, &self.config.fields),
+            "row was not produced by this RowConverter"
+        );
+        self.config.validate_utf8 |= row.config.validate_utf8;

Review Comment:
   Why doesn't this just assert that the values of `validate_utf8` are the same?
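
To make the question concrete, here is a minimal standalone sketch of the two options. `Config`, `merge_or`, and `merge_strict` are hypothetical names used only for illustration; they are not part of arrow-row, and the real `RowConfig` carries more state than this single flag.

```rust
/// Hypothetical stand-in for the `validate_utf8` flag carried by a row config.
/// The real `RowConfig` in arrow-row has additional fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Config {
    validate_utf8: bool,
}

/// Option A, mirroring the `|=` in the diff: once any pushed row requires
/// UTF-8 validation, the container requires it as well. The flag never
/// flips back to `false`.
fn merge_or(container: &mut Config, row: Config) {
    container.validate_utf8 |= row.validate_utf8;
}

/// Option B, the stricter alternative raised in the comment: refuse to
/// combine a row and a container whose flags differ. An error is returned
/// here so the caller decides how to react; an `assert_eq!` would panic instead.
fn merge_strict(container: &Config, row: Config) -> Result<(), String> {
    if container.validate_utf8 == row.validate_utf8 {
        Ok(())
    } else {
        Err(format!(
            "validate_utf8 mismatch: container={}, row={}",
            container.validate_utf8, row.validate_utf8
        ))
    }
}
```

Under option A a container can silently become "must validate" after a single push, while option B surfaces the mismatch at the call site.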



##########
arrow-row/src/lib.rs:
##########
@@ -756,6 +756,48 @@ impl RowConverter {
         unsafe { self.convert_raw(&mut rows, validate_utf8) }
     }
 
+    /// Returns an empty [`Rows`] with capacity for `row_capacity` rows with
+    /// a total length of `data_capacity`
+    ///
+    /// This can be used to buffer a selection of [`Row`]
+    ///
+    /// ```
+    /// # use std::sync::Arc;
+    /// # use std::collections::HashSet;
+    /// # use arrow_array::cast::AsArray;
+    /// # use arrow_array::StringArray;
+    /// # use arrow_row::{Row, RowConverter, SortField};
+    /// # use arrow_schema::DataType;
+    /// #
+    /// let mut converter = RowConverter::new(vec![SortField::new(DataType::Utf8)]).unwrap();

Review Comment:
   ```suggestion
       /// // This example shows how to buffer only the Row values
       /// let mut converter = RowConverter::new(vec![SortField::new(DataType::Utf8)]).unwrap();
   ```
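
The doc example is cut off in this hunk, so as a rough illustration of what "buffer only the Row values" means, here is a standalone sketch that uses only the standard library. `MiniRows` and `distinct_rows` are hypothetical names and model just the `buffer`/`offsets` layout shown in the diff above; this is not the arrow-row implementation.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified model of the layout in the diff:
/// row `i` has data `&buffer[offsets[i]..offsets[i + 1]]`.
#[derive(Debug)]
struct MiniRows {
    buffer: Vec<u8>,
    offsets: Vec<usize>,
}

impl MiniRows {
    /// Empty container with room for `row_capacity` rows and
    /// `data_capacity` bytes of row data in total.
    fn with_capacity(row_capacity: usize, data_capacity: usize) -> Self {
        // One more offset than rows: offsets[0] is always 0.
        let mut offsets = Vec::with_capacity(row_capacity.saturating_add(1));
        offsets.push(0);
        Self {
            buffer: Vec::with_capacity(data_capacity),
            offsets,
        }
    }

    /// Append one row's bytes and record where it ends.
    fn push(&mut self, row: &[u8]) {
        self.buffer.extend_from_slice(row);
        self.offsets.push(self.buffer.len());
    }

    fn num_rows(&self) -> usize {
        self.offsets.len() - 1
    }

    fn row(&self, i: usize) -> &[u8] {
        &self.buffer[self.offsets[i]..self.offsets[i + 1]]
    }
}

/// Buffer only the first occurrence of each row, preserving input order.
fn distinct_rows(input: &[&[u8]]) -> MiniRows {
    let total_bytes: usize = input.iter().map(|r| r.len()).sum();
    let mut seen: HashSet<&[u8]> = HashSet::with_capacity(input.len());
    let mut out = MiniRows::with_capacity(input.len(), total_bytes);
    for &row in input {
        // `insert` returns true only the first time a value is seen.
        if seen.insert(row) {
            out.push(row);
        }
    }
    out
}
```

Growing `buffer` and `offsets` on `push` is what the switch from `Box<[_]>` to `Vec<_>` in the diff makes possible.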


