Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/05/01 08:33:04 UTC

[GitHub] [arrow] sunchao commented on a change in pull request #7076: ARROW-8659: [Rust] ListBuilder allocate with_capacity

sunchao commented on a change in pull request #7076:
URL: https://github.com/apache/arrow/pull/7076#discussion_r418460308



##########
File path: rust/parquet/src/arrow/converter.rs
##########
@@ -128,7 +128,10 @@ pub struct Utf8ArrayConverter {}
 
 impl Converter<Vec<Option<ByteArray>>, StringArray> for Utf8ArrayConverter {
     fn convert(source: Vec<Option<ByteArray>>) -> Result<StringArray> {
-        let mut builder = StringBuilder::new(source.len());
+        let mut builder = StringBuilder::with_capacity(
+            source.len(),
+            source.len() * std::mem::size_of::<ByteArray>(),

Review comment:
       Yeah, this could be overly pessimistic. A more accurate approach would be to sum the lengths of all the byte arrays in `source`. It may be worth it.
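
       A minimal sketch of the "sum of all lens" idea, using only the standard library. The `Bytes` alias and the `exact_capacities` helper are hypothetical stand-ins for illustration; they are not parquet's `ByteArray` or any arrow API, and the real converter would pass the two numbers to `StringBuilder::with_capacity` as in the diff above.

```rust
/// Hypothetical stand-in for parquet's `ByteArray`: just an owned byte buffer.
type Bytes = Vec<u8>;

/// Returns `(item_capacity, data_capacity)`: the number of slots, and the
/// exact number of value bytes needed for every non-null entry in `source`.
fn exact_capacities(source: &[Option<Bytes>]) -> (usize, usize) {
    let data_len: usize = source
        .iter()
        // Null entries contribute no value bytes.
        .map(|item| item.as_ref().map_or(0, |bytes| bytes.len()))
        .sum();
    (source.len(), data_len)
}
```

       This costs one extra pass over `source`, which is the trade-off against the cheaper but rougher `source.len() * size_of::<ByteArray>()` estimate.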

##########
File path: rust/arrow/src/array/builder.rs
##########
@@ -527,11 +527,18 @@ pub struct ListBuilder<T: ArrayBuilder> {
 impl<T: ArrayBuilder> ListBuilder<T> {
     /// Creates a new `ListArrayBuilder` from a given values array builder
     pub fn new(values_builder: T) -> Self {
-        let mut offsets_builder = Int32BufferBuilder::new(values_builder.len() + 1);
+        let capacity = values_builder.len();
+        Self::with_capacity(values_builder, capacity)
+    }
+
+    /// Creates a new `ListArrayBuilder` from a given values array builder
+    /// `append` may be called up to `capacity` times without triggering reallocation

Review comment:
       nit: maybe rephrase this to: "`capacity` is the number of items to pre-allocate space for in the offset buffer of this builder"? Similarly below.
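
       To illustrate what "pre-allocate space for in the offset buffer" means, here is a simplified sketch built on `Vec`. `SketchListBuilder` is a hypothetical type made up for this example, not arrow's `ListBuilder`; it only assumes the usual layout where N lists are described by N + 1 offsets.

```rust
/// Hypothetical, simplified list builder; not arrow's `ListBuilder`.
struct SketchListBuilder {
    /// Offsets into `values`; always one more entry than there are lists.
    offsets: Vec<i32>,
    /// Flattened child values of all appended lists.
    values: Vec<i32>,
}

impl SketchListBuilder {
    /// `capacity` is the number of lists to pre-allocate offset space for.
    fn with_capacity(capacity: usize) -> Self {
        // N lists need N + 1 offsets, because the first offset is always 0.
        let mut offsets = Vec::with_capacity(capacity + 1);
        offsets.push(0);
        Self { offsets, values: Vec::new() }
    }

    /// Appends one list and records its end offset.
    fn append(&mut self, list: &[i32]) {
        self.values.extend_from_slice(list);
        self.offsets.push(self.values.len() as i32);
    }

    /// Number of lists appended so far.
    fn len(&self) -> usize {
        self.offsets.len() - 1
    }
}
```

       In this sketch, up to `capacity` calls to `append` leave the offsets allocation untouched; the values buffer is sized separately and may still grow.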



