You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/05/19 18:06:16 UTC

[GitHub] [arrow] jhorstmann opened a new pull request #7226: ARROW-8791 Allow creation of StringDictionaryBuilder with an existing array of dictionary values

jhorstmann opened a new pull request #7226:
URL: https://github.com/apache/arrow/pull/7226


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on a change in pull request #7226: ARROW-8791: [Rust] Allow creation of StringDictionaryBuilder with an existing array of dictionary values

Posted by GitBox <gi...@apache.org>.
nevi-me commented on a change in pull request #7226:
URL: https://github.com/apache/arrow/pull/7226#discussion_r427587582



##########
File path: rust/arrow/src/array/builder.rs
##########
@@ -1334,6 +1334,35 @@ where
             map: HashMap::new(),
         }
     }
+
+    pub fn new_with_dictionary(

Review comment:
       may you please add a doc comment




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jhorstmann commented on pull request #7226: ARROW-8791: [Rust] Allow creation of StringDictionaryBuilder with an existing array of dictionary values

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on pull request #7226:
URL: https://github.com/apache/arrow/pull/7226#issuecomment-631474781


   While writing the docs, I was wondering what is actually the usecase for the existing `new` function taking key and value builders as argument vs. taking two capacity values. It reduces the possibility for mixing up those two capacities, but I can't think of a reason for creating a dictionary builder with something else than two freshly instantiated builders.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jhorstmann commented on pull request #7226: ARROW-8791: [Rust] Allow creation of StringDictionaryBuilder with an existing array of dictionary values

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on pull request #7226:
URL: https://github.com/apache/arrow/pull/7226#issuecomment-633995723


   Hi @nevi-me and @sunchao thanks for the review, if there are no further review comments it would be awesome if we could get this merged.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sunchao commented on a change in pull request #7226: ARROW-8791: [Rust] Allow creation of StringDictionaryBuilder with an existing array of dictionary values

Posted by GitBox <gi...@apache.org>.
sunchao commented on a change in pull request #7226:
URL: https://github.com/apache/arrow/pull/7226#discussion_r429492355



##########
File path: rust/arrow/src/array/builder.rs
##########
@@ -1334,6 +1334,57 @@ where
             map: HashMap::new(),
         }
     }
+
+    /// Creates a new `StringDictionaryBuilder` from a keys builder and a dictionary initialized with the given values.

Review comment:
       nit: can we limit the line width to 90 characters?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #7226: ARROW-8791 Allow creation of StringDictionaryBuilder with an existing array of dictionary values

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7226:
URL: https://github.com/apache/arrow/pull/7226#issuecomment-630993571


   https://issues.apache.org/jira/browse/ARROW-8791


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] vertexclique commented on a change in pull request #7226: ARROW-8791: [Rust] Allow creation of StringDictionaryBuilder with an existing array of dictionary values

Posted by GitBox <gi...@apache.org>.
vertexclique commented on a change in pull request #7226:
URL: https://github.com/apache/arrow/pull/7226#discussion_r428057479



##########
File path: rust/arrow/src/array/builder.rs
##########
@@ -1334,6 +1334,58 @@ where
             map: HashMap::new(),
         }
     }
+
+    /// Creates a new `StringDictionaryBuilder` from a keys builder and a dictionary initialized with the given values.
+    /// The indices of those dictionary values are used as keys.
+    ///
+    /// # Example
+    ///
+    /// ```
+    /// use arrow::datatypes::Int16Type;
+    /// use arrow::array::{StringArray, StringDictionaryBuilder, PrimitiveBuilder};
+    /// use std::convert::TryFrom;
+    ///
+    /// let dictionary_values = StringArray::try_from(vec![None, Some("abc"), Some("def")]).unwrap();
+    ///
+    /// let mut builder = StringDictionaryBuilder::new_with_dictionary(PrimitiveBuilder::<Int16Type>::new(3), &dictionary_values).unwrap();
+    /// builder.append("def").unwrap();
+    /// builder.append_null().unwrap();
+    /// builder.append("abc").unwrap();
+    ///
+    /// let dictionary_array = builder.finish();
+    ///
+    /// let keys: Vec<Option<i16>> = dictionary_array.keys().collect();
+    ///
+    ///  assert_eq!(keys, vec![Some(2), None, Some(1)]);

Review comment:
       assert is off by one space.

##########
File path: rust/arrow/src/array/builder.rs
##########
@@ -1334,6 +1334,58 @@ where
             map: HashMap::new(),
         }
     }
+
+    /// Creates a new `StringDictionaryBuilder` from a keys builder and a dictionary initialized with the given values.
+    /// The indices of those dictionary values are used as keys.
+    ///
+    /// # Example
+    ///
+    /// ```
+    /// use arrow::datatypes::Int16Type;
+    /// use arrow::array::{StringArray, StringDictionaryBuilder, PrimitiveBuilder};
+    /// use std::convert::TryFrom;
+    ///
+    /// let dictionary_values = StringArray::try_from(vec![None, Some("abc"), Some("def")]).unwrap();
+    ///
+    /// let mut builder = StringDictionaryBuilder::new_with_dictionary(PrimitiveBuilder::<Int16Type>::new(3), &dictionary_values).unwrap();
+    /// builder.append("def").unwrap();
+    /// builder.append_null().unwrap();
+    /// builder.append("abc").unwrap();
+    ///
+    /// let dictionary_array = builder.finish();
+    ///
+    /// let keys: Vec<Option<i16>> = dictionary_array.keys().collect();
+    ///
+    ///  assert_eq!(keys, vec![Some(2), None, Some(1)]);
+    /// ```
+    pub fn new_with_dictionary(
+        keys_builder: PrimitiveBuilder<K>,
+        dictionary_values: &StringArray,
+    ) -> Result<Self> {
+        let mut values_builder = StringBuilder::with_capacity(
+            dictionary_values.len(),
+            dictionary_values.value_data().len(),
+        );
+        let mut map: HashMap<Box<[u8]>, K::Native> = HashMap::new();

Review comment:
       ```suggestion
           let mut map: HashMap<Box<[u8]>, K::Native> = HashMap::with_capacity(dictionary_values.len());
   ```

##########
File path: rust/arrow/src/array/builder.rs
##########
@@ -1334,6 +1334,58 @@ where
             map: HashMap::new(),
         }
     }
+
+    /// Creates a new `StringDictionaryBuilder` from a keys builder and a dictionary initialized with the given values.
+    /// The indices of those dictionary values are used as keys.
+    ///
+    /// # Example
+    ///
+    /// ```
+    /// use arrow::datatypes::Int16Type;
+    /// use arrow::array::{StringArray, StringDictionaryBuilder, PrimitiveBuilder};
+    /// use std::convert::TryFrom;
+    ///
+    /// let dictionary_values = StringArray::try_from(vec![None, Some("abc"), Some("def")]).unwrap();
+    ///
+    /// let mut builder = StringDictionaryBuilder::new_with_dictionary(PrimitiveBuilder::<Int16Type>::new(3), &dictionary_values).unwrap();
+    /// builder.append("def").unwrap();
+    /// builder.append_null().unwrap();
+    /// builder.append("abc").unwrap();
+    ///
+    /// let dictionary_array = builder.finish();
+    ///
+    /// let keys: Vec<Option<i16>> = dictionary_array.keys().collect();
+    ///
+    ///  assert_eq!(keys, vec![Some(2), None, Some(1)]);
+    /// ```
+    pub fn new_with_dictionary(
+        keys_builder: PrimitiveBuilder<K>,
+        dictionary_values: &StringArray,
+    ) -> Result<Self> {
+        let mut values_builder = StringBuilder::with_capacity(
+            dictionary_values.len(),
+            dictionary_values.value_data().len(),
+        );
+        let mut map: HashMap<Box<[u8]>, K::Native> = HashMap::new();
+        for i in 0..dictionary_values.len() {

Review comment:
       maybe better.
   ```suggestion
           0..dictionary_values.len()
           .for_each(|i| {
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me closed pull request #7226: ARROW-8791: [Rust] Allow creation of StringDictionaryBuilder with an existing array of dictionary values

Posted by GitBox <gi...@apache.org>.
nevi-me closed pull request #7226:
URL: https://github.com/apache/arrow/pull/7226


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org