Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/08/25 16:51:33 UTC

[GitHub] [arrow] jhorstmann opened a new pull request #8051: ARROW-9853: [RUST] Implement take kernel for dictionary arrays

jhorstmann opened a new pull request #8051:
URL: https://github.com/apache/arrow/pull/8051


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] paddyhoran closed pull request #8051: ARROW-9853: [RUST] Implement take kernel for dictionary arrays

Posted by GitBox <gi...@apache.org>.
paddyhoran closed pull request #8051:
URL: https://github.com/apache/arrow/pull/8051


   





[GitHub] [arrow] github-actions[bot] commented on pull request #8051: ARROW-9853: [RUST] Implement take kernel for dictionary arrays

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8051:
URL: https://github.com/apache/arrow/pull/8051#issuecomment-680153274


   https://issues.apache.org/jira/browse/ARROW-9853





[GitHub] [arrow] jhorstmann commented on a change in pull request #8051: ARROW-9853: [RUST] Implement take kernel for dictionary arrays

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on a change in pull request #8051:
URL: https://github.com/apache/arrow/pull/8051#discussion_r476688318



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -657,4 +694,71 @@ mod tests {
             vec![None],
         );
     }
+
+    #[test]
+    fn test_take_dict() {
+        let keys_builder = Int16Builder::new(8);
+        let values_builder = StringBuilder::new(4);
+
+        let mut dict_builder = StringDictionaryBuilder::new(keys_builder, values_builder);
+
+        dict_builder.append("foo").unwrap();
+        dict_builder.append("bar").unwrap();
+        dict_builder.append("").unwrap();
+        dict_builder.append_null().unwrap();
+        dict_builder.append("foo").unwrap();
+        dict_builder.append("bar").unwrap();
+        dict_builder.append("bar").unwrap();
+        dict_builder.append("foo").unwrap();
+
+        let array = dict_builder.finish();
+
+        let expected_values = StringArray::try_from(vec!["foo", "bar", ""]).unwrap();
+        assert_eq!(
+            &expected_values,
+            array
+                .values()
+                .as_any()
+                .downcast_ref::<StringArray>()
+                .unwrap()
+        );
+
+        let array_ref: Arc<dyn Array> = Arc::new(array);
+
+        let indices = UInt32Array::from(vec![
+            Some(0),
+            Some(7),
+            None,
+            Some(5),
+            Some(6),
+            Some(2),
+            Some(3),
+        ]);
+
+        let result = take(&array_ref, &indices, None).unwrap();
+        let result = result
+            .as_any()
+            .downcast_ref::<DictionaryArray<Int16Type>>()
+            .unwrap();
+
+        let result_values: StringArray = result.values().data().into();
+
+        // dictionary values should stay the same
+        assert_eq!(&expected_values, &result_values);
+
+        let result_keys: Int16Array = result.keys().collect::<Vec<_>>().into();
+        let expected_keys = Int16Array::from(vec![
+            Some(0),
+            Some(0),
+            None,
+            Some(1),

Review comment:
       Good point. I added some comments and moved the assertion on the dictionary values closer to the assertion on the keys, so their relationship should now be easier to see.







[GitHub] [arrow] jhorstmann commented on pull request #8051: ARROW-9853: [RUST] Implement take kernel for dictionary arrays

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on pull request #8051:
URL: https://github.com/apache/arrow/pull/8051#issuecomment-680176701


   @andygrove @jorgecarleitao can I ask for your review?





[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8051: ARROW-9853: [RUST] Implement take kernel for dictionary arrays

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8051:
URL: https://github.com/apache/arrow/pull/8051#discussion_r476637153



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -657,4 +694,71 @@ mod tests {
             vec![None],
         );
     }
+
+    #[test]
+    fn test_take_dict() {
+        let keys_builder = Int16Builder::new(8);
+        let values_builder = StringBuilder::new(4);
+
+        let mut dict_builder = StringDictionaryBuilder::new(keys_builder, values_builder);
+
+        dict_builder.append("foo").unwrap();
+        dict_builder.append("bar").unwrap();
+        dict_builder.append("").unwrap();
+        dict_builder.append_null().unwrap();
+        dict_builder.append("foo").unwrap();
+        dict_builder.append("bar").unwrap();
+        dict_builder.append("bar").unwrap();
+        dict_builder.append("foo").unwrap();
+
+        let array = dict_builder.finish();
+
+        let expected_values = StringArray::try_from(vec!["foo", "bar", ""]).unwrap();
+        assert_eq!(
+            &expected_values,
+            array
+                .values()
+                .as_any()
+                .downcast_ref::<StringArray>()
+                .unwrap()
+        );
+
+        let array_ref: Arc<dyn Array> = Arc::new(array);
+
+        let indices = UInt32Array::from(vec![
+            Some(0),
+            Some(7),
+            None,
+            Some(5),
+            Some(6),
+            Some(2),
+            Some(3),
+        ]);
+
+        let result = take(&array_ref, &indices, None).unwrap();
+        let result = result
+            .as_any()
+            .downcast_ref::<DictionaryArray<Int16Type>>()
+            .unwrap();
+
+        let result_values: StringArray = result.values().data().into();
+
+        // dictionary values should stay the same
+        assert_eq!(&expected_values, &result_values);
+
+        let result_keys: Int16Array = result.keys().collect::<Vec<_>>().into();
+        let expected_keys = Int16Array::from(vec![
+            Some(0),
+            Some(0),
+            None,
+            Some(1),

Review comment:
       For clarity, I would describe in comments how we arrive at one of these values, e.g.
   
   ```
   // index 5 corresponds to "bar", which first appears at index 1, thus key 1
   ```
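
To spell out the derivation for every expected key, here is a minimal sketch in plain Rust that models the behaviour the test asserts: the dictionary values stay untouched and only the keys are gathered. It does not use the arrow crate; `encode`, `take_keys` and `demo` are hypothetical names made up for illustration, and the assumption that the builder assigns keys in order of first appearance is inferred from the `expected_values` assertion in the test.

```rust
// Hypothetical helper, not part of the arrow crate: dictionary-encode a
// column, giving each distinct string a key in order of first appearance.
fn encode(items: &[Option<&'static str>]) -> (Vec<Option<i16>>, Vec<&'static str>) {
    let mut values: Vec<&'static str> = Vec::new();
    let mut keys: Vec<Option<i16>> = Vec::with_capacity(items.len());
    for &item in items {
        let key = match item {
            // A null entry gets a null key and adds nothing to the dictionary.
            None => None,
            Some(s) => {
                let found = values.iter().position(|&v| v == s);
                let pos = match found {
                    Some(pos) => pos,
                    None => {
                        values.push(s);
                        values.len() - 1
                    }
                };
                Some(pos as i16)
            }
        };
        keys.push(key);
    }
    (keys, values)
}

// Hypothetical helper: gather keys at the given indices. A null index, or an
// index that points at a null key, yields a null key in the output.
// Panics if an index is out of bounds.
fn take_keys(keys: &[Option<i16>], indices: &[Option<u32>]) -> Vec<Option<i16>> {
    indices
        .iter()
        .map(|&idx| idx.and_then(|i| keys[i as usize]))
        .collect()
}

// Reproduces the data from test_take_dict and returns (result keys, values).
fn demo() -> (Vec<Option<i16>>, Vec<&'static str>) {
    let column = [
        Some("foo"), // index 0 -> key 0
        Some("bar"), // index 1 -> key 1
        Some(""),    // index 2 -> key 2
        None,        // index 3 -> null
        Some("foo"), // index 4 -> key 0
        Some("bar"), // index 5 -> key 1
        Some("bar"), // index 6 -> key 1
        Some("foo"), // index 7 -> key 0
    ];
    let (keys, values) = encode(&column);
    let indices = [Some(0), Some(7), None, Some(5), Some(6), Some(2), Some(3)];
    (take_keys(&keys, &indices), values)
}
```

Under these assumptions `demo()` returns the keys `[Some(0), Some(0), None, Some(1), Some(1), Some(2), None]` alongside the unchanged values `["foo", "bar", ""]`: indices 0 and 7 both point at "foo" (key 0), the null index stays null, indices 5 and 6 point at "bar" (key 1), index 2 points at "" (key 2), and index 3 points at the null entry.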



