Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/04/16 17:24:08 UTC

[GitHub] [arrow] tustvold commented on a change in pull request #10073: ARROW-12426: [Rust] Fix concatentation of arrow dictionaries

tustvold commented on a change in pull request #10073:
URL: https://github.com/apache/arrow/pull/10073#discussion_r615010933



##########
File path: rust/arrow/src/array/transform/mod.rs
##########
@@ -353,7 +429,23 @@ impl<'a> MutableArrayData<'a> {
         let null_bytes = bit_util::ceil(capacity, 8);
         let null_buffer = MutableBuffer::from_len_zeroed(null_bytes);
 
-        let extend_values = arrays.iter().map(|array| build_extend(array)).collect();
+        let extend_values = match &data_type {
+            DataType::Dictionary(_, _) => {
+                let mut next_offset = 0;
+                let extend_values: Result<Vec<_>> = arrays
+                    .iter()
+                    .map(|array| {
+                        let offset = next_offset;
+                        next_offset += array.child_data()[0].len();
+                        Ok(build_extend_dictionary(array, offset, next_offset)
+                            .ok_or(ArrowError::DictionaryKeyOverflowError)?)
+                    })
+                    .collect();
+
+                extend_values.expect("MutableArrayData::new is infallible")

Review comment:
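       I'm not sure how best to handle this. `MutableArrayData::new` is infallible, so a dictionary key overflow here can only be reported by panicking via `expect`. The other option would be to handle dictionary concatenation in the concat kernel, which is fallible. I'm not sure which is better, as I'm not all that familiar with the codebase yet.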
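
       To make the trade-off concrete, here is a rough, std-only sketch of the two options. It does not use the arrow crate: `Dict`, `ConcatError`, `try_concat` and `concat` are hypothetical names invented for illustration, keys are fixed to `u8`, and the overflow check is done per key, which may differ from what `build_extend_dictionary` does in this PR.

```rust
use std::convert::TryFrom;

// Hypothetical stand-in for ArrowError::DictionaryKeyOverflowError.
#[derive(Debug, PartialEq)]
enum ConcatError {
    DictionaryKeyOverflow,
}

// Toy dictionary array: each key is an index into `values`.
struct Dict {
    keys: Vec<u8>,
    values: Vec<String>,
}

// Fallible option (what a concat kernel could do): shift the keys of each
// input by the number of values that precede it, and return an error if a
// shifted key no longer fits in the key type.
fn try_concat(dicts: &[Dict]) -> Result<Dict, ConcatError> {
    let mut keys = Vec::new();
    let mut values = Vec::new();
    let mut offset = 0usize;
    for d in dicts {
        for &k in &d.keys {
            let shifted = u8::try_from(usize::from(k) + offset)
                .map_err(|_| ConcatError::DictionaryKeyOverflow)?;
            keys.push(shifted);
        }
        values.extend(d.values.iter().cloned());
        offset += d.values.len();
    }
    Ok(Dict { keys, values })
}

// Infallible option (the shape of the constructor in this diff): the same
// logic, but an overflow can only surface as a panic.
fn concat(dicts: &[Dict]) -> Dict {
    try_concat(dicts).expect("dictionary key overflow")
}
```

       With the fallible version the caller decides what to do on overflow (for example, retry with a wider key type), whereas the infallible version turns the same condition into a panic.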



