You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/22 23:08:29 UTC

[GitHub] [arrow-rs] HaoYang670 commented on a diff in pull request #1720: Implementation string concat

HaoYang670 commented on code in PR #1720:
URL: https://github.com/apache/arrow-rs/pull/1720#discussion_r878852329


##########
arrow/src/compute/kernels/concat.rs:
##########
@@ -102,6 +102,25 @@ pub fn concat(arrays: &[&dyn Array]) -> Result<ArrayRef> {
     Ok(make_array(mutable.freeze()))
 }
 
+// Elementwise concatenation of StringArrays
+pub fn string_concat<Offset: OffsetSizeTrait>(

Review Comment:
   I agree with @tustvold to create a new compute kernel. Also, list, binary, fixed size list and fix size binary can all be supported.



##########
arrow/src/compute/kernels/concat.rs:
##########
@@ -102,6 +102,25 @@ pub fn concat(arrays: &[&dyn Array]) -> Result<ArrayRef> {
     Ok(make_array(mutable.freeze()))
 }
 
+// Elementwise concatenation of StringArrays
+pub fn string_concat<Offset: OffsetSizeTrait>(

Review Comment:
   Maybe we should support concatenate arbitrary number of arrays. We could file a follow-on issue to track it.



##########
arrow/src/compute/kernels/concat.rs:
##########
@@ -102,6 +102,25 @@ pub fn concat(arrays: &[&dyn Array]) -> Result<ArrayRef> {
     Ok(make_array(mutable.freeze()))
 }
 
+// Elementwise concatenation of StringArrays
+pub fn string_concat<Offset: OffsetSizeTrait>(
+    left: &GenericStringArray<Offset>,
+    right: &GenericStringArray<Offset>,
+) -> Result<GenericStringArray<Offset>> {
+    let left_bitmap = left.data().null_bitmap().unwrap();
+    let right_bitmap = right.data().null_bitmap().unwrap();
+    let concat_bitmap = (left_bitmap & right_bitmap).unwrap();
+    Ok((0..left.len().max(right.len()))
+        .map(|i| {
+            if concat_bitmap.is_set(i) {
+                Some(left.value(i).to_owned() + right.value(i))
+            } else {
+                None
+            }
+        })
+        .collect::<GenericStringArray<Offset>>())

Review Comment:
   We could file a follow-on issue to add benchmarks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org