You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/09/11 14:09:30 UTC

[GitHub] [arrow] jorgecarleitao opened a new pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

jorgecarleitao opened a new pull request #8170:
URL: https://github.com/apache/arrow/pull/8170


   This PR improves the speed of the `take` kernel by using ~the dark magic~ buffers that @nevi-me thought me in another PR. 😀
   
   Like others PRs, the first commit fixes the benchmarks:
   * made them not benchmark array creation
   * removed benchmark of i8 since it uses the same code as i32,i64,if32,etc.
   * removed size 256 since 512 and 1024 is enough
   * added benchmark of take of strings
   
   ```
   git checkout c2aff01 && cargo bench --bench take_kernels && git checkout take_faster && cargo bench --bench take_kernels
   ```
   
   Result on my computer:
   
   ```
   take i32 512            time:   [2.9221 us 2.9282 us 2.9348 us]                          
                           change: [-46.697% -45.799% -44.703%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 12 outliers among 100 measurements (12.00%)
     5 (5.00%) high mild
     7 (7.00%) high severe
   
   take i32 1024           time:   [5.0237 us 5.0376 us 5.0548 us]                           
                           change: [-48.840% -48.433% -48.087%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     3 (3.00%) high mild
     5 (5.00%) high severe
   
   take bool 512           time:   [2.4694 us 2.4765 us 2.4843 us]                           
                           change: [-50.789% -50.390% -50.012%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     3 (3.00%) high mild
     3 (3.00%) high severe
   
   take bool 1024          time:   [4.0698 us 4.2407 us 4.4884 us]                            
                           change: [-51.026% -49.906% -48.535%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 9 outliers among 100 measurements (9.00%)
     5 (5.00%) high mild
     4 (4.00%) high severe
   
   take str 512            time:   [8.1593 us 8.9810 us 10.114 us]                          
                           change: [-66.908% -57.395% -45.151%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 10 outliers among 100 measurements (10.00%)
     9 (9.00%) high mild
     1 (1.00%) high severe
   
   take str 1024           time:   [12.098 us 12.151 us 12.208 us]                           
                           change: [-78.241% -75.656% -72.725%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     1 (1.00%) high mild
     4 (4.00%) high severe
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
nevi-me commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489763949



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       I got `[25, 0, 0, 0, 0]` instead of `[25, 0, 0, 0]`. Changing the string take to (https://github.com/jorgecarleitao/arrow/pull/7/commits/02eeb1a01906d1649fb00e723770dabd91874fd3) seems to fix the issue.
   
   It surfaces a soundness hole in how we create StringArray and LargeStringArray from ArrayRef.  When creating these 2 arrays, we only check that we have the right number of buffers, but we do not check the datatype or even whether the offsets are the correct size (32 vs 64 bit). If you look at `make_array`, you can see that if we pass it `DataType::Utf8` when we should be passing `DataType::LargeUtf8`, it will happily create a `StringArray` where we want a `LargeStringArray`.
   
   I'll open a JIRA for this issue




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r487071953



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. Please take a critical view on this, as I am uncertain about the details here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
nevi-me commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489741989



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);

Review comment:
       This might also be a non-issue as tests pass without changing the bits, and fail when changing them. We'll know if it's a problem if integration tests fail, so maybe we can address it later.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jhorstmann commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r487396427



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    return Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))));
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    return Ok(Arc::new(BooleanArray::from(Arc::new(data))));
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);
+    let mut length_so_far = 0;
+
+    offsets.push(length_so_far);
+    for i in 0..data_len {
+        let index = indices.value(i) as usize;
+
+        let s = if array.is_valid(index) {

Review comment:
       I think for strings it might make sense to also check the index validity inside the loop, to keep the resulting string data small. If the index is null it's value is probably 0 and in the worst case there could be very long string there in the original array.
   
   Another question, also for the primitive case, would be whether we can actually rely on the value being 0 when the bitmap says it is null. That might no longer be the case if you for example apply the `take` kernel twice.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#issuecomment-691124876


   https://issues.apache.org/jira/browse/ARROW-9971


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r487071953



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. However, do take a critical view on this issue, please!

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. Please take a critical view on this, as I am uncertain about the details here.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. However, do take a critical view on this issue, please!

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. Please take a critical view on this, as I am uncertain about the details here.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. However, do take a critical view on this issue, please!

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. Please take a critical view on this, as I am uncertain about the details here.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. However, do take a critical view on this issue, please!

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. Please take a critical view on this, as I am uncertain about the details here.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. However, do take a critical view on this issue, please!

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. Please take a critical view on this, as I am uncertain about the details here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489540125



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))))
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    Ok(Arc::new(BooleanArray::from(Arc::new(data))))
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
-        } else {
-            let ix = indices.value(i) as usize;
-            if a.is_null(ix) {
-                builder.append(false)?;
-            } else {
-                builder.append_value(a.value(ix))?;
-            }
-        }
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+    let null_slice = null_buf.data_mut();
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);

Review comment:
       The number of indices could be much smaller than the number of entries, in which case we are taking a significant amount of memory. No?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#issuecomment-691124876






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
nevi-me commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489622312



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);

Review comment:
       Is the change worth it? We can also try it on bool and numeric arrays.
   
   If you have 35 values in take, and num_bytes = 5, you can create a range `((35 - 1)..(num_bytes * 8)).for_each(|i| unset_bit);`I'll play with godbolt.org to see if the compiler is smart enough to use BMI instructions for the above range.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
nevi-me commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489609845



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))))
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    Ok(Arc::new(BooleanArray::from(Arc::new(data))))
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
-        } else {
-            let ix = indices.value(i) as usize;
-            if a.is_null(ix) {
-                builder.append(false)?;
-            } else {
-                builder.append_value(a.value(ix))?;
-            }
-        }
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+    let null_slice = null_buf.data_mut();
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);

Review comment:
       We'll never know the exact value AOT, but I'm wondering if allocating to n items is beneficial. Looks like we'd have to find a balance between under and over-allocation.
   
   What if we say (byte-len / valid-slots) * index-valid-slots? If the string lengths are uniform-ish, we would allocate a vec close enough to the actual.
   
   If we have an input array with 1000 values, and say 800 are valid; and we're only taking 200 values out of 250 total. If the length of the string data is 8000 (avg 10 bytes per slot), then we would allocate 2000 bytes, instead of the current 250. The eventual array will definitely be bigger than 250, and somewhere close to 2000.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489583545



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);

Review comment:
       About 13% improvement on my computer:
   
   ```
   take str 512            time:   [6.6736 us 6.6928 us 6.7177 us]                          
                           change: [-15.907% -12.819% -10.196%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 12 outliers among 100 measurements (12.00%)
     3 (3.00%) high mild
     9 (9.00%) high severe
   
   take str 1024           time:   [11.579 us 11.597 us 11.614 us]                           
                           change: [-14.361% -12.760% -11.299%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 11 outliers among 100 measurements (11.00%)
     1 (1.00%) low severe
     5 (5.00%) low mild
     1 (1.00%) high mild
     4 (4.00%) high severe
   ```
   
   I do not know how to address the `data_len % 8 != 0`, though: I can't find the APIs to handle bits like that.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r487071953



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. However, do take a critical view on this issue, please!

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. Please take a critical view on this, as I am uncertain about the details here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489150948



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    return Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))));
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    return Ok(Arc::new(BooleanArray::from(Arc::new(data))));
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);
+    let mut length_so_far = 0;
+
+    offsets.push(length_so_far);
+    for i in 0..data_len {
+        let index = indices.value(i) as usize;
+
+        let s = if array.is_valid(index) {

Review comment:
       I agree. I think that this code is doing that, by returning `""` on the `else` branch of this if, right?
   
   I do not think we assume that a value is 0 when the null bitmap says it is a null. AFAIK a major advantage of this architecture of two fields (value and bitmap) is to allow operation vectorization on the value field without having to check for its nullability.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
nevi-me commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489159456



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);

Review comment:
       Out of curiousity, would it be better if we set bits to true, then unset the null ones? The typical case will be that arrows are dense, so we could likely reduce calls to bitset. I haven't looked at the assembly that's generated to see if the compiler uses bit-manipulation instructions, or generally how many instructions we spend on bit manipulation.
   
   I'm assuming that where `data_len % 8 != 0` we end up with the trailing bits set. If this causes problems with equality comparisons, we could reset those bits to `false`, if we start with `MutableBuffer::new(num_bytes).with_bitset(num_bytes, true)`.
   
   Do you mind trying out this approach and observing the impact on the benchmark?

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))))
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    Ok(Arc::new(BooleanArray::from(Arc::new(data))))
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
-        } else {
-            let ix = indices.value(i) as usize;
-            if a.is_null(ix) {
-                builder.append(false)?;
-            } else {
-                builder.append_value(a.value(ix))?;
-            }
-        }
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+    let null_slice = null_buf.data_mut();
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);

Review comment:
       Would we benefit from allocating this to the length of the input array's string buffer length? I'm assuming that `data_len` is not enough as strings will mostly have > 1 char.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))))
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);

Review comment:
       If above works, we could do same with the `null_buf` here

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))))
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    Ok(Arc::new(BooleanArray::from(Arc::new(data))))
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {

Review comment:
       do you mind adding a take for `LargeStringArray`?

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       I'll have a look at this. Almost time to start with $dayjob :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r487071953



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. However, do take a critical view on this issue, please!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489161935



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    return Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))));
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    return Ok(Arc::new(BooleanArray::from(Arc::new(data))));
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);
+    let mut length_so_far = 0;
+
+    offsets.push(length_so_far);
+    for i in 0..data_len {
+        let index = indices.value(i) as usize;
+
+        let s = if array.is_valid(index) {

Review comment:
       I see what you mean now, @jhorstmann , that is a good point.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r487071953



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. However, do take a critical view on this issue, please!

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. Please take a critical view on this, as I am uncertain about the details here.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. However, do take a critical view on this issue, please!

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. Please take a critical view on this, as I am uncertain about the details here.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. However, do take a critical view on this issue, please!

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       This test is semantically different(!). The new implementation takes all values, irrespetively of the nullabillity, which causes `data()` to be different. Please take a critical view on this, as I am uncertain about the details here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#issuecomment-691124876


   https://issues.apache.org/jira/browse/ARROW-9971


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] andygrove closed pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
andygrove closed pull request #8170:
URL: https://github.com/apache/arrow/pull/8170


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jhorstmann commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r487396427



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    return Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))));
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    return Ok(Arc::new(BooleanArray::from(Arc::new(data))));
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);
+    let mut length_so_far = 0;
+
+    offsets.push(length_so_far);
+    for i in 0..data_len {
+        let index = indices.value(i) as usize;
+
+        let s = if array.is_valid(index) {

Review comment:
       I think for strings it might make sense to also check the index validity inside the loop, to keep the resulting string data small. If the index is null it's value is probably 0 and in the worst case there could be very long string there in the original array.
   
   Another question, also for the primitive case, would be whether we can actually rely on the value being 0 when the bitmap says it is null. That might no longer be the case if you for example apply the `take` kernel twice.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    return Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))));
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    return Ok(Arc::new(BooleanArray::from(Arc::new(data))));
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);
+    let mut length_so_far = 0;
+
+    offsets.push(length_so_far);
+    for i in 0..data_len {
+        let index = indices.value(i) as usize;
+
+        let s = if array.is_valid(index) {

Review comment:
       I think for strings it might make sense to also check the index validity inside the loop, to keep the resulting string data small. If the index is null it's value is probably 0 and in the worst case there could be very long string there in the original array.
   
   Another question, also for the primitive case, would be whether we can actually rely on the value being 0 when the bitmap says it is null. That might no longer be the case if you for example apply the `take` kernel twice.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    return Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))));
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    return Ok(Arc::new(BooleanArray::from(Arc::new(data))));
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);
+    let mut length_so_far = 0;
+
+    offsets.push(length_so_far);
+    for i in 0..data_len {
+        let index = indices.value(i) as usize;
+
+        let s = if array.is_valid(index) {

Review comment:
       I think for strings it might make sense to also check the index validity inside the loop, to keep the resulting string data small. If the index is null it's value is probably 0 and in the worst case there could be very long string there in the original array.
   
   Another question, also for the primitive case, would be whether we can actually rely on the value being 0 when the bitmap says it is null. That might no longer be the case if you for example apply the `take` kernel twice.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    return Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))));
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    return Ok(Arc::new(BooleanArray::from(Arc::new(data))));
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);
+    let mut length_so_far = 0;
+
+    offsets.push(length_so_far);
+    for i in 0..data_len {
+        let index = indices.value(i) as usize;
+
+        let s = if array.is_valid(index) {

Review comment:
       I think for strings it might make sense to also check the index validity inside the loop, to keep the resulting string data small. If the index is null it's value is probably 0 and in the worst case there could be very long string there in the original array.
   
   Another question, also for the primitive case, would be whether we can actually rely on the value being 0 when the bitmap says it is null. That might no longer be the case if you for example apply the `take` kernel twice.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    return Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))));
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    return Ok(Arc::new(BooleanArray::from(Arc::new(data))));
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);
+    let mut length_so_far = 0;
+
+    offsets.push(length_so_far);
+    for i in 0..data_len {
+        let index = indices.value(i) as usize;
+
+        let s = if array.is_valid(index) {

Review comment:
       I think for strings it might make sense to also check the index validity inside the loop, to keep the resulting string data small. If the index is null it's value is probably 0 and in the worst case there could be very long string there in the original array.
   
   Another question, also for the primitive case, would be whether we can actually rely on the value being 0 when the bitmap says it is null. That might no longer be the case if you for example apply the `take` kernel twice.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    return Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))));
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    return Ok(Arc::new(BooleanArray::from(Arc::new(data))));
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);
+    let mut length_so_far = 0;
+
+    offsets.push(length_so_far);
+    for i in 0..data_len {
+        let index = indices.value(i) as usize;
+
+        let s = if array.is_valid(index) {

Review comment:
       I think for strings it might make sense to also check the index validity inside the loop, to keep the resulting string data small. If the index is null it's value is probably 0 and in the worst case there could be very long string there in the original array.
   
   Another question, also for the primitive case, would be whether we can actually rely on the value being 0 when the bitmap says it is null. That might no longer be the case if you for example apply the `take` kernel twice.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    return Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))));
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    return Ok(Arc::new(BooleanArray::from(Arc::new(data))));
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);
+    let mut length_so_far = 0;
+
+    offsets.push(length_so_far);
+    for i in 0..data_len {
+        let index = indices.value(i) as usize;
+
+        let s = if array.is_valid(index) {

Review comment:
       I think for strings it might make sense to also check the index validity inside the loop, to keep the resulting string data small. If the index is null it's value is probably 0 and in the worst case there could be very long string there in the original array.
   
   Another question, also for the primitive case, would be whether we can actually rely on the value being 0 when the bitmap says it is null. That might no longer be the case if you for example apply the `take` kernel twice.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jhorstmann commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r487396427



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    return Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))));
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    return Ok(Arc::new(BooleanArray::from(Arc::new(data))));
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);
+    let mut length_so_far = 0;
+
+    offsets.push(length_so_far);
+    for i in 0..data_len {
+        let index = indices.value(i) as usize;
+
+        let s = if array.is_valid(index) {

Review comment:
       I think for strings it might make sense to also check the index validity inside the loop, to keep the resulting string data small. If the index is null it's value is probably 0 and in the worst case there could be very long string there in the original array.
   
   Another question, also for the primitive case, would be whether we can actually rely on the value being 0 when the bitmap says it is null. That might no longer be the case if you for example apply the `take` kernel twice.

##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    return Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))));
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    return Ok(Arc::new(BooleanArray::from(Arc::new(data))));
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);
+    let mut length_so_far = 0;
+
+    offsets.push(length_so_far);
+    for i in 0..data_len {
+        let index = indices.value(i) as usize;
+
+        let s = if array.is_valid(index) {

Review comment:
       I think for strings it might make sense to also check the index validity inside the loop, to keep the resulting string data small. If the index is null it's value is probably 0 and in the worst case there could be very long string there in the original array.
   
   Another question, also for the primitive case, would be whether we can actually rely on the value being 0 when the bitmap says it is null. That might no longer be the case if you for example apply the `take` kernel twice.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
nevi-me commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489740709



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))))
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    Ok(Arc::new(BooleanArray::from(Arc::new(data))))
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {

Review comment:
       It might be a breaking change for users who don't import the whole `arrow::array::*` module. I had to import `StringArrayOps` in a few places on parquet and datafusion. I'm fine with the change as I prefer generics over macros, but we might have to check others' opinions first.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489164237



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       The `data` is still not the same, but the only difference now is on the null bitmap. Specifically, one has `[25]`, while the other has `[25,0,0,0]`. Do you know a way to pack null bitmaps, @nevi-me ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#issuecomment-694320708


   @nevi-me , thanks a lot for this instructive review! Always learning something new with you. 💯 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#issuecomment-691124876






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489603919



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))))
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    Ok(Arc::new(BooleanArray::from(Arc::new(data))))
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {

Review comment:
       Done. I introduced a new trait to avoid a macro. Let me know what you think. I think we may use it for other cases also.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
nevi-me commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489738006



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -166,42 +166,124 @@ fn take_primitive<T>(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRe
 where
     T: ArrowPrimitiveType,
 {
-    let mut builder = PrimitiveBuilder::<T>::new(indices.len());
-    let a = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            // populate with null if index is null
-            builder.append_null()?;
-        } else {
-            // get index value to use in looking up the value from `values`
-            let ix = indices.value(i) as usize;
-            if a.is_valid(ix) {
-                builder.append_value(a.value(ix))?;
-            } else {
-                builder.append_null()?;
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<PrimitiveArray<T>>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+
+    let null_slice = null_buf.data_mut();
+
+    let new_values: Vec<T::Native> = (0..data_len)
+        .map(|i| {
+            let index = indices.value(i) as usize;
+            if array.is_valid(index) {
+                bit_util::set_bit(null_slice, i);
+            }
+            array.value(index)
+        })
+        .collect();
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        T::get_data_type(),
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![Buffer::from(new_values.to_byte_slice())],
+        vec![],
+    );
+    Ok(Arc::new(PrimitiveArray::<T>::from(Arc::new(data))))
+}
+
+/// `take` implementation for boolean arrays
+fn take_boolean(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<BooleanArray>().unwrap();
+
+    let num_byte = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+    let mut val_buf = MutableBuffer::new(num_byte).with_bitset(num_byte, false);
+
+    let null_slice = null_buf.data_mut();
+    let val_slice = val_buf.data_mut();
+
+    (0..data_len).for_each(|i| {
+        let index = indices.value(i) as usize;
+        if array.is_valid(index) {
+            bit_util::set_bit(null_slice, i);
+            if array.value(index) {
+                bit_util::set_bit(val_slice, i);
             }
         }
-    }
-    Ok(Arc::new(builder.finish()) as ArrayRef)
+    });
+
+    let nulls = match indices.data_ref().null_buffer() {
+        Some(buffer) => buffer_bin_and(buffer, 0, &null_buf.freeze(), 0, indices.len()),
+        None => null_buf.freeze(),
+    };
+
+    let data = ArrayData::new(
+        DataType::Boolean,
+        indices.len(),
+        None,
+        Some(nulls),
+        0,
+        vec![val_buf.freeze()],
+        vec![],
+    );
+    Ok(Arc::new(BooleanArray::from(Arc::new(data))))
 }
 
 /// `take` implementation for string arrays
 fn take_string(values: &ArrayRef, indices: &UInt32Array) -> Result<ArrayRef> {
-    let mut builder = StringBuilder::new(indices.len());
-    let a = values.as_any().downcast_ref::<StringArray>().unwrap();
-    for i in 0..indices.len() {
-        if indices.is_null(i) {
-            builder.append(false)?;
-        } else {
-            let ix = indices.value(i) as usize;
-            if a.is_null(ix) {
-                builder.append(false)?;
-            } else {
-                builder.append_value(a.value(ix))?;
-            }
-        }
+    let data_len = indices.len();
+
+    let array = values.as_any().downcast_ref::<StringArray>().unwrap();
+
+    let num_bytes = bit_util::ceil(data_len, 8);
+    let mut null_buf = MutableBuffer::new(num_bytes).with_bitset(num_bytes, false);
+    let null_slice = null_buf.data_mut();
+
+    let mut offsets = Vec::with_capacity(data_len + 1);
+    let mut values = Vec::with_capacity(data_len);

Review comment:
       My concern might be invalid, benchmarking with a larger allocation is slower than the current method. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8170: ARROW-9971: [Rust] Improve speed of `take` by 2x-3x (change scaling with batch size)

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8170:
URL: https://github.com/apache/arrow/pull/8170#discussion_r489924044



##########
File path: rust/arrow/src/compute/kernels/take.rs
##########
@@ -425,22 +514,32 @@ mod tests {
     #[test]
     fn test_take_string() {
         let index = UInt32Array::from(vec![Some(3), None, Some(1), Some(3), Some(4)]);
-        let mut builder: StringBuilder = StringBuilder::new(6);
-        builder.append_value("one").unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("three").unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let array = Arc::new(builder.finish()) as ArrayRef;
-        let a = take(&array, &index, None).unwrap();
-        assert_eq!(a.len(), index.len());
-        builder.append_value("four").unwrap();
-        builder.append_null().unwrap();
-        builder.append_null().unwrap();
-        builder.append_value("four").unwrap();
-        builder.append_value("five").unwrap();
-        let b = builder.finish();
-        assert_eq!(a.data(), b.data());
+
+        let array = StringArray::from(vec![
+            Some("one"),
+            None,
+            Some("three"),
+            Some("four"),
+            Some("five"),
+        ]);
+        let array = Arc::new(array) as ArrayRef;
+
+        let actual = take(&array, &index, None).unwrap();
+        assert_eq!(actual.len(), index.len());
+
+        let actual = actual.as_any().downcast_ref::<StringArray>().unwrap();
+
+        let expected =
+            StringArray::from(vec![Some("four"), None, None, Some("four"), Some("five")]);
+
+        for i in 0..index.len() {

Review comment:
       Good point with the DataType, and great catch!
   
   It may have been my mistake on the number of zeros. Also, I think that this was solved by starting with `true` bits, as the builders/from also start with those.
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org