You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/05/24 20:42:39 UTC

[GitHub] [arrow] yordan-pavlov opened a new pull request #7261: ARROW-8907: [Rust] Implement scalar comparison operations

yordan-pavlov opened a new pull request #7261:
URL: https://github.com/apache/arrow/pull/7261


   Currently comparing an array to a scalar / literal value using the comparison operations defined in the comparison kernel here 
   https://github.com/apache/arrow/blob/master/rust/arrow/src/compute/kernels/comparison.rs
   is very inefficient because:
   (1) an array with the scalar value repeated has to be created, taking time and wasting memory
   (2) time is spent during comparison to load the same literal values over and over
   
   This pull request implements scalar comparison functions and benchmarks indicate good performance gains:
   
   eq Float32 time: [938.54 us 950.28 us 962.65 us]
   eq scalar Float32 time: [836.47 us 838.47 us 840.78 us]
   eq Float32 simd time: [75.836 us 76.389 us 77.185 us]
   eq scalar Float32 simd time: [61.551 us 61.605 us 61.671 us]
   
   The benchmark results above show that the scalar comparison function is about 12% faster for non-SIMD and about 20% faster for SIMD comparison operations.
   And this is before accounting for creating the literal array. 
   In a more complex benchmark, the scalar comparison version is about 40% faster overall when we account for not having to create arrays of scalar / literal values.
   Here are the benchmark results:
   
   filter/filter with arrow SIMD (array) time: [647.77 us 675.12 us 706.69 us]
   filter/filter with arrow SIMD (scalar) time: [402.19 us 404.23 us 407.22 us]
   
   And here is the code for the benchmark:
   https://github.com/yordan-pavlov/arrow-benchmark/blob/master/rust/arrow_benchmark/src/main.rs#L230
   
   In future these scalar comparison operations could be used to improve the performance of DataFusion with expressions comparing arrays to scalar / literal values which happens often in real world queries.
   
   @paddyhoran  @andygrove  let me know what you think


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on a change in pull request #7261: ARROW-8907: [Rust] Implement scalar comparison operations

Posted by GitBox <gi...@apache.org>.
nevi-me commented on a change in pull request #7261:
URL: https://github.com/apache/arrow/pull/7261#discussion_r429678616



##########
File path: rust/arrow/src/compute/kernels/comparison.rs
##########
@@ -66,6 +66,31 @@ macro_rules! compare_op {
     }};
 }
 
+macro_rules! compare_op_scalar {
+    ($left: expr, $right:expr, $op:expr) => {{
+        let null_bit_buffer =

Review comment:
       since you're comparing to a non-null scalar, you shouldn't need to compute a bitmap, you could use `$left`'s null buffer: `$left.data().null_buffer()`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me closed pull request #7261: ARROW-8907: [Rust] Implement scalar comparison operations

Posted by GitBox <gi...@apache.org>.
nevi-me closed pull request #7261:
URL: https://github.com/apache/arrow/pull/7261


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #7261: ARROW-8907: [Rust] Implement scalar comparison operations

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7261:
URL: https://github.com/apache/arrow/pull/7261#issuecomment-633296713


   https://issues.apache.org/jira/browse/ARROW-8907


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] yordan-pavlov commented on a change in pull request #7261: ARROW-8907: [Rust] Implement scalar comparison operations

Posted by GitBox <gi...@apache.org>.
yordan-pavlov commented on a change in pull request #7261:
URL: https://github.com/apache/arrow/pull/7261#discussion_r429953750



##########
File path: rust/arrow/src/compute/kernels/comparison.rs
##########
@@ -66,6 +66,31 @@ macro_rules! compare_op {
     }};
 }
 
+macro_rules! compare_op_scalar {
+    ($left: expr, $right:expr, $op:expr) => {{
+        let null_bit_buffer =

Review comment:
       thanks @nevi-me , you are right, I have changed the scalar comparison code to clone the left null buffer




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org