Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/12/24 11:04:14 UTC

[GitHub] [arrow] jorgecarleitao opened a new pull request #9004: ARROW-11025: [Rust] Fixed bench for binary boolean kernels

jorgecarleitao opened a new pull request #9004:
URL: https://github.com/apache/arrow/pull/9004


   The benches were running array creation inside the benchmarked closure itself, so the measured time was dominated by how fast we can create an array rather than by how fast the boolean kernel runs.
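
The effect described above can be reproduced with a minimal, std-only sketch. It does not use the `arrow` or `criterion` crates: `create_bools`, `and_kernel`, and `time_it` are hypothetical stand-ins introduced here for illustration, and the size and iteration count are arbitrary. The first measurement rebuilds the inputs on every iteration, the second builds them once and times only the kernel call.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for array creation: alternating true/false values.
fn create_bools(size: usize) -> Vec<bool> {
    (0..size).map(|i| i % 2 == 0).collect()
}

/// Hypothetical stand-in for a binary boolean kernel: element-wise AND.
fn and_kernel(lhs: &[bool], rhs: &[bool]) -> Vec<bool> {
    lhs.iter().zip(rhs).map(|(a, b)| *a && *b).collect()
}

/// Calls `f` exactly `iters` times and returns the total elapsed time.
fn time_it<F: FnMut()>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed()
}

fn main() {
    let size = 512;
    let iters = 1_000;

    // Setup inside the timed closure: input creation is measured as well.
    let with_setup = time_it(iters, || {
        let lhs = create_bools(size);
        let rhs = create_bools(size);
        black_box(and_kernel(&lhs, &rhs));
    });

    // Setup hoisted out: inputs are built once, only the kernel is measured.
    let lhs = create_bools(size);
    let rhs = create_bools(size);
    let kernel_only = time_it(iters, || {
        black_box(and_kernel(&lhs, &rhs));
    });

    println!("setup inside closure: {:?}", with_setup);
    println!("setup hoisted out:    {:?}", kernel_only);
}
```

The absolute numbers depend on the machine and build profile. The point is only that the two measurements answer different questions, and the second one isolates the kernel.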


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io commented on pull request #9004: ARROW-11025: [Rust] Fixed bench for binary boolean kernels

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #9004:
URL: https://github.com/apache/arrow/pull/9004#issuecomment-750851898


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9004?src=pr&el=h1) Report
   > Merging [#9004](https://codecov.io/gh/apache/arrow/pull/9004?src=pr&el=desc) (56e0418) into [master](https://codecov.io/gh/apache/arrow/commit/1ecef42bb9fb9e91f0fb04c7d5a1c3be58390025?el=desc) (1ecef42) will **decrease** coverage by `0.00%`.
   > The diff coverage is `n/a`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9004      +/-   ##
   ==========================================
   - Coverage   82.65%   82.65%   -0.01%     
   ==========================================
     Files         200      200              
     Lines       49795    49795              
   ==========================================
   - Hits        41159    41158       -1     
   - Misses       8636     8637       +1     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9004?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow/pull/9004/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9lbmNvZGluZ3MvZW5jb2RpbmcucnM=) | `95.24% <0.00%> (-0.20%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9004?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9004?src=pr&el=footer). Last update [1ecef42...56e0418](https://codecov.io/gh/apache/arrow/pull/9004?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [arrow] Dandandan commented on a change in pull request #9004: ARROW-11025: [Rust] Fixed bench for binary boolean kernels

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9004:
URL: https://github.com/apache/arrow/pull/9004#discussion_r548593147



##########
File path: rust/arrow/benches/boolean_kernels.rs
##########
@@ -19,48 +19,46 @@
 extern crate criterion;
 use criterion::Criterion;
 
+use rand::distributions::{Distribution, Standard};
+use rand::Rng;
+
+use arrow::util::test_util::seedable_rng;
+
 extern crate arrow;
 
 use arrow::array::*;
 use arrow::compute::kernels::boolean as boolean_kernels;
 
-///  Helper function to create arrays
-fn create_boolean_array(size: usize) -> BooleanArray {
-    let mut builder = BooleanBuilder::new(size);
-    for i in 0..size {
-        if i % 2 == 0 {
-            builder.append_value(true).unwrap();
-        } else {
-            builder.append_value(false).unwrap();
-        }
-    }
-    builder.finish()
+fn create_boolean(size: usize) -> BooleanArray
+where
+    Standard: Distribution<bool>,
+{

Review comment:
       👍 







[GitHub] [arrow] alamb closed pull request #9004: ARROW-11025: [Rust] Fixed bench for binary boolean kernels

Posted by GitBox <gi...@apache.org>.
alamb closed pull request #9004:
URL: https://github.com/apache/arrow/pull/9004


   





[GitHub] [arrow] Dandandan commented on a change in pull request #9004: ARROW-11025: [Rust] Fixed bench for binary boolean kernels

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9004:
URL: https://github.com/apache/arrow/pull/9004#discussion_r548596025



##########
File path: rust/arrow/benches/boolean_kernels.rs
##########
@@ -19,48 +19,46 @@
 extern crate criterion;
 use criterion::Criterion;
 
+use rand::distributions::{Distribution, Standard};
+use rand::Rng;
+
+use arrow::util::test_util::seedable_rng;
+
 extern crate arrow;
 
 use arrow::array::*;
 use arrow::compute::kernels::boolean as boolean_kernels;
 
-///  Helper function to create arrays
-fn create_boolean_array(size: usize) -> BooleanArray {
-    let mut builder = BooleanBuilder::new(size);
-    for i in 0..size {
-        if i % 2 == 0 {
-            builder.append_value(true).unwrap();
-        } else {
-            builder.append_value(false).unwrap();
-        }
-    }
-    builder.finish()
+fn create_boolean(size: usize) -> BooleanArray
+where
+    Standard: Distribution<bool>,
+{
+    seedable_rng()
+        .sample_iter(&Standard)
+        .take(size)
+        .map(Some)
+        .collect()
 }
 
-/// Benchmark for `AND`
-fn bench_and(size: usize) {
-    let buffer_a = create_boolean_array(size);
-    let buffer_b = create_boolean_array(size);
-    criterion::black_box(boolean_kernels::and(&buffer_a, &buffer_b).unwrap());
+fn bench_and(lhs: &BooleanArray, rhs: &BooleanArray) {
+    criterion::black_box(boolean_kernels::and(lhs, rhs).unwrap());
 }
 
-/// Benchmark for `OR`
-fn bench_or(size: usize) {
-    let buffer_a = create_boolean_array(size);
-    let buffer_b = create_boolean_array(size);
-    criterion::black_box(boolean_kernels::or(&buffer_a, &buffer_b).unwrap());
+fn bench_or(lhs: &BooleanArray, rhs: &BooleanArray) {
+    criterion::black_box(boolean_kernels::or(lhs, rhs).unwrap());
 }
 
-/// Benchmark for `NOT`
-fn bench_not(size: usize) {
-    let buffer = create_boolean_array(size);
-    criterion::black_box(boolean_kernels::not(&buffer).unwrap());
+fn bench_not(array: &BooleanArray) {
+    criterion::black_box(boolean_kernels::not(&array).unwrap());
 }
 
 fn add_benchmark(c: &mut Criterion) {
-    c.bench_function("and", |b| b.iter(|| bench_and(512)));
-    c.bench_function("or", |b| b.iter(|| bench_or(512)));
-    c.bench_function("not", |b| b.iter(|| bench_not(512)));
+    let size = 2usize.pow(15);
+    let array1 = create_boolean(size);

Review comment:
       This is much bigger than it used to be: 2^15 = 32768 elements versus 512 before (not bad per se, just an observation). I think we should agree on a set of reasonable sizes across the benchmarks that correspond to real-world sizes (e.g. as used in DataFusion).
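
One way to compare candidate sizes is to run the same kernel over several of them and report time per element. The sketch below is std-only and purely illustrative: `or_kernel`, `nanos_per_element`, and `candidate_sizes` are hypothetical helpers, not Arrow or criterion APIs. The sizes 512 and 2^15 come from the diff above, while 4096 is an arbitrary midpoint and not a value taken from DataFusion.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Hypothetical stand-in kernel: element-wise OR over two slices.
fn or_kernel(lhs: &[bool], rhs: &[bool]) -> Vec<bool> {
    lhs.iter().zip(rhs).map(|(a, b)| *a || *b).collect()
}

/// Average nanoseconds per element over `iters` kernel runs at `size`.
/// Inputs are created once, outside the timed loop.
fn nanos_per_element(size: usize, iters: u32) -> f64 {
    let lhs: Vec<bool> = (0..size).map(|i| i % 2 == 0).collect();
    let rhs: Vec<bool> = (0..size).map(|i| i % 3 == 0).collect();
    let start = Instant::now();
    for _ in 0..iters {
        black_box(or_kernel(&lhs, &rhs));
    }
    start.elapsed().as_nanos() as f64 / (iters as f64 * size as f64)
}

/// Sizes to sweep: the old size, an arbitrary midpoint, and the new size.
fn candidate_sizes() -> Vec<usize> {
    vec![512, 4096, 2usize.pow(15)]
}

fn main() {
    for size in candidate_sizes() {
        let ns = nanos_per_element(size, 200);
        println!("size {:>6}: {:.3} ns/element", size, ns);
    }
}
```

If the per-element cost is roughly flat across sizes, the choice matters little for this kernel. If it varies, that is a reason to pick sizes that match real workloads.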







[GitHub] [arrow] github-actions[bot] commented on pull request #9004: ARROW-11025: [Rust] Fixed bench for binary boolean kernels

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9004:
URL: https://github.com/apache/arrow/pull/9004#issuecomment-750850846


   https://issues.apache.org/jira/browse/ARROW-11025

