Posted to jira@arrow.apache.org by "Andrew Lamb (Jira)" <ji...@apache.org> on 2021/04/26 13:19:02 UTC

[jira] [Closed] (ARROW-8908) [Rust][DataFusion] improve performance of building literal arrays

     [ https://issues.apache.org/jira/browse/ARROW-8908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Lamb closed ARROW-8908.
------------------------------
    Resolution: Invalid

> [Rust][DataFusion] improve performance of building literal arrays
> -----------------------------------------------------------------
>
>                 Key: ARROW-8908
>                 URL: https://issues.apache.org/jira/browse/ARROW-8908
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust - DataFusion
>            Reporter: Yordan Pavlov
>            Priority: Major
>
> [~andygrove] While profiling, I noticed a potential performance improvement, described below.
> NOTE: This issue would be irrelevant if DataFusion could use scalar comparison operations, as described here:
> https://issues.apache.org/jira/browse/ARROW-8907
> The `build_literal_array` function, defined here: https://github.com/apache/arrow/blob/master/rust/datafusion/src/execution/physical_plan/expressions.rs#L1204
> creates an array of literal values using a loop, but benchmarks show that creating the array from a Vec is much faster
> (about 58 times faster when building an array with 100000 values).
> Here are the benchmark results:
> array builder/array from vec: time: [25.644 us 25.883 us 26.214 us]
> array builder/array from values: time: [1.4985 ms 1.5090 ms 1.5213 ms]
> Here is the benchmark code:
> ```
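> use arrow::array::{Array, PrimitiveArray, PrimitiveBuilder};
> use arrow::datatypes::Float32Type;
> use criterion::Criterion;
>
> fn bench_array_builder(c: &mut Criterion) {
>     let array_len = 100000;
>     // Record the built length so it can be printed after each benchmark.
>     let mut count = 0;
>     let mut group = c.benchmark_group("array builder");
>
>     // Build the array in one step from a Vec.
>     group.bench_function("array from vec", |b| b.iter(|| {
>         let float_array: PrimitiveArray<Float32Type> = vec![1.0; array_len].into();
>         count = float_array.len();
>     }));
>     println!("built array with {} values", count);
>
>     // Build the array one value at a time with a builder, in a loop.
>     group.bench_function("array from values", |b| b.iter(|| {
>         // let float_array: PrimitiveArray<Float32Type> = build_literal_array(1.0, array_len);
>         let mut builder = PrimitiveBuilder::<Float32Type>::new(array_len);
>         // Loop over array_len rather than count, so this benchmark does not
>         // depend on the previous one having set count.
>         for _ in 0..array_len {
>             // Discard the return value of append_value.
>             let _ = builder.append_value(1.0);
>         }
>         let float_array = builder.finish();
>         count = float_array.len();
>     }));
>     println!("built array with {} values", count);
>     group.finish();
> }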
> ```
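
The "about 58 times faster" figure can be checked against the benchmark results quoted above. The following is a minimal sketch, assuming the middle value of each Criterion triple is the point estimate; the `speedup` helper is hypothetical and is not part of Arrow or DataFusion.

```rust
/// Hypothetical helper: ratio of two durations given in microseconds.
fn speedup(slow_us: f64, fast_us: f64) -> f64 {
    slow_us / fast_us
}

fn main() {
    // Middle values from the benchmark output quoted in the issue.
    let from_vec_us = 25.883; // "array from vec", already in microseconds
    let from_values_us = 1.5090 * 1000.0; // "array from values", ms -> us

    let ratio = speedup(from_values_us, from_vec_us);
    // prints: array from vec is about 58.3x faster
    println!("array from vec is about {:.1}x faster", ratio);
}
```

This only restates the reported numbers; it does not re-run the benchmark.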



--
This message was sent by Atlassian Jira
(v8.3.4#803005)