Posted to issues@arrow.apache.org by "Tobias Zagorni (Jira)" <ji...@apache.org> on 2022/05/17 19:19:00 UTC

[jira] [Created] (ARROW-16599) [C++] Implementation of ExecuteScalarExpressionOverhead benchmarks without arrow for comparision

Tobias Zagorni created ARROW-16599:
--------------------------------------

             Summary: [C++] Implementation of ExecuteScalarExpressionOverhead benchmarks without arrow for comparision
                 Key: ARROW-16599
                 URL: https://issues.apache.org/jira/browse/ARROW-16599
             Project: Apache Arrow
          Issue Type: Sub-task
          Components: C++
            Reporter: Tobias Zagorni
            Assignee: Tobias Zagorni


The ExecuteScalarExpressionOverhead group of benchmarks currently gives us values that we can compare across batch sizes or across expressions. But it does not show how well Arrow does compared to what is possible in general.

The simple_expression (negate x) and complex_expression (x > 0 and x < 20) benchmarks, which perform an actual operation on data, can be implemented in pure C++ for comparison.
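
As an illustration only, a pure C++ counterpart of simple_expression could be a single loop over the input. The function name NegateSketch and the int64_t element type are assumptions made for this sketch; the issue does not say which type the benchmark uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a simple_expression baseline: out[i] = -in[i].
// Assumes int64_t values and that no element is INT64_MIN, since negating
// INT64_MIN overflows a signed 64-bit integer.
void NegateSketch(const std::vector<std::int64_t>& in,
                  std::vector<std::int64_t>& out) {
  assert(in.size() == out.size());  // caller provides a pre-sized output buffer
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = -in[i];
  }
}
```

The output buffer is passed in rather than returned so that a caller can allocate it once and reuse it across iterations.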

I implemented the complex_expression benchmark using technically unnecessary intermediate buffers for the > and < operator results, to match what happens in the Arrow expression.
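
A minimal sketch of that three-pass structure, not the code attached to this issue, might look as follows. The name ComplexExpressionSketch, the int64_t input type, and the use of one byte per boolean are assumptions; Arrow itself stores booleans bit-packed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a complex_expression baseline: (x > 0) and (x < 20).
// The results of > and < are first written to separate intermediate buffers
// and then combined, instead of being fused into one pass.
void ComplexExpressionSketch(const std::vector<std::int64_t>& x,
                             std::vector<std::uint8_t>& gt,   // holds x > 0
                             std::vector<std::uint8_t>& lt,   // holds x < 20
                             std::vector<std::uint8_t>& out)  // holds gt and lt
{
  const std::size_t n = x.size();
  assert(gt.size() == n && lt.size() == n && out.size() == n);
  for (std::size_t i = 0; i < n; ++i) {
    gt[i] = static_cast<std::uint8_t>(x[i] > 0);
  }
  for (std::size_t i = 0; i < n; ++i) {
    lt[i] = static_cast<std::uint8_t>(x[i] < 20);
  }
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = static_cast<std::uint8_t>(gt[i] & lt[i]);
  }
}
```

A fused single loop would avoid the two intermediate buffers, but it would no longer mirror the separate kernel invocations being compared against.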

What may seem unfair is that I currently reuse the input/output/intermediate buffers across all iterations. I also tried using new and delete each time, but could not measure a difference in performance. Reusing the buffers allows using std::vector for slightly cleaner code. Re-creating a vector each time would result in a lot of overhead from initializing the vector values and is therefore not useful.
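
The buffer reuse described above can be sketched as a timing loop that allocates its vectors once, outside the measured region. This is a simplified stand-in using std::chrono, not the benchmark harness used for the numbers in this issue; RunResult and RunWithReusedBuffers are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical result type: a checksum of the output plus the measured time.
struct RunResult {
  std::int64_t checksum;
  std::chrono::nanoseconds elapsed;
};

// Runs a negate loop `iterations` times over the same input/output buffers.
// The vectors are created (and value-initialized) once, before timing starts,
// so the per-element initialization cost is not part of the measurement.
RunResult RunWithReusedBuffers(std::size_t rows_per_batch,
                               std::size_t iterations) {
  assert(rows_per_batch > 0);
  std::vector<std::int64_t> in(rows_per_batch);
  std::iota(in.begin(), in.end(), std::int64_t{0});  // 0, 1, 2, ...
  std::vector<std::int64_t> out(rows_per_batch);

  const auto start = std::chrono::steady_clock::now();
  for (std::size_t it = 0; it < iterations; ++it) {
    for (std::size_t i = 0; i < rows_per_batch; ++i) {
      out[i] = -in[i];
    }
  }
  const auto stop = std::chrono::steady_clock::now();

  // Reading the output afterwards keeps the result observable.
  const std::int64_t checksum =
      std::accumulate(out.begin(), out.end(), std::int64_t{0});
  return {checksum,
          std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}
```

An optimizing compiler may still collapse the repeated iterations in a hand-rolled loop like this, so a real measurement would more likely rely on a benchmark framework's facilities for preventing that.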

Example output:

{{ExecuteScalarExpressionOverhead/complex_expression/rows_per_batch:1000/real_time/threads:1        3328161 ns      3326213 ns         1277 batches_per_second=300.466k/s rows_per_second=300.466M/s 
ExecuteScalarExpressionOverhead/complex_expression/rows_per_batch:1000/real_time/threads:16        754880 ns     11940432 ns         5680 batches_per_second=1.32471M/s rows_per_second=1.32471G/s 
ExecuteScalarExpressionOverhead/complex_expression/rows_per_batch:10000/real_time/threads:1       1370993 ns      1370182 ns         3047 batches_per_second=72.9398k/s rows_per_second=729.398M/s 
ExecuteScalarExpressionOverhead/complex_expression/rows_per_batch:10000/real_time/threads:16       213412 ns      3377187 ns        20608 batches_per_second=468.578k/s rows_per_second=4.68578G/s 
ExecuteScalarExpressionOverhead/complex_expression/rows_per_batch:100000/real_time/threads:1      1194552 ns      1192163 ns         3494 batches_per_second=8.37134k/s rows_per_second=837.134M/s 
ExecuteScalarExpressionOverhead/complex_expression/rows_per_batch:100000/real_time/threads:16      193390 ns      3047981 ns        22576 batches_per_second=51.709k/s rows_per_second=5.1709G/s 
ExecuteScalarExpressionOverhead/complex_expression/rows_per_batch:1000000/real_time/threads:1     1243416 ns      1240591 ns         3325 batches_per_second=804.236/s rows_per_second=804.236M/s 
ExecuteScalarExpressionOverhead/complex_expression/rows_per_batch:1000000/real_time/threads:16     449956 ns      7057594 ns         9216 batches_per_second=2.22244k/s rows_per_second=2.22244G/s 
ExecuteScalarExpressionOverhead/simple_expression/rows_per_batch:1000/real_time/threads:1         1153192 ns      1151060 ns         3580 batches_per_second=867.158k/s rows_per_second=867.158M/s 
ExecuteScalarExpressionOverhead/simple_expression/rows_per_batch:1000/real_time/threads:16         297876 ns      4705702 ns        15152 batches_per_second=3.3571M/s rows_per_second=3.3571G/s 
ExecuteScalarExpressionOverhead/simple_expression/rows_per_batch:10000/real_time/threads:1         519083 ns       518087 ns         8027 batches_per_second=192.647k/s rows_per_second=1.92647G/s 
ExecuteScalarExpressionOverhead/simple_expression/rows_per_batch:10000/real_time/threads:16         70329 ns      1106796 ns        62320 batches_per_second=1.42189M/s rows_per_second=14.2189G/s 
ExecuteScalarExpressionOverhead/simple_expression/rows_per_batch:100000/real_time/threads:1        420460 ns       419404 ns         9878 batches_per_second=23.7835k/s rows_per_second=2.37835G/s 
ExecuteScalarExpressionOverhead/simple_expression/rows_per_batch:100000/real_time/threads:16        75645 ns      1189925 ns        56864 batches_per_second=132.196k/s rows_per_second=13.2196G/s 
ExecuteScalarExpressionOverhead/simple_expression/rows_per_batch:1000000/real_time/threads:1       425360 ns       424499 ns         9404 batches_per_second=2.35095k/s rows_per_second=2.35095G/s 
ExecuteScalarExpressionOverhead/simple_expression/rows_per_batch:1000000/real_time/threads:16     1057920 ns     16308254 ns         3984 batches_per_second=945.251/s rows_per_second=945.251M/s}}

{{baseline:
ExecuteScalarExpressionBaseline<ComplexExpressionBaseline>/rows_per_batch:1000/real_time/threads:1         876620 ns       876032 ns         4787 batches_per_second=1.14075M/s rows_per_second=1.14075G/s
ExecuteScalarExpressionBaseline<ComplexExpressionBaseline>/rows_per_batch:1000/real_time/threads:16        106371 ns      1657205 ns        41536 batches_per_second=9.40109M/s rows_per_second=9.40109G/s 
ExecuteScalarExpressionBaseline<ComplexExpressionBaseline>/rows_per_batch:10000/real_time/threads:1        993787 ns       993262 ns         4219 batches_per_second=100.625k/s rows_per_second=1006.25M/s 
ExecuteScalarExpressionBaseline<ComplexExpressionBaseline>/rows_per_batch:10000/real_time/threads:16       114770 ns      1812652 ns        37520 batches_per_second=871.311k/s rows_per_second=8.71311G/s 
ExecuteScalarExpressionBaseline<ComplexExpressionBaseline>/rows_per_batch:100000/real_time/threads:1       996150 ns       995562 ns         4209 batches_per_second=10.0386k/s rows_per_second=1003.86M/s 
ExecuteScalarExpressionBaseline<ComplexExpressionBaseline>/rows_per_batch:100000/real_time/threads:16      122580 ns      1936209 ns        35168 batches_per_second=81.5791k/s rows_per_second=8.15791G/s 
ExecuteScalarExpressionBaseline<ComplexExpressionBaseline>/rows_per_batch:1000000/real_time/threads:1      988198 ns       987316 ns         4231 batches_per_second=1011.94/s rows_per_second=1011.94M/s 
ExecuteScalarExpressionBaseline<ComplexExpressionBaseline>/rows_per_batch:1000000/real_time/threads:16     445864 ns      6984471 ns         9296 batches_per_second=2.24284k/s rows_per_second=2.24284G/s 
ExecuteScalarExpressionBaseline<SimpleExpressionBaseline>/rows_per_batch:1000/real_time/threads:1          362262 ns       361985 ns        11352 batches_per_second=2.76043M/s rows_per_second=2.76043G/s 
ExecuteScalarExpressionBaseline<SimpleExpressionBaseline>/rows_per_batch:1000/real_time/threads:16          40944 ns       646932 ns       105312 batches_per_second=24.4234M/s rows_per_second=24.4234G/s 
ExecuteScalarExpressionBaseline<SimpleExpressionBaseline>/rows_per_batch:10000/real_time/threads:1         375894 ns       375244 ns        11230 batches_per_second=266.032k/s rows_per_second=2.66032G/s 
ExecuteScalarExpressionBaseline<SimpleExpressionBaseline>/rows_per_batch:10000/real_time/threads:16         44526 ns       703275 ns        96704 batches_per_second=2.2459M/s rows_per_second=22.459G/s 
ExecuteScalarExpressionBaseline<SimpleExpressionBaseline>/rows_per_batch:100000/real_time/threads:1        377450 ns       376698 ns        11013 batches_per_second=26.4936k/s rows_per_second=2.64936G/s 
ExecuteScalarExpressionBaseline<SimpleExpressionBaseline>/rows_per_batch:100000/real_time/threads:16        67216 ns      1054881 ns        62400 batches_per_second=148.774k/s rows_per_second=14.8774G/s 
ExecuteScalarExpressionBaseline<SimpleExpressionBaseline>/rows_per_batch:1000000/real_time/threads:1       396841 ns       396078 ns        10461 batches_per_second=2.5199k/s rows_per_second=2.5199G/s 
ExecuteScalarExpressionBaseline<SimpleExpressionBaseline>/rows_per_batch:1000000/real_time/threads:16     1046650 ns     16071057 ns         4016 batches_per_second=955.429/s rows_per_second=955.429M/s

}}
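
To read the two listings against each other, it helps to note that rows_per_second is batches_per_second multiplied by rows_per_batch, and that the gap between Arrow and the baseline is the ratio of the two rows_per_second values. The helpers below are a hypothetical sketch of that arithmetic, not part of the benchmark code.

```cpp
#include <cassert>

// rows_per_second as reported in the output: batches/s times rows per batch.
double RowsPerSecond(double batches_per_second, double rows_per_batch) {
  return batches_per_second * rows_per_batch;
}

// How many times faster the pure C++ baseline is than the Arrow expression.
double ThroughputRatio(double baseline_rows_per_second,
                       double arrow_rows_per_second) {
  assert(arrow_rows_per_second > 0.0);
  return baseline_rows_per_second / arrow_rows_per_second;
}
```

For complex_expression at rows_per_batch:1000 and threads:1, the output above reports 300.466M rows/s for Arrow and 1.14075G rows/s for the baseline, so ThroughputRatio(1.14075e9, 300.466e6) comes out at roughly 3.8.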


