Posted to commits@arrow.apache.org by yi...@apache.org on 2022/08/05 03:00:53 UTC
[arrow] branch master updated: ARROW-17305: [C++] Avoid spending time in popcount in BitmapAnd benchmark (#13794)
This is an automated email from the ASF dual-hosted git repository.
yibocai pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push:
new 56e6caf07d ARROW-17305: [C++] Avoid spending time in popcount in BitmapAnd benchmark (#13794)
56e6caf07d is described below
commit 56e6caf07d77a4d4c79a20c558c2618efe7de830
Author: Antoine Pitrou <an...@python.org>
AuthorDate: Fri Aug 5 05:00:46 2022 +0200
ARROW-17305: [C++] Avoid spending time in popcount in BitmapAnd benchmark (#13794)
The popcount of the result, computed inside the timed loop, was artificially limiting the reported performance of BitmapAnd.
Before:
```
--------------------------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------
BenchmarkBitmapAnd/32768/0        1708 ns         1708 ns       408579 bytes_per_second=17.8726G/s
BenchmarkBitmapAnd/131072/0       6968 ns         6965 ns       102223 bytes_per_second=17.5262G/s
BenchmarkBitmapAnd/32768/1        3982 ns         3981 ns       175136 bytes_per_second=7.66574G/s
BenchmarkBitmapAnd/131072/1      15574 ns        15569 ns        44988 bytes_per_second=7.8404G/s
BenchmarkBitmapAnd/32768/2        3999 ns         3998 ns       175021 bytes_per_second=7.63248G/s
BenchmarkBitmapAnd/131072/2      15589 ns        15585 ns        44844 bytes_per_second=7.83234G/s
```
After:
```
--------------------------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------
BenchmarkBitmapAnd/32768/0         732 ns          732 ns       967465 bytes_per_second=41.6736G/s
BenchmarkBitmapAnd/131072/0       3105 ns         3105 ns       229726 bytes_per_second=39.3198G/s
BenchmarkBitmapAnd/32768/1        2913 ns         2913 ns       240233 bytes_per_second=10.4774G/s
BenchmarkBitmapAnd/131072/1      11528 ns        11526 ns        60865 bytes_per_second=10.5912G/s
BenchmarkBitmapAnd/32768/2        2924 ns         2924 ns       236873 bytes_per_second=10.4378G/s
BenchmarkBitmapAnd/131072/2      11552 ns        11550 ns        60619 bytes_per_second=10.5691G/s
```
(I didn't check, but the compiler here probably auto-vectorizes the aligned code path)
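To put the two tables side by side, the sketch below computes the after/before throughput ratio for each benchmark row. The figures are copied from the tables above; the `Row` struct and the `Speedup` and `PrintSpeedups` helpers are hypothetical names introduced only for this illustration.

```cpp
#include <cstdio>

// One benchmark row: name plus bytes_per_second (in G/s) before and after.
struct Row {
  const char* name;
  double before_gbps;
  double after_gbps;
};

// Figures copied from the "Before" and "After" tables in this message.
constexpr Row kRows[] = {
    {"BenchmarkBitmapAnd/32768/0", 17.8726, 41.6736},
    {"BenchmarkBitmapAnd/131072/0", 17.5262, 39.3198},
    {"BenchmarkBitmapAnd/32768/1", 7.66574, 10.4774},
    {"BenchmarkBitmapAnd/131072/1", 7.8404, 10.5912},
    {"BenchmarkBitmapAnd/32768/2", 7.63248, 10.4378},
    {"BenchmarkBitmapAnd/131072/2", 7.83234, 10.5691},
};

// Ratio of reported throughput after the change to throughput before it.
constexpr double Speedup(const Row& row) {
  return row.after_gbps / row.before_gbps;
}

// Print one line per benchmark row with its throughput ratio.
void PrintSpeedups() {
  for (const Row& row : kRows) {
    std::printf("%-30s %.2fx\n", row.name, Speedup(row));
  }
}
```

The rows whose last argument is 0 gain the most, roughly a factor of two or more, while the other rows gain noticeably less. That fits the remark above about the aligned code path, assuming the last argument selects the alignment case.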
Authored-by: Antoine Pitrou <an...@python.org>
Signed-off-by: Yibo Cai <yi...@arm.com>
---
cpp/src/arrow/util/bit_util_benchmark.cc | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/cpp/src/arrow/util/bit_util_benchmark.cc b/cpp/src/arrow/util/bit_util_benchmark.cc
index 8e95d01462..3bcb4ceea6 100644
--- a/cpp/src/arrow/util/bit_util_benchmark.cc
+++ b/cpp/src/arrow/util/bit_util_benchmark.cc
@@ -150,9 +150,7 @@ static void BenchmarkAndImpl(benchmark::State& state, DoAnd&& do_and) {
   for (auto _ : state) {
     do_and({bitmap_1, bitmap_2}, &bitmap_3);
-    auto total =
-        internal::CountSetBits(bitmap_3.data(), bitmap_3.offset(), bitmap_3.length());
-    benchmark::DoNotOptimize(total);
+    benchmark::ClobberMemory();
   }
   state.SetBytesProcessed(state.iterations() * nbytes);
 }
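
The idea behind the change can be sketched without Google Benchmark or Arrow. The code below is a minimal illustration, not the Arrow implementation: `AndBitmaps`, `CountSetBitsSketch`, `ClobberMemorySketch`, `SecondsPerIteration` and `CompareVariants` are hypothetical names. `ClobberMemorySketch` is only a rough stand-in for `benchmark::ClobberMemory()`; unlike the real function it also takes a pointer so the buffer is treated as escaped. It times the same AND loop twice: once keeping the result alive by counting its set bits, and once with a compiler barrier only.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for benchmark::ClobberMemory(): tells the compiler that memory
// reachable through `p` may be read, so earlier writes cannot be dropped.
inline void ClobberMemorySketch(const void* p) {
#if defined(__GNUC__) || defined(__clang__)
  asm volatile("" : : "g"(p) : "memory");
#else
  (void)p;
  std::atomic_signal_fence(std::memory_order_acq_rel);
#endif
}

// Byte-wise AND of two bitmaps; `a` and `b` must be at least as long as `out`.
void AndBitmaps(const std::vector<std::uint8_t>& a,
                const std::vector<std::uint8_t>& b,
                std::vector<std::uint8_t>* out) {
  for (std::size_t i = 0; i < out->size(); ++i) (*out)[i] = a[i] & b[i];
}

// Naive popcount over a buffer (not Arrow's internal::CountSetBits).
std::int64_t CountSetBitsSketch(const std::vector<std::uint8_t>& v) {
  std::int64_t total = 0;
  for (std::uint8_t byte : v) {
    while (byte != 0) {
      byte = static_cast<std::uint8_t>(byte & (byte - 1));  // clear lowest set bit
      ++total;
    }
  }
  return total;
}

// Average wall-clock seconds per call of `body` over `iterations` calls.
template <typename Body>
double SecondsPerIteration(int iterations, Body&& body) {
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) body();
  auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count() / iterations;
}

struct Timings {
  double with_popcount_s;  // AND plus popcount inside the timed loop
  double with_clobber_s;   // AND plus a compiler barrier only
};

Timings CompareVariants(std::size_t nbytes, int iterations) {
  std::vector<std::uint8_t> a(nbytes, 0xF0), b(nbytes, 0x3C), out(nbytes, 0);
  Timings t;
  t.with_popcount_s = SecondsPerIteration(iterations, [&] {
    AndBitmaps(a, b, &out);
    std::int64_t total = CountSetBitsSketch(out);
    ClobberMemorySketch(&total);  // keep `total` alive, like DoNotOptimize
  });
  t.with_clobber_s = SecondsPerIteration(iterations, [&] {
    AndBitmaps(a, b, &out);
    ClobberMemorySketch(out.data());  // keep the writes to `out` alive
  });
  return t;
}
```

Calling `CompareVariants(32768, 100000)` from a `main` and printing both fields gives a rough feel for how much of the first measurement is popcount rather than the AND itself. This is a plain timing loop with none of the calibration Google Benchmark performs, so the absolute numbers should not be compared with the tables above.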