Posted to dev@geode.apache.org by Jacob Barrett <jb...@pivotal.io> on 2018/05/08 03:46:47 UTC
Geode Native Benchmarking
I got sucked into Google Benchmark [1], a C++ benchmark framework, and
decided to put together a POC benchmark project for Geode Native based on
it. You can find it in apache/geode-native PR 293 [2]. There are two new
projects, cpp-benchmark and cpp-integration-benchmark.
The cpp-benchmark is for unit or micro benchmarks that do not require a
Geode server to test, like a unit test.
The cpp-integration-benchmark is for integration-style benchmarks that
require a Geode server. It shares the new single-process integration
framework classes with the new integration tests. The integration tests,
benchmarks, and framework have been refactored and moved to make this
relationship clear.
Please consider giving this some review and your blessing if you think
there is benefit in integrating it into the trunk.
Skip this part if you aren't interested in some benchmark results from each
platform.
As warned, here are some quick benchmarks from the sample included in this
PR. It tests the time spent taking a Java String compatible hash code of a
C++ std::string in UTF-8 vs. a std::u16string in UTF-16. Since the Java
String hash code is computed on the individual UTF-16 code units
encapsulated in the Java String, we must convert from UTF-8 to UTF-16 to
hash the UTF-8 std::string. We have not optimized this to calculate the
hash inline with the conversion; instead we take the brute force approach
of converting to UTF-16 and then hashing the UTF-16. This benchmark shows
us that optimizing this method could give us a significant improvement,
with UTF-8 hash times landing somewhere between the current UTF-8 and
UTF-16 benchmark values. The benchmark name is formatted as
[fixture]/[test]/[variant], for example GeodeHashBM/std_string/8 is the
GeodeHashBM fixture (think test fixture), the std_string test, with a
string of 8 characters. This is achieved by defining the test like this:
BENCHMARK_DEFINE_F(GeodeHashBM, std_string)(benchmark::State& state) {
  std::string x(state.range(0), 'x');
  for (auto _ : state) {
    int hashcode;
    benchmark::DoNotOptimize(hashcode = geode_hash<std::string>{}(x));
  }
}
BENCHMARK_REGISTER_F(GeodeHashBM, std_string)->Range(8, 8 << 10);
Please read the details of Google Benchmark to understand how it does what
it does, or accept that it does some magic to warm up the code and then
runs the bit of code inside the range-based for loop in groups of many
iterations (seen in the results below) to calculate a mean time for each
individual execution of that block.
The results are...
*Windows - AWS c5.2xlarge*
------------------------
Run on (8 X 3000 MHz CPU s)
CPU Caches:
L1 Data 32K (x4)
L1 Instruction 32K (x4)
L2 Unified 1048K (x4)
L3 Unified 25952K (x1)
----------------------------------------------------------------------
Benchmark                               Time           CPU Iterations
----------------------------------------------------------------------
GeodeHashBM/std_string/8            28353 ns      28250 ns      24889
GeodeHashBM/std_string/64           27747 ns      27623 ns      24889
GeodeHashBM/std_string/512          31941 ns      32087 ns      22400
GeodeHashBM/std_string/4096         62409 ns      61384 ns      11200
GeodeHashBM/std_string/8192         98233 ns      97656 ns       6400
GeodeHashBM/std_u16string/8             7 ns          7 ns   89600000
GeodeHashBM/std_u16string/64           59 ns         59 ns   11200000
GeodeHashBM/std_u16string/512         593 ns        600 ns    1120000
GeodeHashBM/std_u16string/4096       4816 ns       4757 ns     144516
GeodeHashBM/std_u16string/8192       9641 ns       9835 ns      74667
We can see here that Windows has really poor performance converting from
UTF-8 to UTF-16 when compared to the other results below. This is likely
due to bugs in the C++ runtime Microsoft ships that are supposed to be
addressed in the 2019 release.
*Linux - AWS c5.2xlarge*
----------------------
Run on (8 X 3000 MHz CPU s)
CPU Caches:
L1 Data 32K (x4)
L1 Instruction 32K (x4)
L2 Unified 1024K (x4)
L3 Unified 25344K (x1)
----------------------------------------------------------------------
Benchmark                               Time           CPU Iterations
----------------------------------------------------------------------
GeodeHashBM/std_string/8              116 ns        116 ns    6042733
GeodeHashBM/std_string/64             488 ns        488 ns    1434877
GeodeHashBM/std_string/512           2612 ns       2612 ns     268202
GeodeHashBM/std_string/4096         19639 ns      19639 ns      35640
GeodeHashBM/std_string/8192         39025 ns      39025 ns      17930
GeodeHashBM/std_u16string/8             5 ns          5 ns  133394926
GeodeHashBM/std_u16string/64           47 ns         47 ns   14862333
GeodeHashBM/std_u16string/512         449 ns        449 ns    1558067
GeodeHashBM/std_u16string/4096       3628 ns       3628 ns     192958
GeodeHashBM/std_u16string/8192       7260 ns       7261 ns      96415
*2013 MacBook Pro*
-------------------
Run on (8 X 2600 MHz CPU s)
CPU Caches:
L1 Data 32K (x4)
L1 Instruction 32K (x4)
L2 Unified 262K (x4)
L3 Unified 6291K (x1)
----------------------------------------------------------------------
Benchmark                               Time           CPU Iterations
----------------------------------------------------------------------
GeodeHashBM/std_string/8              251 ns        250 ns    2941424
GeodeHashBM/std_string/64             420 ns        420 ns    1528121
GeodeHashBM/std_string/512           1618 ns       1618 ns     423334
GeodeHashBM/std_string/4096         11548 ns      11546 ns      59040
GeodeHashBM/std_string/8192         23448 ns      23433 ns      29394
GeodeHashBM/std_u16string/8             5 ns          5 ns  100000000
GeodeHashBM/std_u16string/64           50 ns         50 ns   13873749
GeodeHashBM/std_u16string/512         430 ns        430 ns    1639302
GeodeHashBM/std_u16string/4096       3441 ns       3441 ns     198727
GeodeHashBM/std_u16string/8192       6856 ns       6854 ns      92481
*Solaris x86*
-----------
Run on (40 X 1200 MHz CPU s)
----------------------------------------------------------------------
Benchmark                               Time           CPU Iterations
----------------------------------------------------------------------
GeodeHashBM/std_string/8              276 ns        276 ns    2651445
GeodeHashBM/std_string/64             622 ns        622 ns    1126144
GeodeHashBM/std_string/512           3140 ns       3139 ns     222032
GeodeHashBM/std_string/4096         23368 ns      23352 ns      30304
GeodeHashBM/std_string/8192         46586 ns      46559 ns      15309
GeodeHashBM/std_u16string/8             9 ns          9 ns   77951870
GeodeHashBM/std_u16string/64           54 ns         54 ns   12807846
GeodeHashBM/std_u16string/512         467 ns        467 ns    1531799
GeodeHashBM/std_u16string/4096       3764 ns       3763 ns     183405
GeodeHashBM/std_u16string/8192       7407 ns       7405 ns      95226
*Solaris SPARC*
-------------
Run on (128 X 4267 MHz CPU s)
----------------------------------------------------------------------
Benchmark                               Time           CPU Iterations
----------------------------------------------------------------------
GeodeHashBM/std_string/8              771 ns        771 ns     844136
GeodeHashBM/std_string/64            1446 ns       1446 ns     492403
GeodeHashBM/std_string/512           6058 ns       6058 ns     116836
GeodeHashBM/std_string/4096         43661 ns      43656 ns      16377
GeodeHashBM/std_string/8192         85366 ns      85367 ns       8299
GeodeHashBM/std_u16string/8             8 ns          8 ns   86601509
GeodeHashBM/std_u16string/64           47 ns         47 ns   14928524
GeodeHashBM/std_u16string/512         374 ns        374 ns    1892485
GeodeHashBM/std_u16string/4096       2950 ns       2950 ns     242241
GeodeHashBM/std_u16string/8192       5837 ns       5837 ns     121206
Again we see Solaris, especially SPARC, has rather poor conversion from
UTF-8 to UTF-16 in the C++ runtime.
And just in case you ever doubted that a debug build and release build
could be that different...
*Windows - AWS c5.2xlarge*
------------------------
Run on (8 X 3000 MHz CPU s)
CPU Caches:
L1 Data 32K (x4)
L1 Instruction 32K (x4)
L2 Unified 1048K (x4)
L3 Unified 25952K (x1)
***WARNING*** Library was built as DEBUG. Timings may be affected.
----------------------------------------------------------------------
Benchmark                               Time           CPU Iterations
----------------------------------------------------------------------
GeodeHashBM/std_string/8           142896 ns     141246 ns       4978
GeodeHashBM/std_string/64          160902 ns     161122 ns       4073
GeodeHashBM/std_string/512         258949 ns     260911 ns       2635
GeodeHashBM/std_string/4096       1034958 ns    1045850 ns        747
GeodeHashBM/std_string/8192       1941581 ns    1902174 ns        345
GeodeHashBM/std_u16string/8          1533 ns       1535 ns     448000
GeodeHashBM/std_u16string/64         8286 ns       8371 ns      89600
GeodeHashBM/std_u16string/512       62726 ns      62779 ns      11200
GeodeHashBM/std_u16string/4096     498803 ns     488281 ns       1120
GeodeHashBM/std_u16string/8192     997531 ns     976563 ns        640
Compare this debug run of 1535 ns to the release run of 7 ns for a UTF-16
string of 8 characters. Ouch!
-Jake
[1] https://github.com/google/benchmark
[2] https://github.com/apache/geode-native/pull/293