Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/03/23 06:54:08 UTC
[GitHub] [incubator-pinot] mqliang opened a new issue #6714: Benchmark data table serialization logic and pre-allocate byte[] array if need be
mqliang opened a new issue #6714:
URL: https://github.com/apache/incubator-pinot/issues/6714
As @siddharthteotia pointed out in https://github.com/apache/incubator-pinot/pull/6710#discussion_r599240463_
> Serialization functions first write to a temporary output stream and then convert it to a byte array, which is returned to the caller and written to the main stream. I think the reason for doing that is that we don't know upfront the length of the byte[] array to allocate.
> However, we can probably do it differently, and it might be faster:
> * Write a loop to go over each entry and keep a running sum of the size
> * At the end of the loop, allocate a byte array of that size
> * Start another loop, go over each entry again, and fill out the pre-allocated byte array
> * Return the filled byte array
We need to benchmark these two serialization approaches. If the proposed approach is better, I will send a PR to address it.
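For illustration, the two approaches can be sketched roughly as below. This is a minimal sketch and not the actual Pinot code: the class name `MapSerializationSketch`, the method names, and the wire layout (an entry count followed by length-prefixed UTF-8 keys and values) are all assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; names and wire layout are assumptions, not Pinot's API.
final class MapSerializationSketch {

  // Current style: write into a growable temporary stream, then copy it out.
  static byte[] viaTemporaryStream(Map<String, String> map) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(map.size());
      for (Map.Entry<String, String> entry : map.entrySet()) {
        byte[] key = entry.getKey().getBytes(StandardCharsets.UTF_8);
        byte[] value = entry.getValue().getBytes(StandardCharsets.UTF_8);
        out.writeInt(key.length);
        out.write(key);
        out.writeInt(value.length);
        out.write(value);
      }
    } catch (IOException e) {
      // Not expected for an in-memory stream; rethrown unchecked to keep the sketch short.
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Proposed style: the first loop sums the sizes, the second loop fills an exact-size array.
  // Note that every string is encoded once per loop, i.e. twice in total.
  static byte[] viaPreAllocatedArray(Map<String, String> map) {
    int size = Integer.BYTES; // entry count
    for (Map.Entry<String, String> entry : map.entrySet()) {
      size += Integer.BYTES + entry.getKey().getBytes(StandardCharsets.UTF_8).length;
      size += Integer.BYTES + entry.getValue().getBytes(StandardCharsets.UTF_8).length;
    }
    byte[] result = new byte[size];
    ByteBuffer buffer = ByteBuffer.wrap(result); // big-endian, same as DataOutputStream
    buffer.putInt(map.size());
    for (Map.Entry<String, String> entry : map.entrySet()) {
      byte[] key = entry.getKey().getBytes(StandardCharsets.UTF_8);
      byte[] value = entry.getValue().getBytes(StandardCharsets.UTF_8);
      buffer.putInt(key.length).put(key).putInt(value.length).put(value);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> metadata = new LinkedHashMap<>();
    metadata.put("numDocsScanned", "12345");
    byte[] a = viaTemporaryStream(metadata);
    byte[] b = viaPreAllocatedArray(metadata);
    System.out.println("same bytes: " + Arrays.equals(a, b));
  }
}
```

Both methods should produce identical bytes; what the benchmark has to decide is whether avoiding the stream's internal growth and final copy outweighs the cost of the extra pass.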
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [incubator-pinot] mqliang commented on issue #6714: Benchmark data table serialization logic and pre-allocate byte[] array if need be
Posted by GitBox <gi...@apache.org>.
mqliang commented on issue #6714:
URL: https://github.com/apache/incubator-pinot/issues/6714#issuecomment-806403959
I wrote a benchmark here: https://github.com/mqliang/incubator-pinot/commit/7892423579b20dafcb5802a09f20f826377f6c39
The benchmark compares three serialization methods, each serializing a typical metadata map:
* `temporaryOutputStream`: for each K/V pair in the metadata, first write to a temporary output stream, then convert it to a byte array that is returned to the caller and written to the main stream.
* `preAllocateByteArrayNative`:
  * Loop over each entry and keep a running sum of the size.
  * At the end of the loop, allocate a byte array of that size.
  * Start another loop, go over each entry again, and fill out the pre-allocated byte array.
  * Return the filled byte array.
  * Keys and values are encoded twice, once in each loop.
* `preAllocateByteArrayWithBytesCache`: same logic as `preAllocateByteArrayNative`, plus a cache that holds the encoded K/V bytes from the first loop so that the second loop can reuse them.
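The cached variant could look roughly like the sketch below. It is not taken from the linked commit: the class name `CachedBytesSerializationSketch`, the `byte[][]` cache, and the wire layout (an entry count followed by length-prefixed UTF-8 keys and values) are assumptions chosen for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; names and wire layout are assumptions, not the benchmark's code.
final class CachedBytesSerializationSketch {

  static byte[] serialize(Map<String, String> map) {
    // Cache of encoded bytes: key at index 2*i, value at index 2*i + 1.
    byte[][] encoded = new byte[map.size() * 2][];
    int size = Integer.BYTES; // entry count
    int index = 0;

    // First loop: encode each key and value once, remember the bytes, sum the sizes.
    for (Map.Entry<String, String> entry : map.entrySet()) {
      byte[] key = entry.getKey().getBytes(StandardCharsets.UTF_8);
      byte[] value = entry.getValue().getBytes(StandardCharsets.UTF_8);
      encoded[index++] = key;
      encoded[index++] = value;
      size += 2 * Integer.BYTES + key.length + value.length;
    }

    // Second loop: copy the cached bytes into the exact-size array, with no re-encoding.
    byte[] result = new byte[size];
    ByteBuffer buffer = ByteBuffer.wrap(result);
    buffer.putInt(map.size());
    for (byte[] bytes : encoded) {
      buffer.putInt(bytes.length).put(bytes);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> metadata = new LinkedHashMap<>();
    metadata.put("numDocsScanned", "12345");
    System.out.println("serialized length: " + serialize(metadata).length);
  }
}
```

The trade-off in this sketch is that the cache keeps every encoded key and value reachable until the final array is filled, in exchange for encoding each string only once.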
Here is the result:
```
# JMH version: 1.26
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/mqliang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/203.7717.56/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=65146:/Users/mqliang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/203.7717.56/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 1 iterations, 10 s each
# Measurement: 5 iterations, 30 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative
# Run progress: 0.00% complete, ETA 00:08:00
# Fork: 1 of 1
# Warmup Iteration 1: 552.178 us/op
Iteration 1: 519.531 us/op
·gc.alloc.rate: 3270.480 MB/sec
·gc.alloc.rate.norm: 1811608.009 B/op
·gc.churn.PS_Eden_Space: 3275.114 MB/sec
·gc.churn.PS_Eden_Space.norm: 1814175.318 B/op
·gc.churn.PS_Survivor_Space: 0.558 MB/sec
·gc.churn.PS_Survivor_Space.norm: 309.168 B/op
·gc.count: 525.000 counts
·gc.time: 261.000 ms
Iteration 2: 524.659 us/op
·gc.alloc.rate: 3238.871 MB/sec
·gc.alloc.rate.norm: 1811608.011 B/op
·gc.churn.PS_Eden_Space: 3242.901 MB/sec
·gc.churn.PS_Eden_Space.norm: 1813862.347 B/op
·gc.churn.PS_Survivor_Space: 0.563 MB/sec
·gc.churn.PS_Survivor_Space.norm: 314.968 B/op
·gc.count: 516.000 counts
·gc.time: 263.000 ms
Iteration 3: 526.323 us/op
·gc.alloc.rate: 3228.230 MB/sec
·gc.alloc.rate.norm: 1811608.008 B/op
·gc.churn.PS_Eden_Space: 3232.024 MB/sec
·gc.churn.PS_Eden_Space.norm: 1813736.682 B/op
·gc.churn.PS_Survivor_Space: 0.471 MB/sec
·gc.churn.PS_Survivor_Space.norm: 264.539 B/op
·gc.count: 470.000 counts
·gc.time: 254.000 ms
Iteration 4: 521.779 us/op
·gc.alloc.rate: 3256.320 MB/sec
·gc.alloc.rate.norm: 1811608.008 B/op
·gc.churn.PS_Eden_Space: 3261.433 MB/sec
·gc.churn.PS_Eden_Space.norm: 1814452.617 B/op
·gc.churn.PS_Survivor_Space: 0.560 MB/sec
·gc.churn.PS_Survivor_Space.norm: 311.772 B/op
·gc.count: 534.000 counts
·gc.time: 270.000 ms
Iteration 5: 524.474 us/op
·gc.alloc.rate: 3239.855 MB/sec
·gc.alloc.rate.norm: 1811608.008 B/op
·gc.churn.PS_Eden_Space: 3242.045 MB/sec
·gc.churn.PS_Eden_Space.norm: 1812832.659 B/op
·gc.churn.PS_Survivor_Space: 0.547 MB/sec
·gc.churn.PS_Survivor_Space.norm: 305.975 B/op
·gc.count: 483.000 counts
·gc.time: 255.000 ms
Result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative":
523.353 ±(99.9%) 10.345 us/op [Average]
(min, avg, max) = (519.531, 523.353, 526.323), stdev = 2.687
CI (99.9%): [513.008, 533.698] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.alloc.rate":
3246.751 ±(99.9%) 64.066 MB/sec [Average]
(min, avg, max) = (3228.230, 3246.751, 3270.480), stdev = 16.638
CI (99.9%): [3182.685, 3310.818] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.alloc.rate.norm":
1811608.009 ±(99.9%) 0.005 B/op [Average]
(min, avg, max) = (1811608.008, 1811608.009, 1811608.011), stdev = 0.001
CI (99.9%): [1811608.003, 1811608.014] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Eden_Space":
3250.704 ±(99.9%) 66.578 MB/sec [Average]
(min, avg, max) = (3232.024, 3250.704, 3275.114), stdev = 17.290
CI (99.9%): [3184.126, 3317.282] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Eden_Space.norm":
1813811.924 ±(99.9%) 2365.646 B/op [Average]
(min, avg, max) = (1812832.659, 1813811.924, 1814452.617), stdev = 614.351
CI (99.9%): [1811446.279, 1816177.570] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Survivor_Space":
0.540 ±(99.9%) 0.150 MB/sec [Average]
(min, avg, max) = (0.471, 0.540, 0.563), stdev = 0.039
CI (99.9%): [0.390, 0.690] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Survivor_Space.norm":
301.285 ±(99.9%) 80.118 B/op [Average]
(min, avg, max) = (264.539, 301.285, 314.968), stdev = 20.806
CI (99.9%): [221.166, 381.403] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.count":
2528.000 ±(99.9%) 0.001 counts [Sum]
(min, avg, max) = (470.000, 505.600, 534.000), stdev = 27.700
CI (99.9%): [2528.000, 2528.000] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.time":
1303.000 ±(99.9%) 0.001 ms [Sum]
(min, avg, max) = (254.000, 260.600, 270.000), stdev = 6.504
CI (99.9%): [1303.000, 1303.000] (assumes normal distribution)
# JMH version: 1.26
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/mqliang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/203.7717.56/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=65146:/Users/mqliang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/203.7717.56/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 1 iterations, 10 s each
# Measurement: 5 iterations, 30 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache
# Run progress: 33.33% complete, ETA 00:05:28
# Fork: 1 of 1
# Warmup Iteration 1: 390.616 us/op
Iteration 1: 375.676 us/op
·gc.alloc.rate: 3524.091 MB/sec
·gc.alloc.rate.norm: 1411608.008 B/op
·gc.churn.PS_Eden_Space: 3532.601 MB/sec
·gc.churn.PS_Eden_Space.norm: 1415016.587 B/op
·gc.churn.PS_Survivor_Space: 0.538 MB/sec
·gc.churn.PS_Survivor_Space.norm: 215.400 B/op
·gc.count: 458.000 counts
·gc.time: 248.000 ms
Iteration 2: 375.171 us/op
·gc.alloc.rate: 3528.907 MB/sec
·gc.alloc.rate.norm: 1411608.006 B/op
·gc.churn.PS_Eden_Space: 3534.356 MB/sec
·gc.churn.PS_Eden_Space.norm: 1413787.624 B/op
·gc.churn.PS_Survivor_Space: 0.494 MB/sec
·gc.churn.PS_Survivor_Space.norm: 197.609 B/op
·gc.count: 435.000 counts
·gc.time: 247.000 ms
Iteration 3: 373.233 us/op
·gc.alloc.rate: 3547.720 MB/sec
·gc.alloc.rate.norm: 1411608.005 B/op
·gc.churn.PS_Eden_Space: 3559.728 MB/sec
·gc.churn.PS_Eden_Space.norm: 1416385.929 B/op
·gc.churn.PS_Survivor_Space: 0.539 MB/sec
·gc.churn.PS_Survivor_Space.norm: 214.343 B/op
·gc.count: 462.000 counts
·gc.time: 247.000 ms
Iteration 4: 371.186 us/op
·gc.alloc.rate: 3567.068 MB/sec
·gc.alloc.rate.norm: 1411608.006 B/op
·gc.churn.PS_Eden_Space: 3566.702 MB/sec
·gc.churn.PS_Eden_Space.norm: 1411463.405 B/op
·gc.churn.PS_Survivor_Space: 0.597 MB/sec
·gc.churn.PS_Survivor_Space.norm: 236.411 B/op
·gc.count: 520.000 counts
·gc.time: 271.000 ms
Iteration 5: 370.738 us/op
·gc.alloc.rate: 3571.354 MB/sec
·gc.alloc.rate.norm: 1411608.005 B/op
·gc.churn.PS_Eden_Space: 3582.874 MB/sec
·gc.churn.PS_Eden_Space.norm: 1416161.234 B/op
·gc.churn.PS_Survivor_Space: 0.588 MB/sec
·gc.churn.PS_Survivor_Space.norm: 232.322 B/op
·gc.count: 509.000 counts
·gc.time: 262.000 ms
Result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache":
373.201 ±(99.9%) 8.639 us/op [Average]
(min, avg, max) = (370.738, 373.201, 375.676), stdev = 2.243
CI (99.9%): [364.562, 381.840] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.alloc.rate":
3547.828 ±(99.9%) 82.702 MB/sec [Average]
(min, avg, max) = (3524.091, 3547.828, 3571.354), stdev = 21.477
CI (99.9%): [3465.126, 3630.530] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.alloc.rate.norm":
1411608.006 ±(99.9%) 0.005 B/op [Average]
(min, avg, max) = (1411608.005, 1411608.006, 1411608.008), stdev = 0.001
CI (99.9%): [1411608.001, 1411608.011] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Eden_Space":
3555.252 ±(99.9%) 83.120 MB/sec [Average]
(min, avg, max) = (3532.601, 3555.252, 3582.874), stdev = 21.586
CI (99.9%): [3472.132, 3638.373] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Eden_Space.norm":
1414562.956 ±(99.9%) 7771.212 B/op [Average]
(min, avg, max) = (1411463.405, 1414562.956, 1416385.929), stdev = 2018.159
CI (99.9%): [1406791.744, 1422334.168] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Survivor_Space":
0.551 ±(99.9%) 0.162 MB/sec [Average]
(min, avg, max) = (0.494, 0.551, 0.597), stdev = 0.042
CI (99.9%): [0.389, 0.713] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Survivor_Space.norm":
219.217 ±(99.9%) 60.046 B/op [Average]
(min, avg, max) = (197.609, 219.217, 236.411), stdev = 15.594
CI (99.9%): [159.171, 279.263] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.count":
2384.000 ±(99.9%) 0.001 counts [Sum]
(min, avg, max) = (435.000, 476.800, 520.000), stdev = 36.134
CI (99.9%): [2384.000, 2384.000] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.time":
1275.000 ±(99.9%) 0.001 ms [Sum]
(min, avg, max) = (247.000, 255.000, 271.000), stdev = 10.977
CI (99.9%): [1275.000, 1275.000] (assumes normal distribution)
# JMH version: 1.26
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/mqliang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/203.7717.56/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=65146:/Users/mqliang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/203.7717.56/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 1 iterations, 10 s each
# Measurement: 5 iterations, 30 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream
# Run progress: 66.67% complete, ETA 00:02:44
# Fork: 1 of 1
# Warmup Iteration 1: 483.366 us/op
Iteration 1: 408.351 us/op
·gc.alloc.rate: 3580.758 MB/sec
·gc.alloc.rate.norm: 1558808.007 B/op
·gc.churn.PS_Eden_Space: 3586.078 MB/sec
·gc.churn.PS_Eden_Space.norm: 1561123.846 B/op
·gc.churn.PS_Survivor_Space: 0.511 MB/sec
·gc.churn.PS_Survivor_Space.norm: 222.603 B/op
·gc.count: 476.000 counts
·gc.time: 253.000 ms
Iteration 2: 410.342 us/op
·gc.alloc.rate: 3563.256 MB/sec
·gc.alloc.rate.norm: 1558808.009 B/op
·gc.churn.PS_Eden_Space: 3569.765 MB/sec
·gc.churn.PS_Eden_Space.norm: 1561655.686 B/op
·gc.churn.PS_Survivor_Space: 0.451 MB/sec
·gc.churn.PS_Survivor_Space.norm: 197.394 B/op
·gc.count: 409.000 counts
·gc.time: 244.000 ms
Iteration 3: 407.314 us/op
·gc.alloc.rate: 3589.291 MB/sec
·gc.alloc.rate.norm: 1558808.006 B/op
·gc.churn.PS_Eden_Space: 3592.335 MB/sec
·gc.churn.PS_Eden_Space.norm: 1560130.076 B/op
·gc.churn.PS_Survivor_Space: 0.557 MB/sec
·gc.churn.PS_Survivor_Space.norm: 241.833 B/op
·gc.count: 495.000 counts
·gc.time: 261.000 ms
Iteration 4: 407.294 us/op
·gc.alloc.rate: 3590.035 MB/sec
·gc.alloc.rate.norm: 1558808.006 B/op
·gc.churn.PS_Eden_Space: 3595.643 MB/sec
·gc.churn.PS_Eden_Space.norm: 1561243.143 B/op
·gc.churn.PS_Survivor_Space: 0.439 MB/sec
·gc.churn.PS_Survivor_Space.norm: 190.513 B/op
·gc.count: 382.000 counts
·gc.time: 239.000 ms
Iteration 5: 410.068 us/op
·gc.alloc.rate: 3565.783 MB/sec
·gc.alloc.rate.norm: 1558808.006 B/op
·gc.churn.PS_Eden_Space: 3576.571 MB/sec
·gc.churn.PS_Eden_Space.norm: 1563524.046 B/op
·gc.churn.PS_Survivor_Space: 0.542 MB/sec
·gc.churn.PS_Survivor_Space.norm: 236.741 B/op
·gc.count: 460.000 counts
·gc.time: 252.000 ms
Result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream":
408.674 ±(99.9%) 5.641 us/op [Average]
(min, avg, max) = (407.294, 408.674, 410.342), stdev = 1.465
CI (99.9%): [403.033, 414.314] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.alloc.rate":
3577.824 ±(99.9%) 48.952 MB/sec [Average]
(min, avg, max) = (3563.256, 3577.824, 3590.035), stdev = 12.713
CI (99.9%): [3528.873, 3626.776] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.alloc.rate.norm":
1558808.007 ±(99.9%) 0.005 B/op [Average]
(min, avg, max) = (1558808.006, 1558808.007, 1558808.009), stdev = 0.001
CI (99.9%): [1558808.002, 1558808.011] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Eden_Space":
3584.078 ±(99.9%) 41.614 MB/sec [Average]
(min, avg, max) = (3569.765, 3584.078, 3595.643), stdev = 10.807
CI (99.9%): [3542.465, 3625.692] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Eden_Space.norm":
1561535.360 ±(99.9%) 4793.590 B/op [Average]
(min, avg, max) = (1560130.076, 1561535.360, 1563524.046), stdev = 1244.880
CI (99.9%): [1556741.769, 1566328.950] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Survivor_Space":
0.500 ±(99.9%) 0.204 MB/sec [Average]
(min, avg, max) = (0.439, 0.500, 0.557), stdev = 0.053
CI (99.9%): [0.296, 0.704] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Survivor_Space.norm":
217.817 ±(99.9%) 88.656 B/op [Average]
(min, avg, max) = (190.513, 217.817, 241.833), stdev = 23.024
CI (99.9%): [129.161, 306.473] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.count":
2222.000 ±(99.9%) 0.001 counts [Sum]
(min, avg, max) = (382.000, 444.400, 495.000), stdev = 47.300
CI (99.9%): [2222.000, 2222.000] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.time":
1249.000 ±(99.9%) 0.001 ms [Sum]
(min, avg, max) = (239.000, 249.800, 261.000), stdev = 8.526
CI (99.9%): [1249.000, 1249.000] (assumes normal distribution)
# Run complete. Total time: 00:08:12
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark Mode Cnt Score Error Units
BenchmarkDataTableSerialization.preAllocateByteArrayNative avgt 5 523.353 ± 10.345 us/op
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.alloc.rate avgt 5 3246.751 ± 64.066 MB/sec
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.alloc.rate.norm avgt 5 1811608.009 ± 0.005 B/op
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Eden_Space avgt 5 3250.704 ± 66.578 MB/sec
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Eden_Space.norm avgt 5 1813811.924 ± 2365.646 B/op
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Survivor_Space avgt 5 0.540 ± 0.150 MB/sec
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Survivor_Space.norm avgt 5 301.285 ± 80.118 B/op
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.count avgt 5 2528.000 counts
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.time avgt 5 1303.000 ms
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache avgt 5 373.201 ± 8.639 us/op
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.alloc.rate avgt 5 3547.828 ± 82.702 MB/sec
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.alloc.rate.norm avgt 5 1411608.006 ± 0.005 B/op
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Eden_Space avgt 5 3555.252 ± 83.120 MB/sec
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Eden_Space.norm avgt 5 1414562.956 ± 7771.212 B/op
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Survivor_Space avgt 5 0.551 ± 0.162 MB/sec
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Survivor_Space.norm avgt 5 219.217 ± 60.046 B/op
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.count avgt 5 2384.000 counts
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.time avgt 5 1275.000 ms
BenchmarkDataTableSerialization.temporaryOutputStream avgt 5 408.674 ± 5.641 us/op
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.alloc.rate avgt 5 3577.824 ± 48.952 MB/sec
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.alloc.rate.norm avgt 5 1558808.007 ± 0.005 B/op
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Eden_Space avgt 5 3584.078 ± 41.614 MB/sec
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Eden_Space.norm avgt 5 1561535.360 ± 4793.590 B/op
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Survivor_Space avgt 5 0.500 ± 0.204 MB/sec
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Survivor_Space.norm avgt 5 217.817 ± 88.656 B/op
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.count avgt 5 2222.000 counts
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.time avgt 5 1249.000 ms
Process finished with exit code 0
```
If my implementation is correct, the benchmark result shows that the pre-allocated byte array with a cache is slightly better than the temporary output stream: it is roughly 10% faster (373.201 us/op vs. 408.674 us/op). It uses extra memory to cache the encoded K/V, of course, but GC time did not increase meaningfully (1275 ms vs. 1249 ms). It is easy to understand why `preAllocateByteArrayNative` is the slowest: it encodes each K/V twice, whereas the other two methods encode each K/V only once.
I am not sure whether we should make the change just to get an improvement of roughly 10%.
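To make the comparison concrete, the relative differences can be recomputed from the averages in the JMH summary. The snippet below is only a small arithmetic sketch; the class and method names are made up for the example, and the constants are copied from the `Score` column above.

```java
// Hypothetical helper for redoing the arithmetic; the inputs are the JMH averages reported above.
final class BenchmarkDeltaSketch {

  // Fraction by which `candidate` is lower than `baseline`.
  static double relativeReduction(double baseline, double candidate) {
    return (baseline - candidate) / baseline;
  }

  public static void main(String[] args) {
    double tempStreamUs = 408.674; // temporaryOutputStream, us/op
    double cachedUs = 373.201;     // preAllocateByteArrayWithBytesCache, us/op
    double nativeUs = 523.353;     // preAllocateByteArrayNative, us/op

    double tempStreamBytes = 1558808.007; // gc.alloc.rate.norm, B/op
    double cachedBytes = 1411608.006;     // gc.alloc.rate.norm, B/op

    System.out.printf("cached vs. temp stream, time per op: %.1f%% lower%n",
        100 * relativeReduction(tempStreamUs, cachedUs));
    System.out.printf("cached vs. temp stream, ops per unit time: %.1f%% higher%n",
        100 * (tempStreamUs / cachedUs - 1));
    System.out.printf("native vs. temp stream, time per op: %.1f%% higher%n",
        100 * (nativeUs / tempStreamUs - 1));
    System.out.printf("cached vs. temp stream, allocated bytes per op: %.1f%% lower%n",
        100 * relativeReduction(tempStreamBytes, cachedBytes));
  }
}
```

By this arithmetic the cached variant takes about 8.7% less time per op, which corresponds to about 9.5% more ops per unit time. The `gc.alloc.rate.norm` figures also suggest it allocates fewer bytes per op than the temporary stream in this run, although that metric counts allocation and says nothing about peak live memory.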
[GitHub] [incubator-pinot] mqliang edited a comment on issue #6714: Benchmark data table serialization logic and pre-allocate byte[] array if need be
Posted by GitBox <gi...@apache.org>.
mqliang edited a comment on issue #6714:
URL: https://github.com/apache/incubator-pinot/issues/6714#issuecomment-806403959
I wrote a benchmark here: https://github.com/mqliang/incubator-pinot/commit/7892423579b20dafcb5802a09f20f826377f6c39
The benchmark compares three serialization methods, each serializing a typical metadata map:
* `temporaryOutputStream`: for each K/V pair in the metadata, first write to a temporary output stream, then convert it to a byte array that is returned to the caller and written to the main stream.
* `preAllocateByteArrayNative`:
  * Loop over each entry and keep a running sum of the size.
  * At the end of the loop, allocate a byte array of that size.
  * Start another loop, go over each entry again, and fill out the pre-allocated byte array.
  * Return the filled byte array.
  * Keys and values are encoded twice, once in each loop.
* `preAllocateByteArrayWithBytesCache`: same logic as `preAllocateByteArrayNative`, plus a cache that holds the encoded K/V bytes from the first loop so that the second loop can reuse them.
Here is the result:
```
# JMH version: 1.26
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/mqliang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/203.7717.56/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=65146:/Users/mqliang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/203.7717.56/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 1 iterations, 10 s each
# Measurement: 5 iterations, 30 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative
# Run progress: 0.00% complete, ETA 00:08:00
# Fork: 1 of 1
# Warmup Iteration 1: 552.178 us/op
Iteration 1: 519.531 us/op
·gc.alloc.rate: 3270.480 MB/sec
·gc.alloc.rate.norm: 1811608.009 B/op
·gc.churn.PS_Eden_Space: 3275.114 MB/sec
·gc.churn.PS_Eden_Space.norm: 1814175.318 B/op
·gc.churn.PS_Survivor_Space: 0.558 MB/sec
·gc.churn.PS_Survivor_Space.norm: 309.168 B/op
·gc.count: 525.000 counts
·gc.time: 261.000 ms
Iteration 2: 524.659 us/op
·gc.alloc.rate: 3238.871 MB/sec
·gc.alloc.rate.norm: 1811608.011 B/op
·gc.churn.PS_Eden_Space: 3242.901 MB/sec
·gc.churn.PS_Eden_Space.norm: 1813862.347 B/op
·gc.churn.PS_Survivor_Space: 0.563 MB/sec
·gc.churn.PS_Survivor_Space.norm: 314.968 B/op
·gc.count: 516.000 counts
·gc.time: 263.000 ms
Iteration 3: 526.323 us/op
·gc.alloc.rate: 3228.230 MB/sec
·gc.alloc.rate.norm: 1811608.008 B/op
·gc.churn.PS_Eden_Space: 3232.024 MB/sec
·gc.churn.PS_Eden_Space.norm: 1813736.682 B/op
·gc.churn.PS_Survivor_Space: 0.471 MB/sec
·gc.churn.PS_Survivor_Space.norm: 264.539 B/op
·gc.count: 470.000 counts
·gc.time: 254.000 ms
Iteration 4: 521.779 us/op
·gc.alloc.rate: 3256.320 MB/sec
·gc.alloc.rate.norm: 1811608.008 B/op
·gc.churn.PS_Eden_Space: 3261.433 MB/sec
·gc.churn.PS_Eden_Space.norm: 1814452.617 B/op
·gc.churn.PS_Survivor_Space: 0.560 MB/sec
·gc.churn.PS_Survivor_Space.norm: 311.772 B/op
·gc.count: 534.000 counts
·gc.time: 270.000 ms
Iteration 5: 524.474 us/op
·gc.alloc.rate: 3239.855 MB/sec
·gc.alloc.rate.norm: 1811608.008 B/op
·gc.churn.PS_Eden_Space: 3242.045 MB/sec
·gc.churn.PS_Eden_Space.norm: 1812832.659 B/op
·gc.churn.PS_Survivor_Space: 0.547 MB/sec
·gc.churn.PS_Survivor_Space.norm: 305.975 B/op
·gc.count: 483.000 counts
·gc.time: 255.000 ms
Result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative":
523.353 ±(99.9%) 10.345 us/op [Average]
(min, avg, max) = (519.531, 523.353, 526.323), stdev = 2.687
CI (99.9%): [513.008, 533.698] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.alloc.rate":
3246.751 ±(99.9%) 64.066 MB/sec [Average]
(min, avg, max) = (3228.230, 3246.751, 3270.480), stdev = 16.638
CI (99.9%): [3182.685, 3310.818] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.alloc.rate.norm":
1811608.009 ±(99.9%) 0.005 B/op [Average]
(min, avg, max) = (1811608.008, 1811608.009, 1811608.011), stdev = 0.001
CI (99.9%): [1811608.003, 1811608.014] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Eden_Space":
3250.704 ±(99.9%) 66.578 MB/sec [Average]
(min, avg, max) = (3232.024, 3250.704, 3275.114), stdev = 17.290
CI (99.9%): [3184.126, 3317.282] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Eden_Space.norm":
1813811.924 ±(99.9%) 2365.646 B/op [Average]
(min, avg, max) = (1812832.659, 1813811.924, 1814452.617), stdev = 614.351
CI (99.9%): [1811446.279, 1816177.570] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Survivor_Space":
0.540 ±(99.9%) 0.150 MB/sec [Average]
(min, avg, max) = (0.471, 0.540, 0.563), stdev = 0.039
CI (99.9%): [0.390, 0.690] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Survivor_Space.norm":
301.285 ±(99.9%) 80.118 B/op [Average]
(min, avg, max) = (264.539, 301.285, 314.968), stdev = 20.806
CI (99.9%): [221.166, 381.403] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.count":
2528.000 ±(99.9%) 0.001 counts [Sum]
(min, avg, max) = (470.000, 505.600, 534.000), stdev = 27.700
CI (99.9%): [2528.000, 2528.000] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.time":
1303.000 ±(99.9%) 0.001 ms [Sum]
(min, avg, max) = (254.000, 260.600, 270.000), stdev = 6.504
CI (99.9%): [1303.000, 1303.000] (assumes normal distribution)
# JMH version: 1.26
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/mqliang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/203.7717.56/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=65146:/Users/mqliang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/203.7717.56/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 1 iterations, 10 s each
# Measurement: 5 iterations, 30 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache
# Run progress: 33.33% complete, ETA 00:05:28
# Fork: 1 of 1
# Warmup Iteration 1: 390.616 us/op
Iteration 1: 375.676 us/op
·gc.alloc.rate: 3524.091 MB/sec
·gc.alloc.rate.norm: 1411608.008 B/op
·gc.churn.PS_Eden_Space: 3532.601 MB/sec
·gc.churn.PS_Eden_Space.norm: 1415016.587 B/op
·gc.churn.PS_Survivor_Space: 0.538 MB/sec
·gc.churn.PS_Survivor_Space.norm: 215.400 B/op
·gc.count: 458.000 counts
·gc.time: 248.000 ms
Iteration 2: 375.171 us/op
·gc.alloc.rate: 3528.907 MB/sec
·gc.alloc.rate.norm: 1411608.006 B/op
·gc.churn.PS_Eden_Space: 3534.356 MB/sec
·gc.churn.PS_Eden_Space.norm: 1413787.624 B/op
·gc.churn.PS_Survivor_Space: 0.494 MB/sec
·gc.churn.PS_Survivor_Space.norm: 197.609 B/op
·gc.count: 435.000 counts
·gc.time: 247.000 ms
Iteration 3: 373.233 us/op
·gc.alloc.rate: 3547.720 MB/sec
·gc.alloc.rate.norm: 1411608.005 B/op
·gc.churn.PS_Eden_Space: 3559.728 MB/sec
·gc.churn.PS_Eden_Space.norm: 1416385.929 B/op
·gc.churn.PS_Survivor_Space: 0.539 MB/sec
·gc.churn.PS_Survivor_Space.norm: 214.343 B/op
·gc.count: 462.000 counts
·gc.time: 247.000 ms
Iteration 4: 371.186 us/op
·gc.alloc.rate: 3567.068 MB/sec
·gc.alloc.rate.norm: 1411608.006 B/op
·gc.churn.PS_Eden_Space: 3566.702 MB/sec
·gc.churn.PS_Eden_Space.norm: 1411463.405 B/op
·gc.churn.PS_Survivor_Space: 0.597 MB/sec
·gc.churn.PS_Survivor_Space.norm: 236.411 B/op
·gc.count: 520.000 counts
·gc.time: 271.000 ms
Iteration 5: 370.738 us/op
·gc.alloc.rate: 3571.354 MB/sec
·gc.alloc.rate.norm: 1411608.005 B/op
·gc.churn.PS_Eden_Space: 3582.874 MB/sec
·gc.churn.PS_Eden_Space.norm: 1416161.234 B/op
·gc.churn.PS_Survivor_Space: 0.588 MB/sec
·gc.churn.PS_Survivor_Space.norm: 232.322 B/op
·gc.count: 509.000 counts
·gc.time: 262.000 ms
Result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache":
373.201 ±(99.9%) 8.639 us/op [Average]
(min, avg, max) = (370.738, 373.201, 375.676), stdev = 2.243
CI (99.9%): [364.562, 381.840] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.alloc.rate":
3547.828 ±(99.9%) 82.702 MB/sec [Average]
(min, avg, max) = (3524.091, 3547.828, 3571.354), stdev = 21.477
CI (99.9%): [3465.126, 3630.530] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.alloc.rate.norm":
1411608.006 ±(99.9%) 0.005 B/op [Average]
(min, avg, max) = (1411608.005, 1411608.006, 1411608.008), stdev = 0.001
CI (99.9%): [1411608.001, 1411608.011] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Eden_Space":
3555.252 ±(99.9%) 83.120 MB/sec [Average]
(min, avg, max) = (3532.601, 3555.252, 3582.874), stdev = 21.586
CI (99.9%): [3472.132, 3638.373] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Eden_Space.norm":
1414562.956 ±(99.9%) 7771.212 B/op [Average]
(min, avg, max) = (1411463.405, 1414562.956, 1416385.929), stdev = 2018.159
CI (99.9%): [1406791.744, 1422334.168] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Survivor_Space":
0.551 ±(99.9%) 0.162 MB/sec [Average]
(min, avg, max) = (0.494, 0.551, 0.597), stdev = 0.042
CI (99.9%): [0.389, 0.713] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Survivor_Space.norm":
219.217 ±(99.9%) 60.046 B/op [Average]
(min, avg, max) = (197.609, 219.217, 236.411), stdev = 15.594
CI (99.9%): [159.171, 279.263] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.count":
2384.000 ±(99.9%) 0.001 counts [Sum]
(min, avg, max) = (435.000, 476.800, 520.000), stdev = 36.134
CI (99.9%): [2384.000, 2384.000] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.time":
1275.000 ±(99.9%) 0.001 ms [Sum]
(min, avg, max) = (247.000, 255.000, 271.000), stdev = 10.977
CI (99.9%): [1275.000, 1275.000] (assumes normal distribution)
# JMH version: 1.26
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/mqliang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/203.7717.56/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=65146:/Users/mqliang/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/203.7717.56/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 1 iterations, 10 s each
# Measurement: 5 iterations, 30 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream
# Run progress: 66.67% complete, ETA 00:02:44
# Fork: 1 of 1
# Warmup Iteration 1: 483.366 us/op
Iteration 1: 408.351 us/op
·gc.alloc.rate: 3580.758 MB/sec
·gc.alloc.rate.norm: 1558808.007 B/op
·gc.churn.PS_Eden_Space: 3586.078 MB/sec
·gc.churn.PS_Eden_Space.norm: 1561123.846 B/op
·gc.churn.PS_Survivor_Space: 0.511 MB/sec
·gc.churn.PS_Survivor_Space.norm: 222.603 B/op
·gc.count: 476.000 counts
·gc.time: 253.000 ms
Iteration 2: 410.342 us/op
·gc.alloc.rate: 3563.256 MB/sec
·gc.alloc.rate.norm: 1558808.009 B/op
·gc.churn.PS_Eden_Space: 3569.765 MB/sec
·gc.churn.PS_Eden_Space.norm: 1561655.686 B/op
·gc.churn.PS_Survivor_Space: 0.451 MB/sec
·gc.churn.PS_Survivor_Space.norm: 197.394 B/op
·gc.count: 409.000 counts
·gc.time: 244.000 ms
Iteration 3: 407.314 us/op
·gc.alloc.rate: 3589.291 MB/sec
·gc.alloc.rate.norm: 1558808.006 B/op
·gc.churn.PS_Eden_Space: 3592.335 MB/sec
·gc.churn.PS_Eden_Space.norm: 1560130.076 B/op
·gc.churn.PS_Survivor_Space: 0.557 MB/sec
·gc.churn.PS_Survivor_Space.norm: 241.833 B/op
·gc.count: 495.000 counts
·gc.time: 261.000 ms
Iteration 4: 407.294 us/op
·gc.alloc.rate: 3590.035 MB/sec
·gc.alloc.rate.norm: 1558808.006 B/op
·gc.churn.PS_Eden_Space: 3595.643 MB/sec
·gc.churn.PS_Eden_Space.norm: 1561243.143 B/op
·gc.churn.PS_Survivor_Space: 0.439 MB/sec
·gc.churn.PS_Survivor_Space.norm: 190.513 B/op
·gc.count: 382.000 counts
·gc.time: 239.000 ms
Iteration 5: 410.068 us/op
·gc.alloc.rate: 3565.783 MB/sec
·gc.alloc.rate.norm: 1558808.006 B/op
·gc.churn.PS_Eden_Space: 3576.571 MB/sec
·gc.churn.PS_Eden_Space.norm: 1563524.046 B/op
·gc.churn.PS_Survivor_Space: 0.542 MB/sec
·gc.churn.PS_Survivor_Space.norm: 236.741 B/op
·gc.count: 460.000 counts
·gc.time: 252.000 ms
Result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream":
408.674 ±(99.9%) 5.641 us/op [Average]
(min, avg, max) = (407.294, 408.674, 410.342), stdev = 1.465
CI (99.9%): [403.033, 414.314] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.alloc.rate":
3577.824 ±(99.9%) 48.952 MB/sec [Average]
(min, avg, max) = (3563.256, 3577.824, 3590.035), stdev = 12.713
CI (99.9%): [3528.873, 3626.776] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.alloc.rate.norm":
1558808.007 ±(99.9%) 0.005 B/op [Average]
(min, avg, max) = (1558808.006, 1558808.007, 1558808.009), stdev = 0.001
CI (99.9%): [1558808.002, 1558808.011] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Eden_Space":
3584.078 ±(99.9%) 41.614 MB/sec [Average]
(min, avg, max) = (3569.765, 3584.078, 3595.643), stdev = 10.807
CI (99.9%): [3542.465, 3625.692] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Eden_Space.norm":
1561535.360 ±(99.9%) 4793.590 B/op [Average]
(min, avg, max) = (1560130.076, 1561535.360, 1563524.046), stdev = 1244.880
CI (99.9%): [1556741.769, 1566328.950] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Survivor_Space":
0.500 ±(99.9%) 0.204 MB/sec [Average]
(min, avg, max) = (0.439, 0.500, 0.557), stdev = 0.053
CI (99.9%): [0.296, 0.704] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Survivor_Space.norm":
217.817 ±(99.9%) 88.656 B/op [Average]
(min, avg, max) = (190.513, 217.817, 241.833), stdev = 23.024
CI (99.9%): [129.161, 306.473] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.count":
2222.000 ±(99.9%) 0.001 counts [Sum]
(min, avg, max) = (382.000, 444.400, 495.000), stdev = 47.300
CI (99.9%): [2222.000, 2222.000] (assumes normal distribution)
Secondary result "org.apache.pinot.perf.BenchmarkDataTableSerialization.temporaryOutputStream:·gc.time":
1249.000 ±(99.9%) 0.001 ms [Sum]
(min, avg, max) = (239.000, 249.800, 261.000), stdev = 8.526
CI (99.9%): [1249.000, 1249.000] (assumes normal distribution)
# Run complete. Total time: 00:08:12
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark                                                                                            Mode  Cnt        Score      Error  Units
BenchmarkDataTableSerialization.preAllocateByteArrayNative                                           avgt    5      523.353 ±   10.345  us/op
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.alloc.rate                            avgt    5     3246.751 ±   64.066  MB/sec
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.alloc.rate.norm                       avgt    5  1811608.009 ±    0.005  B/op
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Eden_Space                   avgt    5     3250.704 ±   66.578  MB/sec
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Eden_Space.norm              avgt    5  1813811.924 ± 2365.646  B/op
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Survivor_Space               avgt    5        0.540 ±    0.150  MB/sec
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.churn.PS_Survivor_Space.norm          avgt    5      301.285 ±   80.118  B/op
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.count                                 avgt    5     2528.000             counts
BenchmarkDataTableSerialization.preAllocateByteArrayNative:·gc.time                                  avgt    5     1303.000             ms
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache                                   avgt    5      373.201 ±    8.639  us/op
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.alloc.rate                    avgt    5     3547.828 ±   82.702  MB/sec
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.alloc.rate.norm               avgt    5  1411608.006 ±    0.005  B/op
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Eden_Space           avgt    5     3555.252 ±   83.120  MB/sec
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Eden_Space.norm      avgt    5  1414562.956 ± 7771.212  B/op
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Survivor_Space       avgt    5        0.551 ±    0.162  MB/sec
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.churn.PS_Survivor_Space.norm  avgt    5      219.217 ±   60.046  B/op
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.count                         avgt    5     2384.000             counts
BenchmarkDataTableSerialization.preAllocateByteArrayWithBytesCache:·gc.time                          avgt    5     1275.000             ms
BenchmarkDataTableSerialization.temporaryOutputStream                                                avgt    5      408.674 ±    5.641  us/op
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.alloc.rate                                 avgt    5     3577.824 ±   48.952  MB/sec
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.alloc.rate.norm                            avgt    5  1558808.007 ±    0.005  B/op
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Eden_Space                        avgt    5     3584.078 ±   41.614  MB/sec
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Eden_Space.norm                   avgt    5  1561535.360 ± 4793.590  B/op
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Survivor_Space                    avgt    5        0.500 ±    0.204  MB/sec
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.churn.PS_Survivor_Space.norm               avgt    5      217.817 ±   88.656  B/op
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.count                                      avgt    5     2222.000             counts
BenchmarkDataTableSerialization.temporaryOutputStream:·gc.time                                       avgt    5     1249.000             ms
Process finished with exit code 0
```
If my implementation is correct, the benchmark shows that pre-allocating the byte array with a bytes cache is slightly better than the temporary output stream: roughly 9% faster (373.201 us/op vs. 408.674 us/op). It uses extra memory to cache the encoded K/V, of course, although the measured allocation per operation is lower (1411608 B/op vs. 1558808 B/op) and GC time does not increase noticeably (1275 ms vs. 1249 ms). It is easy to see why `preAllocateByteArrayNative` is the slowest at 523.353 us/op: it encodes each K/V twice, whereas the other two methods encode each K/V only once.
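To make the difference between the three approaches concrete, here is a minimal sketch. It is not the Pinot code or the benchmark from the linked commit: the class name, the method names, and the wire format (an int entry count, then an int length plus UTF-8 bytes for each key and value) are all assumptions chosen for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the wire format is assumed: int entry count, then
// (int length, UTF-8 bytes) for every key and every value.
class SerializationSketch {

  static byte[] encode(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  // Approach 1: write into a growable temporary stream, then copy it out.
  static byte[] temporaryOutputStream(Map<String, String> map) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(map.size());
      for (Map.Entry<String, String> e : map.entrySet()) {
        byte[] k = encode(e.getKey());
        byte[] v = encode(e.getValue());
        out.writeInt(k.length);
        out.write(k);
        out.writeInt(v.length);
        out.write(v);
      }
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    return bytes.toByteArray();
  }

  // Approach 2: size pass, then fill pass; every string is encoded twice.
  static byte[] preAllocateEncodeTwice(Map<String, String> map) {
    int size = Integer.BYTES;
    for (Map.Entry<String, String> e : map.entrySet()) {
      size += 2 * Integer.BYTES + encode(e.getKey()).length + encode(e.getValue()).length;
    }
    ByteBuffer buf = ByteBuffer.allocate(size);
    buf.putInt(map.size());
    for (Map.Entry<String, String> e : map.entrySet()) {
      byte[] k = encode(e.getKey());   // second encoding of the same key
      byte[] v = encode(e.getValue()); // second encoding of the same value
      buf.putInt(k.length).put(k).putInt(v.length).put(v);
    }
    return buf.array();
  }

  // Approach 3: size pass keeps the encoded bytes, fill pass only copies them.
  static byte[] preAllocateWithBytesCache(Map<String, String> map) {
    List<byte[]> cache = new ArrayList<>(map.size() * 2);
    int size = Integer.BYTES;
    for (Map.Entry<String, String> e : map.entrySet()) {
      byte[] k = encode(e.getKey());
      byte[] v = encode(e.getValue());
      cache.add(k);
      cache.add(v);
      size += 2 * Integer.BYTES + k.length + v.length;
    }
    ByteBuffer buf = ByteBuffer.allocate(size);
    buf.putInt(map.size());
    for (byte[] b : cache) {
      buf.putInt(b.length).put(b);
    }
    return buf.array();
  }

  public static void main(String[] args) {
    Map<String, String> map = new LinkedHashMap<>();
    map.put("table", "myTable");
    map.put("numDocsScanned", "12345");
    byte[] a = temporaryOutputStream(map);
    byte[] b = preAllocateEncodeTwice(map);
    byte[] c = preAllocateWithBytesCache(map);
    System.out.println("identical output: " + (Arrays.equals(a, b) && Arrays.equals(a, c)));
  }
}
```

All three methods produce the same bytes under this assumed format; they differ only in how often each string is encoded and in what intermediate buffers they allocate, which is what the timing and allocation numbers above compare.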
I'm not sure whether the change is worth making for an improvement of this size.