Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/07/07 20:38:47 UTC
[GitHub] [arrow] nirandaperera opened a new pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
nirandaperera opened a new pull request #10679:
URL: https://github.com/apache/arrow/pull/10679
This PR adds the changes discussed in ARROW-13170.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] wesm commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-876226830
I guess branch prediction works really well on modern processors — when the filter values are mostly false then the if block is rarely executed. In the 50% selected case, branch prediction doesn't help so this method yields speedups.
Not sure if it's worth investigating, but moving `out_position_` from a class member to a stack variable could have some performance impact. Another thought is removing some of the offset arithmetic.
Lastly, since we're using `SetBitTo` a lot, it might be worth checking the superscalar variant described at
https://graphics.stanford.edu/~seander/bithacks.html#ConditionalSetOrClearBitsWithoutBranching
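The trick on that page can be sketched as follows (a minimal illustration — the `SetBitToBranchless` name is hypothetical, not Arrow's actual `SetBitTo`): the mask selects the target bit, and `-f` expands to all-ones when `f` is true, so the bit is set or cleared with no branch.

```cpp
#include <cstdint>

// Superscalar conditional set/clear from the Stanford bit-twiddling page.
// word: the byte being modified; bit: bit index; f: desired bit value.
inline uint8_t SetBitToBranchless(uint8_t word, int bit, bool f) {
  const uint8_t mask = static_cast<uint8_t>(1u << bit);
  // -f is 0xFF...F when f is true and 0 when false, so the OR either
  // re-inserts the masked bit or leaves it cleared — no branch involved.
  return static_cast<uint8_t>((word & ~mask) |
                              (-static_cast<int>(f) & mask));
}
```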
[GitHub] [arrow] pitrou commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-884312893
I get these benchmark results here (AMD Zen 2 CPU):
```
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Non-regressions: (20)
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark baseline contender change % counters
FilterInt64FilterWithNulls/524288/10 1.656 GiB/sec 3.296 GiB/sec 99.035 {'run_name': 'FilterInt64FilterWithNulls/524288/10', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2369, 'data null%': 10.0, 'mask null%': 5.0, 'select%': 50.0}
FilterInt64FilterWithNulls/524288/7 1.746 GiB/sec 3.446 GiB/sec 97.382 {'run_name': 'FilterInt64FilterWithNulls/524288/7', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2488, 'data null%': 1.0, 'mask null%': 5.0, 'select%': 50.0}
FilterInt64FilterWithNulls/524288/13 1.658 GiB/sec 3.255 GiB/sec 96.242 {'run_name': 'FilterInt64FilterWithNulls/524288/13', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2375, 'data null%': 90.0, 'mask null%': 5.0, 'select%': 50.0}
FilterInt64FilterWithNulls/524288/1 1.829 GiB/sec 3.570 GiB/sec 95.193 {'run_name': 'FilterInt64FilterWithNulls/524288/1', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2608, 'data null%': 0.0, 'mask null%': 5.0, 'select%': 50.0}
FilterInt64FilterWithNulls/524288/4 1.834 GiB/sec 3.556 GiB/sec 93.834 {'run_name': 'FilterInt64FilterWithNulls/524288/4', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2620, 'data null%': 0.1, 'mask null%': 5.0, 'select%': 50.0}
FilterInt64FilterNoNulls/524288/7 1.790 GiB/sec 3.102 GiB/sec 73.262 {'run_name': 'FilterInt64FilterNoNulls/524288/7', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2563, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/524288/4 1.865 GiB/sec 2.713 GiB/sec 45.508 {'run_name': 'FilterInt64FilterNoNulls/524288/4', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2666, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/524288/13 1.741 GiB/sec 2.500 GiB/sec 43.569 {'run_name': 'FilterInt64FilterNoNulls/524288/13', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2503, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/524288/10 1.747 GiB/sec 2.458 GiB/sec 40.644 {'run_name': 'FilterInt64FilterNoNulls/524288/10', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2503, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterWithNulls/524288/9 2.782 GiB/sec 3.264 GiB/sec 17.322 {'run_name': 'FilterInt64FilterWithNulls/524288/9', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 3930, 'data null%': 10.0, 'mask null%': 5.0, 'select%': 99.9}
FilterInt64FilterWithNulls/524288/6 3.131 GiB/sec 3.621 GiB/sec 15.660 {'run_name': 'FilterInt64FilterWithNulls/524288/6', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4484, 'data null%': 1.0, 'mask null%': 5.0, 'select%': 99.9}
FilterInt64FilterWithNulls/524288/12 2.792 GiB/sec 3.200 GiB/sec 14.600 {'run_name': 'FilterInt64FilterWithNulls/524288/12', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 3997, 'data null%': 90.0, 'mask null%': 5.0, 'select%': 99.9}
FilterInt64FilterNoNulls/524288/12 10.371 GiB/sec 11.380 GiB/sec 9.728 {'run_name': 'FilterInt64FilterNoNulls/524288/12', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14327, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/524288/9 10.397 GiB/sec 11.059 GiB/sec 6.363 {'run_name': 'FilterInt64FilterNoNulls/524288/9', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14889, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/524288/0 31.150 GiB/sec 33.025 GiB/sec 6.021 {'run_name': 'FilterInt64FilterNoNulls/524288/0', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 44550, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/524288/6 12.010 GiB/sec 12.518 GiB/sec 4.228 {'run_name': 'FilterInt64FilterNoNulls/524288/6', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 17148, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/524288/1 3.233 GiB/sec 3.314 GiB/sec 2.478 {'run_name': 'FilterInt64FilterNoNulls/524288/1', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4619, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/524288/2 55.380 GiB/sec 55.998 GiB/sec 1.116 {'run_name': 'FilterInt64FilterNoNulls/524288/2', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 79290, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterWithNulls/524288/3 3.614 GiB/sec 3.621 GiB/sec 0.191 {'run_name': 'FilterInt64FilterWithNulls/524288/3', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5171, 'data null%': 0.1, 'mask null%': 5.0, 'select%': 99.9}
FilterInt64FilterNoNulls/524288/3 16.487 GiB/sec 16.002 GiB/sec -2.945 {'run_name': 'FilterInt64FilterNoNulls/524288/3', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 24008, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 99.9}
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Regressions: (10)
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark baseline contender change % counters
FilterInt64FilterWithNulls/524288/0 3.703 GiB/sec 2.591 GiB/sec -30.008 {'run_name': 'FilterInt64FilterWithNulls/524288/0', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5297, 'data null%': 0.0, 'mask null%': 5.0, 'select%': 99.9}
FilterInt64FilterWithNulls/524288/8 9.559 GiB/sec 6.414 GiB/sec -32.902 {'run_name': 'FilterInt64FilterWithNulls/524288/8', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 13527, 'data null%': 1.0, 'mask null%': 5.0, 'select%': 1.0}
FilterInt64FilterWithNulls/524288/14 9.474 GiB/sec 6.310 GiB/sec -33.396 {'run_name': 'FilterInt64FilterWithNulls/524288/14', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 13647, 'data null%': 90.0, 'mask null%': 5.0, 'select%': 1.0}
FilterInt64FilterWithNulls/524288/2 10.062 GiB/sec 6.623 GiB/sec -34.171 {'run_name': 'FilterInt64FilterWithNulls/524288/2', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14448, 'data null%': 0.0, 'mask null%': 5.0, 'select%': 1.0}
FilterInt64FilterWithNulls/524288/11 9.583 GiB/sec 6.202 GiB/sec -35.279 {'run_name': 'FilterInt64FilterWithNulls/524288/11', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 13649, 'data null%': 10.0, 'mask null%': 5.0, 'select%': 1.0}
FilterInt64FilterWithNulls/524288/5 10.277 GiB/sec 6.542 GiB/sec -36.339 {'run_name': 'FilterInt64FilterWithNulls/524288/5', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 14650, 'data null%': 0.1, 'mask null%': 5.0, 'select%': 1.0}
FilterInt64FilterNoNulls/524288/5 12.943 GiB/sec 6.610 GiB/sec -48.933 {'run_name': 'FilterInt64FilterNoNulls/524288/5', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 18519, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/524288/8 12.834 GiB/sec 6.503 GiB/sec -49.332 {'run_name': 'FilterInt64FilterNoNulls/524288/8', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 18022, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/524288/11 12.924 GiB/sec 6.505 GiB/sec -49.671 {'run_name': 'FilterInt64FilterNoNulls/524288/11', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 18604, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/524288/14 12.991 GiB/sec 6.454 GiB/sec -50.321 {'run_name': 'FilterInt64FilterNoNulls/524288/14', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 18607, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 1.0}
```
[GitHub] [arrow] cyb70289 commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
cyb70289 commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-899163047
I'm closing this PR. Feel free to reopen if there are new findings.
IMO, a branchless implementation unconditionally introduces extra instructions. Given that a predictable branch is essentially free on modern CPUs, the current branching code looks optimal for the low-selectivity case, which is common in practice.
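The trade-off discussed throughout this thread can be illustrated with a toy filter kernel (a minimal sketch under simplified types — the `FilterBranchy`/`FilterBranchless` names are illustrative, not Arrow's implementation):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Branchy: the store is skipped when filter[i] is false, so a mostly-false,
// well-predicted filter does almost no work per element (the 1% select case).
size_t FilterBranchy(const std::vector<int64_t>& values,
                     const std::vector<uint8_t>& filter,
                     std::vector<int64_t>& out) {
  size_t pos = 0;
  for (size_t i = 0; i < values.size(); ++i) {
    if (filter[i]) out[pos++] = values[i];
  }
  return pos;
}

// Branchless: unconditionally store, then advance pos by 0 or 1. There are no
// mispredictions (helping the 50% select case), but every element pays for a
// store and an add even when nothing is selected.
size_t FilterBranchless(const std::vector<int64_t>& values,
                        const std::vector<uint8_t>& filter,
                        std::vector<int64_t>& out) {
  size_t pos = 0;
  for (size_t i = 0; i < values.size(); ++i) {
    out[pos] = values[i];
    pos += filter[i];
  }
  return pos;
}
```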
[GitHub] [arrow] ursabot edited a comment on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-875950075
Benchmark runs are scheduled for baseline = cf6a7ff65f4e2920641d116a3ba1f578b2bd8a9e and contender = 5a14b94046288938b22e1db1e7d0a5f6cc5d2fd1. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2 (mimalloc)](https://conbench.ursa.dev/compare/runs/96acb784c87842b2bfd28e17a9a90432...3465b868d70e425e8af9141550f9450f/)
[Finished :arrow_down:0.0% :arrow_up:0.0%] [ursa-i9-9960x (mimalloc)](https://conbench.ursa.dev/compare/runs/5744b64d14d448198229be1dbb5265e7...87610e39461a4309accf9c5ce9f7f2f9/)
[Finished :arrow_down:0.0% :arrow_up:0.0%] [ursa-thinkcentre-m75q (mimalloc)](https://conbench.ursa.dev/compare/runs/cd0a0e80ad2c4de0b60cda38c58b64a4...ec5b59d942eb4e7794b7c7808006e223/)
Supported benchmarks:
ursa-i9-9960x: langs = Python, R
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True
[GitHub] [arrow] ursabot commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
ursabot commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-879314016
Benchmark runs are scheduled for baseline = cf6a7ff65f4e2920641d116a3ba1f578b2bd8a9e and contender = 38110e8e7ee598ddb0e8a3465d81ea7e24bafebc. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped :warning: Provided benchmark filters do not have any benchmark groups to be executed on ec2-t3-xlarge-us-east-2] [ec2-t3-xlarge-us-east-2 (mimalloc)](https://conbench.ursa.dev/compare/runs/96acb784c87842b2bfd28e17a9a90432...fdfaab8d2dc84fbd81318b663afe1e44/)
[Skipped :warning: Only ['Python', 'R'] langs are supported on ursa-i9-9960x] [ursa-i9-9960x (mimalloc)](https://conbench.ursa.dev/compare/runs/5744b64d14d448198229be1dbb5265e7...a08372ce96df46c48f6874d84c74035c/)
[Scheduled] [ursa-thinkcentre-m75q (mimalloc)](https://conbench.ursa.dev/compare/runs/cd0a0e80ad2c4de0b60cda38c58b64a4...9d55178f98434a64b8398f25458f5408/)
Supported benchmarks:
ursa-i9-9960x: langs = Python, R
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True
[GitHub] [arrow] ursabot edited a comment on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-875917401
Benchmark runs are scheduled for baseline = 6c8d30ea82222fd2750b999840872d3f6cbdc8f8 and contender = c2d694b17596b13007bd54804d382808c60066aa. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Failed] [ec2-t3-xlarge-us-east-2 (mimalloc)](https://conbench.ursa.dev/compare/runs/ead005de138847cdb35b900308a8716e...91280e77e83149fb857bf3153ece3a4e/)
[Failed] [ursa-i9-9960x (mimalloc)](https://conbench.ursa.dev/compare/runs/5d26982bd71e45878dd5387f80c8d0f4...9475f85ca9e14856868ea301f4c6e7ea/)
[Failed] [ursa-thinkcentre-m75q (mimalloc)](https://conbench.ursa.dev/compare/runs/00761e2d135d41859abe0f44fca6ee61...d7af8eaebb1149308ccf7b8870fe84c9/)
Supported benchmarks:
ursa-i9-9960x: langs = Python, R
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True
[GitHub] [arrow] nirandaperera commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
nirandaperera commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-879313446
@ursabot please benchmark lang=C++
[GitHub] [arrow] nirandaperera commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
nirandaperera commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-887987917
> It seems like breaking out the super-scalar variant of SetBitTo into a separate PR would be a good thing regardless
I will open a separate JIRA for this.
[GitHub] [arrow] ursabot edited a comment on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-875950075
Benchmark runs are scheduled for baseline = cf6a7ff65f4e2920641d116a3ba1f578b2bd8a9e and contender = 5a14b94046288938b22e1db1e7d0a5f6cc5d2fd1. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2 (mimalloc)](https://conbench.ursa.dev/compare/runs/96acb784c87842b2bfd28e17a9a90432...3465b868d70e425e8af9141550f9450f/)
[Scheduled] [ursa-i9-9960x (mimalloc)](https://conbench.ursa.dev/compare/runs/5744b64d14d448198229be1dbb5265e7...87610e39461a4309accf9c5ce9f7f2f9/)
[Scheduled] [ursa-thinkcentre-m75q (mimalloc)](https://conbench.ursa.dev/compare/runs/cd0a0e80ad2c4de0b60cda38c58b64a4...ec5b59d942eb4e7794b7c7808006e223/)
Supported benchmarks:
ursa-i9-9960x: langs = Python, R
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True
[GitHub] [arrow] github-actions[bot] commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-875916614
https://issues.apache.org/jira/browse/ARROW-13170
[GitHub] [arrow] wesm commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-887773427
It seems like breaking out the super-scalar variant of SetBitTo into a separate PR would be a good thing regardless
[GitHub] [arrow] cyb70289 commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
cyb70289 commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-876164351
Thanks @nirandaperera for doing this!
There is a big improvement for the 50% selection case, where CPU branch prediction works worst. This is great.
I'm a bit concerned about the _big drop in the 1% selection_ case (which looks useful in the real world IMO).
I tested on my Xeon Gold 5218 server and got similar results to yours. To ease debugging, I only listed the FilterInt64FilterNoNulls tests.
(NOTE: the FilterRecordBatchXXX/100/X tests are pretty noisy in my experience; you may ignore them.)
```
$ archery benchmark diff --suite-filter=arrow-compute-vector-selection-benchmark \
--benchmark-filter="^FilterInt64FilterNoNulls" \
--cc=clang-10 --cxx=clang++-10
-----------------------------------------------------------------------------------------------------------------
Non-regressions: (11)
-----------------------------------------------------------------------------------------------------------------
benchmark baseline contender change
// XXX: big improvement for selection = 50%
FilterInt64FilterNoNulls/1048576/4 1.033 GiB/sec 2.123 GiB/sec 105.545 {'data null%': 0.1, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/7 1.031 GiB/sec 1.921 GiB/sec 86.369 {'data null%': 1.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/10 1.055 GiB/sec 1.778 GiB/sec 68.505 {'data null%': 10.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/13 1.054 GiB/sec 1.772 GiB/sec 68.161 {'data null%': 90.0, 'select%': 50.0}
// XXX: no difference for selection = 99.9%
FilterInt64FilterNoNulls/1048576/9 5.495 GiB/sec 5.744 GiB/sec 4.530 {'data null%': 10.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/12 5.572 GiB/sec 5.693 GiB/sec 2.176 {'data null%': 90.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/3 8.387 GiB/sec 8.431 GiB/sec 0.521 {'data null%': 0.1, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/0 12.422 GiB/sec 12.417 GiB/sec -0.040 {'data null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/6 6.787 GiB/sec 6.717 GiB/sec -1.030 {'data null%': 1.0, 'select%': 99.9}
// XXX: no difference if no nulls, regardless of selection
FilterInt64FilterNoNulls/1048576/1 1.927 GiB/sec 1.955 GiB/sec 1.470 {'data null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/2 31.374 GiB/sec 31.808 GiB/sec 1.383 {'data null%': 0.0, 'select%': 1.0}
-----------------------------------------------------------------------------------------------------------------
Regressions: (4)
-----------------------------------------------------------------------------------------------------------------
benchmark baseline contender change
// XXX: big regression for selection = 1%
FilterInt64FilterNoNulls/1048576/5 6.755 GiB/sec 3.766 GiB/sec -44.239 {'data null%': 0.1, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/8 6.766 GiB/sec 3.500 GiB/sec -48.265 {'data null%': 1.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/14 7.182 GiB/sec 3.271 GiB/sec -54.453 {'data null%': 90.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/11 7.183 GiB/sec 3.271 GiB/sec -54.470 {'data null%': 10.0, 'select%': 1.0}
```
[GitHub] [arrow] nirandaperera commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
nirandaperera commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-884323959
> I'm not fond of this PR. The fact that the results are rather mixed while it adds significant complexity to the implementation doesn't make it extremely desirable IMHO.
I agree. I also have a problem with these regressions. I am planning to add some SIMD into this, to see if we can get a better outcome in the low-selectivity cases.
[GitHub] [arrow] cyb70289 commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
cyb70289 commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-880342643
> @wesm @cyb70289 @bkietz Is there anything else we could do for the low selectivity cases (1% select)?
I don't have any satisfying suggestions.
A possible workaround, I guess, is to choose between the branchy and branchless code paths based on selectivity. Smells like "benchmark-oriented optimization"?
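Such a per-selectivity dispatch could look like this hypothetical sketch (the `PreferBranchyKernel` name and the 1-in-8 threshold are illustrative assumptions, not measured values — the threshold would have to be tuned by benchmarking):

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical dispatch: count the selected entries first, then pick the
// implementation. Returns true when the branchy kernel is likely cheaper
// (low selectivity → well-predicted, mostly-not-taken branch).
bool PreferBranchyKernel(const std::vector<uint8_t>& filter) {
  const size_t selected =
      std::accumulate(filter.begin(), filter.end(), size_t{0});
  // Illustrative threshold: branchy wins below ~12.5% selectivity.
  return selected * 8 < filter.size();
}
```

The popcount pass adds an extra scan over the filter, which is part of why this smells like benchmark-oriented optimization.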
[GitHub] [arrow] ursabot edited a comment on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-879314016
Benchmark runs are scheduled for baseline = cf6a7ff65f4e2920641d116a3ba1f578b2bd8a9e and contender = 38110e8e7ee598ddb0e8a3465d81ea7e24bafebc. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped :warning: Provided benchmark filters do not have any benchmark groups to be executed on ec2-t3-xlarge-us-east-2] [ec2-t3-xlarge-us-east-2 (mimalloc)](https://conbench.ursa.dev/compare/runs/96acb784c87842b2bfd28e17a9a90432...fdfaab8d2dc84fbd81318b663afe1e44/)
[Skipped :warning: Only ['Python', 'R'] langs are supported on ursa-i9-9960x] [ursa-i9-9960x (mimalloc)](https://conbench.ursa.dev/compare/runs/5744b64d14d448198229be1dbb5265e7...a08372ce96df46c48f6874d84c74035c/)
[Finished :arrow_down:0.0% :arrow_up:0.0%] [ursa-thinkcentre-m75q (mimalloc)](https://conbench.ursa.dev/compare/runs/cd0a0e80ad2c4de0b60cda38c58b64a4...9d55178f98434a64b8398f25458f5408/)
Supported benchmarks:
ursa-i9-9960x: langs = Python, R
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True
[GitHub] [arrow] nirandaperera commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
nirandaperera commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-876664929
@wesm with the super-scalar variant, I get the following:
```
BEFORE:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Non-regressions: (10)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark baseline contender change % counters
FilterInt64FilterNoNulls/1048576/4 1.423 GiB/sec 2.658 GiB/sec 86.837 {'run_name': 'FilterInt64FilterNoNulls/1048576/4', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1021, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/7 1.367 GiB/sec 2.398 GiB/sec 75.409 {'run_name': 'FilterInt64FilterNoNulls/1048576/7', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 978, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/13 1.318 GiB/sec 2.223 GiB/sec 68.611 {'run_name': 'FilterInt64FilterNoNulls/1048576/13', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 936, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/10 1.323 GiB/sec 2.220 GiB/sec 67.824 {'run_name': 'FilterInt64FilterNoNulls/1048576/10', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 950, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/12 6.890 GiB/sec 7.092 GiB/sec 2.938 {'run_name': 'FilterInt64FilterNoNulls/1048576/12', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4949, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/1 2.358 GiB/sec 2.386 GiB/sec 1.166 {'run_name': 'FilterInt64FilterNoNulls/1048576/1', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1676, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/6 8.110 GiB/sec 8.034 GiB/sec -0.940 {'run_name': 'FilterInt64FilterNoNulls/1048576/6', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5779, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/2 39.687 GiB/sec 39.170 GiB/sec -1.301 {'run_name': 'FilterInt64FilterNoNulls/1048576/2', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 28002, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/0 14.103 GiB/sec 13.782 GiB/sec -2.278 {'run_name': 'FilterInt64FilterNoNulls/1048576/0', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 9845, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/9 7.006 GiB/sec 6.800 GiB/sec -2.951 {'run_name': 'FilterInt64FilterNoNulls/1048576/9', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5061, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 99.9}
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Regressions: (5)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark baseline contender change % counters
FilterInt64FilterNoNulls/1048576/3 9.850 GiB/sec 9.107 GiB/sec -7.538 {'run_name': 'FilterInt64FilterNoNulls/1048576/3', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 7039, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/5 10.072 GiB/sec 4.863 GiB/sec -51.715 {'run_name': 'FilterInt64FilterNoNulls/1048576/5', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 7212, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/8 9.578 GiB/sec 4.409 GiB/sec -53.966 {'run_name': 'FilterInt64FilterNoNulls/1048576/8', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6873, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/11 9.509 GiB/sec 4.080 GiB/sec -57.096 {'run_name': 'FilterInt64FilterNoNulls/1048576/11', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6842, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/14 9.532 GiB/sec 4.084 GiB/sec -57.154 {'run_name': 'FilterInt64FilterNoNulls/1048576/14', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6748, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 1.0}
AFTER:
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Non-regressions: (9)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark baseline contender change % counters
FilterInt64FilterNoNulls/1048576/4 1.422 GiB/sec 2.976 GiB/sec 109.349 {'run_name': 'FilterInt64FilterNoNulls/1048576/4', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1017, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/7 1.363 GiB/sec 2.500 GiB/sec 83.386 {'run_name': 'FilterInt64FilterNoNulls/1048576/7', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 978, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/13 1.320 GiB/sec 2.216 GiB/sec 67.878 {'run_name': 'FilterInt64FilterNoNulls/1048576/13', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 945, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/10 1.322 GiB/sec 2.201 GiB/sec 66.533 {'run_name': 'FilterInt64FilterNoNulls/1048576/10', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 947, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/1 2.351 GiB/sec 2.627 GiB/sec 11.739 {'run_name': 'FilterInt64FilterNoNulls/1048576/1', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1683, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/0 13.413 GiB/sec 13.631 GiB/sec 1.628 {'run_name': 'FilterInt64FilterNoNulls/1048576/0', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 9605, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/6 7.672 GiB/sec 7.593 GiB/sec -1.027 {'run_name': 'FilterInt64FilterNoNulls/1048576/6', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5715, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/2 39.856 GiB/sec 38.819 GiB/sec -2.600 {'run_name': 'FilterInt64FilterNoNulls/1048576/2', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 28575, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/12 6.806 GiB/sec 6.558 GiB/sec -3.653 {'run_name': 'FilterInt64FilterNoNulls/1048576/12', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4785, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 99.9}
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Regressions: (6)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark baseline contender change % counters
FilterInt64FilterNoNulls/1048576/9 6.843 GiB/sec 6.472 GiB/sec -5.426 {'run_name': 'FilterInt64FilterNoNulls/1048576/9', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4689, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/3 9.378 GiB/sec 7.717 GiB/sec -17.710 {'run_name': 'FilterInt64FilterNoNulls/1048576/3', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6703, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/5 10.064 GiB/sec 5.390 GiB/sec -46.442 {'run_name': 'FilterInt64FilterNoNulls/1048576/5', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 7216, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/8 9.600 GiB/sec 4.489 GiB/sec -53.233 {'run_name': 'FilterInt64FilterNoNulls/1048576/8', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6865, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/11 9.529 GiB/sec 4.105 GiB/sec -56.926 {'run_name': 'FilterInt64FilterNoNulls/1048576/11', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6825, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/14 9.537 GiB/sec 4.098 GiB/sec -57.027 {'run_name': 'FilterInt64FilterNoNulls/1048576/14', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6825, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 1.0}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] nirandaperera commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
Posted by GitBox <gi...@apache.org>.
nirandaperera commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-878574623
@wesm @cyb70289 @bkietz Is there anything else we could do for the low selectivity cases (1% select)?
[GitHub] [arrow] ursabot edited a comment on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
ursabot edited a comment on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-875950075
Benchmark runs are scheduled for baseline = cf6a7ff65f4e2920641d116a3ba1f578b2bd8a9e and contender = 5a14b94046288938b22e1db1e7d0a5f6cc5d2fd1. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2 (mimalloc)](https://conbench.ursa.dev/compare/runs/96acb784c87842b2bfd28e17a9a90432...3465b868d70e425e8af9141550f9450f/)
[Finished :arrow_down:0.0% :arrow_up:0.0%] [ursa-i9-9960x (mimalloc)](https://conbench.ursa.dev/compare/runs/5744b64d14d448198229be1dbb5265e7...87610e39461a4309accf9c5ce9f7f2f9/)
[Scheduled] [ursa-thinkcentre-m75q (mimalloc)](https://conbench.ursa.dev/compare/runs/cd0a0e80ad2c4de0b60cda38c58b64a4...ec5b59d942eb4e7794b7c7808006e223/)
Supported benchmarks:
ursa-i9-9960x: langs = Python, R
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True
[GitHub] [arrow] ursabot commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
ursabot commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-875917401
Benchmark runs are scheduled for baseline = 6c8d30ea82222fd2750b999840872d3f6cbdc8f8 and contender = c2d694b17596b13007bd54804d382808c60066aa. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Scheduled] [ec2-t3-xlarge-us-east-2 (mimalloc)](https://conbench.ursa.dev/compare/runs/ead005de138847cdb35b900308a8716e...91280e77e83149fb857bf3153ece3a4e/)
[Scheduled] [ursa-i9-9960x (mimalloc)](https://conbench.ursa.dev/compare/runs/5d26982bd71e45878dd5387f80c8d0f4...9475f85ca9e14856868ea301f4c6e7ea/)
[Scheduled] [ursa-thinkcentre-m75q (mimalloc)](https://conbench.ursa.dev/compare/runs/00761e2d135d41859abe0f44fca6ee61...d7af8eaebb1149308ccf7b8870fe84c9/)
Supported benchmarks:
ursa-i9-9960x: langs = Python, R
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True
[GitHub] [arrow] nirandaperera commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
nirandaperera commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-879096754
I believe AVX instructions like `mask_store` could also help in this use case.
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=mask&cats=Load,Store
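A minimal sketch of what a masked store looks like, assuming AVX2 is available (`MaskedStore4` and its scalar fallback are illustrative names, not Arrow code). Note that `_mm256_maskstore_epi64` writes selected lanes in place without a per-lane branch, but it does not compact the output; packing selected values contiguously would need something like AVX-512 compress-store.

```cpp
#include <cstdint>
#ifdef __AVX2__
#include <immintrin.h>
#endif

// Write src[i] to dst[i] for the lanes where select[i] is true, leaving the
// other destination lanes untouched. On AVX2 this is a single branch-free
// masked store; otherwise a scalar fallback keeps the sketch portable.
void MaskedStore4(int64_t* dst, const int64_t* src, const bool* select) {
#ifdef __AVX2__
  // Lanes with the mask's high bit set are stored (set_epi64x takes lanes
  // high-to-low, so select[0] maps to lane 0).
  __m256i mask = _mm256_set_epi64x(select[3] ? -1 : 0, select[2] ? -1 : 0,
                                   select[1] ? -1 : 0, select[0] ? -1 : 0);
  __m256i v = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(src));
  _mm256_maskstore_epi64(reinterpret_cast<long long*>(dst), mask, v);
#else
  for (int i = 0; i < 4; ++i) {
    if (select[i]) dst[i] = src[i];  // scalar fallback
  }
#endif
}
```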
[GitHub] [arrow] nirandaperera commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
nirandaperera commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-875917243
@ursabot please benchmark
[GitHub] [arrow] cyb70289 closed pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
cyb70289 closed pull request #10679:
URL: https://github.com/apache/arrow/pull/10679
[GitHub] [arrow] pitrou commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
pitrou commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-897739852
Should we close this? It was an interesting experiment but doesn't seem to give very convincing results.
[GitHub] [arrow] ursabot commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
ursabot commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-875950075
Benchmark runs are scheduled for baseline = cf6a7ff65f4e2920641d116a3ba1f578b2bd8a9e and contender = 5a14b94046288938b22e1db1e7d0a5f6cc5d2fd1. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Scheduled] [ec2-t3-xlarge-us-east-2 (mimalloc)](https://conbench.ursa.dev/compare/runs/96acb784c87842b2bfd28e17a9a90432...3465b868d70e425e8af9141550f9450f/)
[Scheduled] [ursa-i9-9960x (mimalloc)](https://conbench.ursa.dev/compare/runs/5744b64d14d448198229be1dbb5265e7...87610e39461a4309accf9c5ce9f7f2f9/)
[Scheduled] [ursa-thinkcentre-m75q (mimalloc)](https://conbench.ursa.dev/compare/runs/cd0a0e80ad2c4de0b60cda38c58b64a4...ec5b59d942eb4e7794b7c7808006e223/)
Supported benchmarks:
ursa-i9-9960x: langs = Python, R
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True
[GitHub] [arrow] pitrou commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
pitrou commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-884317890
I'm not fond of this PR. The fact that the results are rather mixed while it adds significant complexity to the implementation doesn't make it extremely desirable IMHO.
[GitHub] [arrow] nirandaperera commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
nirandaperera commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-897787608
Yes, I think so. I'd like to do some experiments with some AVX instructions, but I don't think I'd do that immediately.
[GitHub] [arrow] nirandaperera commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
nirandaperera commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-875949918
@ursabot please benchmark
[GitHub] [arrow] nirandaperera commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
nirandaperera commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-878573510
I tested a `VisitWords` implementation for this, and it seems to give better results.
```
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Non-regressions: (11)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark baseline contender change % counters
FilterInt64FilterNoNulls/1048576/4 1.421 GiB/sec 2.958 GiB/sec 108.167 {'run_name': 'FilterInt64FilterNoNulls/1048576/4', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1019, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/13 1.318 GiB/sec 2.708 GiB/sec 105.439 {'run_name': 'FilterInt64FilterNoNulls/1048576/13', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 945, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/10 1.318 GiB/sec 2.704 GiB/sec 105.207 {'run_name': 'FilterInt64FilterNoNulls/1048576/10', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 944, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/7 1.364 GiB/sec 2.754 GiB/sec 101.937 {'run_name': 'FilterInt64FilterNoNulls/1048576/7', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 977, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/1 2.353 GiB/sec 2.621 GiB/sec 11.389 {'run_name': 'FilterInt64FilterNoNulls/1048576/1', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1689, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 50.0}
FilterInt64FilterNoNulls/1048576/12 6.524 GiB/sec 7.231 GiB/sec 10.830 {'run_name': 'FilterInt64FilterNoNulls/1048576/12', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4783, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/2 39.394 GiB/sec 42.267 GiB/sec 7.293 {'run_name': 'FilterInt64FilterNoNulls/1048576/2', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 28183, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/9 6.925 GiB/sec 7.207 GiB/sec 4.077 {'run_name': 'FilterInt64FilterNoNulls/1048576/9', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 4905, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/6 7.873 GiB/sec 8.009 GiB/sec 1.730 {'run_name': 'FilterInt64FilterNoNulls/1048576/6', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 5643, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/3 9.167 GiB/sec 9.225 GiB/sec 0.637 {'run_name': 'FilterInt64FilterNoNulls/1048576/3', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6530, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 99.9}
FilterInt64FilterNoNulls/1048576/0 13.827 GiB/sec 13.744 GiB/sec -0.597 {'run_name': 'FilterInt64FilterNoNulls/1048576/0', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 9834, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 99.9}
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Regressions: (4)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
benchmark baseline contender change % counters
FilterInt64FilterNoNulls/1048576/5 10.049 GiB/sec 5.493 GiB/sec -45.344 {'run_name': 'FilterInt64FilterNoNulls/1048576/5', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 7199, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/8 9.571 GiB/sec 5.223 GiB/sec -45.423 {'run_name': 'FilterInt64FilterNoNulls/1048576/8', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6869, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/11 9.494 GiB/sec 5.073 GiB/sec -46.560 {'run_name': 'FilterInt64FilterNoNulls/1048576/11', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6836, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 1.0}
FilterInt64FilterNoNulls/1048576/14 9.517 GiB/sec 5.075 GiB/sec -46.674 {'run_name': 'FilterInt64FilterNoNulls/1048576/14', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 6829, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 1.0}
```
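The word-at-a-time idea behind the `VisitWords` variant can be sketched as follows. This is a hypothetical standalone version (`FilterByBitmap` is an illustrative name, not the actual Arrow implementation): the selection bitmap is read 64 bits at a time, all-zero words are skipped outright (which is what helps the low-selectivity case) and all-ones words are copied in bulk, so only mixed words pay for a per-bit loop.

```cpp
#include <cstdint>
#include <vector>

// Filter an int64 array by an LSB-first validity/selection bitmap, reading
// the bitmap one 64-bit word at a time.
std::vector<int64_t> FilterByBitmap(const int64_t* values, int64_t length,
                                    const uint64_t* filter_words) {
  std::vector<int64_t> out;
  out.reserve(length);
  int64_t i = 0;
  for (; i + 64 <= length; i += 64) {
    uint64_t word = filter_words[i / 64];
    if (word == 0) continue;  // nothing selected in this word: skip 64 values
    if (word == ~uint64_t(0)) {  // everything selected: bulk copy 64 values
      out.insert(out.end(), values + i, values + i + 64);
      continue;
    }
    for (int b = 0; b < 64; ++b) {  // mixed word: per-bit scan
      if (word & (uint64_t(1) << b)) out.push_back(values[i + b]);
    }
  }
  for (; i < length; ++i) {  // tail bits that don't fill a whole word
    if (filter_words[i / 64] & (uint64_t(1) << (i % 64))) {
      out.push_back(values[i]);
    }
  }
  return out;
}
```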
[GitHub] [arrow] nirandaperera commented on pull request #10679: ARROW-13170 [C++] Reducing branching in compute/kernels/vector_selection.cc
nirandaperera commented on pull request #10679:
URL: https://github.com/apache/arrow/pull/10679#issuecomment-875997414
These were the perf results from my local desktop:
https://gist.github.com/nirandaperera/dfafb77865e948514ca520162be10558