You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/12/05 13:21:45 UTC
[GitHub] [arrow] Dandandan opened a new pull request #8837: ARROW-10810: [Rust] Speed up comparison kernels
Dandandan opened a new pull request #8837:
URL: https://github.com/apache/arrow/pull/8837
This PR speeds up the (non-simd) comparison kernels by ~8 times:
```
eq Float32 time: [105.96 us 106.01 us 106.06 us]
change: [-82.463% -82.423% -82.378%] (p = 0.00 < 0.05)
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low severe
4 (4.00%) high mild
3 (3.00%) high severe
eq scalar Float32 time: [61.439 us 61.530 us 61.662 us]
change: [-88.282% -88.255% -88.221%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low mild
4 (4.00%) high mild
5 (5.00%) high severe
neq Float32 time: [71.018 us 71.080 us 71.144 us]
change: [-86.580% -86.563% -86.546%] (p = 0.00 < 0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
2 (2.00%) low severe
4 (4.00%) low mild
5 (5.00%) high mild
4 (4.00%) high severe
neq scalar Float32 time: [68.706 us 68.773 us 68.838 us]
change: [-86.207% -86.188% -86.171%] (p = 0.00 < 0.05)
Performance has improved.
lt Float32 time: [70.655 us 70.703 us 70.753 us]
change: [-85.629% -85.617% -85.604%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
lt scalar Float32 time: [50.626 us 50.664 us 50.698 us]
change: [-89.802% -89.764% -89.731%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
lt_eq Float32 time: [101.34 us 101.43 us 101.51 us]
change: [-82.825% -82.797% -82.767%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe
lt_eq scalar Float32 time: [68.894 us 68.913 us 68.933 us]
change: [-86.575% -86.557% -86.538%] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low severe
2 (2.00%) high mild
gt Float32 time: [71.260 us 71.332 us 71.400 us]
change: [-87.481% -87.450% -87.418%] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
gt scalar Float32 time: [38.852 us 38.888 us 38.929 us]
change: [-91.745% -91.733% -91.721%] (p = 0.00 < 0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low severe
11 (11.00%) high mild
1 (1.00%) high severe
gt_eq Float32 time: [99.404 us 99.451 us 99.503 us]
change: [-80.870% -80.848% -80.827%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
gt_eq scalar Float32 time: [55.892 us 55.926 us 55.963 us]
change: [-88.783% -88.751% -88.727%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low severe
1 (1.00%) low mild
2 (2.00%) high mild
2 (2.00%) high severe
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] Dandandan closed pull request #8837: ARROW-10810: [Rust] Speed up comparison kernels
Posted by GitBox <gi...@apache.org>.
Dandandan closed pull request #8837:
URL: https://github.com/apache/arrow/pull/8837
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] Dandandan commented on pull request #8837: ARROW-10810: [Rust] Speed up comparison kernels
Posted by GitBox <gi...@apache.org>.
Dandandan commented on pull request #8837:
URL: https://github.com/apache/arrow/pull/8837#issuecomment-739450290
Let's close this for now in favor of #8842
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] removed a comment on pull request #8837: ARROW-10810: [Rust] Speed up comparison kernels
Posted by GitBox <gi...@apache.org>.
github-actions[bot] removed a comment on pull request #8837:
URL: https://github.com/apache/arrow/pull/8837#issuecomment-739250784
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
Thanks for opening a pull request!
Could you open an issue for this pull request on JIRA?
https://issues.apache.org/jira/browse/ARROW
Then could you also rename pull request title in the following format?
ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
See also:
* [Other pull requests](https://github.com/apache/arrow/pulls/)
* [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #8837: ARROW-10810: [Rust] Speed up comparison kernels
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8837:
URL: https://github.com/apache/arrow/pull/8837#issuecomment-739251667
https://issues.apache.org/jira/browse/ARROW-10810
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] Dandandan commented on a change in pull request #8837: ARROW-10810: [Rust] Speed up comparison kernels
Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #8837:
URL: https://github.com/apache/arrow/pull/8837#discussion_r536902120
##########
File path: rust/arrow/src/compute/kernels/comparison.rs
##########
@@ -68,9 +77,18 @@ macro_rules! compare_op {
macro_rules! compare_op_scalar {
($left: expr, $right:expr, $op:expr) => {{
let null_bit_buffer = $left.data().null_buffer().cloned();
- let mut result = BooleanBufferBuilder::new($left.len());
+ let byte_capacity = bit_util::ceil($left.len(), 8);
+ let actual_capacity = bit_util::round_upto_multiple_of_64(byte_capacity);
+ let mut buffer = MutableBuffer::new(actual_capacity);
+ buffer.resize(byte_capacity);
+
+ let data = buffer.raw_data_mut();
for i in 0..$left.len() {
- result.append($op($left.value(i), $right))?;
+ if $op($left.value(i), $right) {
+ unsafe {
+ bit_util::set_bit_raw(data, i);
+ }
+ }
Review comment:
Yes, would be nice to see if we can get the combination of a clean/safe API and good performance as well.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8837: ARROW-10810: [Rust] Speed up comparison kernels
Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #8837:
URL: https://github.com/apache/arrow/pull/8837#discussion_r536875918
##########
File path: rust/arrow/src/compute/kernels/comparison.rs
##########
@@ -68,9 +77,18 @@ macro_rules! compare_op {
macro_rules! compare_op_scalar {
($left: expr, $right:expr, $op:expr) => {{
let null_bit_buffer = $left.data().null_buffer().cloned();
- let mut result = BooleanBufferBuilder::new($left.len());
+ let byte_capacity = bit_util::ceil($left.len(), 8);
+ let actual_capacity = bit_util::round_upto_multiple_of_64(byte_capacity);
+ let mut buffer = MutableBuffer::new(actual_capacity);
+ buffer.resize(byte_capacity);
+
+ let data = buffer.raw_data_mut();
for i in 0..$left.len() {
- result.append($op($left.value(i), $right))?;
+ if $op($left.value(i), $right) {
+ unsafe {
+ bit_util::set_bit_raw(data, i);
+ }
+ }
Review comment:
A side note: In general, if we are bypassing builders for speed, it can mean that builders are not a good abstraction, as they significantly impact performance.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #8837: ARROW-10810: [Rust] Speed up comparison kernels
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8837:
URL: https://github.com/apache/arrow/pull/8837#issuecomment-739250784
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
Thanks for opening a pull request!
Could you open an issue for this pull request on JIRA?
https://issues.apache.org/jira/browse/ARROW
Then could you also rename pull request title in the following format?
ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
See also:
* [Other pull requests](https://github.com/apache/arrow/pulls/)
* [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org