Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/02 23:36:48 UTC
[GitHub] [arrow-rs] tustvold edited a comment on pull request #1248: POC: Specialized filter kernels
tustvold edited a comment on pull request #1248:
URL: https://github.com/apache/arrow-rs/pull/1248#issuecomment-1028457980
So I've added string and dictionary support. Strings only see a ~2x performance bump; this can almost certainly be pushed further, but I figured it was good enough and I can come back to it in subsequent PRs. Dictionaries see the full 9-10x bump that the integer types see, which is :ok_hand:
I'm marking this ready for review because I think it represents a non-trivial improvement that it would be super-awesome to get into the arrow 9 release. There is definitely further refinement that could take place, and more perf that could be squeezed out, but I'd prefer to do that in subsequent PRs unless people object.
For reference, here are the latest numbers. They're down a little bit since #1228 was merged, but still pretty exciting imo:
```
filter u8 time: [47.731 us 47.740 us 47.748 us]
change: [-85.383% -85.377% -85.371%] (p = 0.00 < 0.05)
Performance has improved.
filter u8 high selectivity
time: [2.4163 us 2.4185 us 2.4210 us]
change: [-73.969% -73.939% -73.909%] (p = 0.00 < 0.05)
Performance has improved.
filter u8 low selectivity
time: [1.3798 us 1.3814 us 1.3829 us]
change: [-41.966% -41.649% -41.441%] (p = 0.00 < 0.05)
Performance has improved.
filter context u8 time: [14.557 us 14.567 us 14.580 us]
change: [-94.178% -94.175% -94.171%] (p = 0.00 < 0.05)
Performance has improved.
filter context u8 high selectivity
time: [1.1673 us 1.1678 us 1.1684 us]
change: [-84.915% -84.905% -84.894%] (p = 0.00 < 0.05)
Performance has improved.
filter context u8 low selectivity
time: [150.77 ns 150.80 ns 150.83 ns]
change: [-82.779% -82.758% -82.735%] (p = 0.00 < 0.05)
Performance has improved.
filter i32 time: [68.835 us 68.857 us 68.882 us]
change: [-78.627% -78.613% -78.598%] (p = 0.00 < 0.05)
Performance has improved.
filter i32 high selectivity
time: [6.1117 us 6.1135 us 6.1154 us]
change: [-54.814% -54.788% -54.759%] (p = 0.00 < 0.05)
Performance has improved.
filter i32 low selectivity
time: [1.3835 us 1.3858 us 1.3895 us]
change: [-41.772% -41.682% -41.570%] (p = 0.00 < 0.05)
Performance has improved.
filter context i32 time: [16.049 us 16.058 us 16.068 us]
change: [-93.552% -93.545% -93.535%] (p = 0.00 < 0.05)
Performance has improved.
filter context i32 high selectivity
time: [4.7336 us 4.7417 us 4.7507 us]
change: [-59.764% -59.459% -58.934%] (p = 0.00 < 0.05)
Performance has improved.
filter context i32 low selectivity
time: [141.69 ns 141.83 ns 141.98 ns]
change: [-83.970% -83.929% -83.895%] (p = 0.00 < 0.05)
Performance has improved.
filter context i32 w NULLs
time: [42.514 us 42.529 us 42.548 us]
change: [-86.620% -86.613% -86.606%] (p = 0.00 < 0.05)
Performance has improved.
filter context i32 w NULLs high selectivity
time: [10.785 us 10.788 us 10.790 us]
change: [+0.8230% +1.0819% +1.2557%] (p = 0.00 < 0.05)
Change within noise threshold.
filter context i32 w NULLs low selectivity
time: [279.94 ns 280.04 ns 280.13 ns]
change: [-68.389% -68.317% -68.242%] (p = 0.00 < 0.05)
Performance has improved.
filter context u8 w NULLs
time: [40.525 us 40.540 us 40.557 us]
change: [-87.204% -87.196% -87.187%] (p = 0.00 < 0.05)
Performance has improved.
filter context u8 w NULLs high selectivity
time: [7.0863 us 7.0895 us 7.0928 us]
change: [+3.4115% +3.4779% +3.5553%] (p = 0.00 < 0.05)
Performance has regressed.
filter context u8 w NULLs low selectivity
time: [270.69 ns 270.79 ns 270.89 ns]
change: [-67.961% -67.906% -67.849%] (p = 0.00 < 0.05)
Performance has improved.
filter f32 time: [120.99 us 121.18 us 121.56 us]
change: [-68.315% -68.281% -68.241%] (p = 0.00 < 0.05)
Performance has improved.
filter context f32 time: [36.935 us 36.951 us 36.968 us]
change: [-88.072% -88.059% -88.045%] (p = 0.00 < 0.05)
Performance has improved.
filter context f32 high selectivity
time: [10.339 us 10.342 us 10.346 us]
change: [-1.7648% -1.4694% -1.2565%] (p = 0.00 < 0.05)
Performance has improved.
filter context f32 low selectivity
time: [287.59 ns 288.40 ns 289.16 ns]
change: [-67.322% -67.087% -66.921%] (p = 0.00 < 0.05)
Performance has improved.
filter context string time: [243.62 us 243.66 us 243.72 us]
change: [-40.960% -40.903% -40.858%] (p = 0.00 < 0.05)
Performance has improved.
filter context string high selectivity
time: [75.645 us 75.664 us 75.682 us]
change: [-81.225% -81.211% -81.199%] (p = 0.00 < 0.05)
Performance has improved.
filter context string low selectivity
time: [666.91 ns 667.33 ns 667.80 ns]
change: [-42.717% -42.635% -42.556%] (p = 0.00 < 0.05)
Performance has improved.
filter context string dictionary
time: [16.348 us 16.354 us 16.359 us]
change: [-89.690% -89.684% -89.677%] (p = 0.00 < 0.05)
Performance has improved.
filter context string dictionary high selectivity
time: [4.8464 us 4.8530 us 4.8617 us]
change: [-83.162% -83.106% -83.056%] (p = 0.00 < 0.05)
Performance has improved.
filter context string dictionary low selectivity
time: [318.97 ns 319.14 ns 319.32 ns]
change: [-49.861% -49.819% -49.770%] (p = 0.00 < 0.05)
Performance has improved.
filter context string dictionary w NULLs
time: [42.654 us 42.726 us 42.806 us]
change: [-87.775% -87.728% -87.657%] (p = 0.00 < 0.05)
Performance has improved.
filter context string dictionary w NULLs high selectivity
time: [11.313 us 11.316 us 11.319 us]
change: [-68.181% -68.152% -68.130%] (p = 0.00 < 0.05)
Performance has improved.
filter context string dictionary w NULLs low selectivity
time: [480.59 ns 480.81 ns 481.04 ns]
change: [-54.683% -54.630% -54.579%] (p = 0.00 < 0.05)
Performance has improved.
filter single record batch
time: [62.108 us 62.210 us 62.292 us]
change: [-80.890% -80.862% -80.837%] (p = 0.00 < 0.05)
Performance has improved.
```
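To relate the `change` percentages above to the "~2x" and "9-10x" figures, here is a minimal sketch of the arithmetic, assuming the middle value of each `change` line is the relative change in run time (negative meaning faster). The function name `speedup_from_change` is made up for illustration and is not part of criterion or arrow-rs.

```rust
/// Convert a relative change in run time, in percent (negative = faster),
/// into a speedup factor (old time / new time).
/// Hypothetical helper for illustration only.
fn speedup_from_change(change_percent: f64) -> f64 {
    // New time expressed as a fraction of the old time.
    let ratio = 1.0 + change_percent / 100.0;
    1.0 / ratio
}

fn main() {
    // Middle `change` estimates copied from the benchmark output above.
    let cases = [
        ("filter context string", -40.903),
        ("filter context string dictionary", -89.684),
    ];
    for (name, change) in cases {
        println!("{}: {:.2}x", name, speedup_from_change(change));
    }
}
```

By this reading the plain string case comes out at roughly 1.7x and the dictionary case at roughly 9.7x.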