Posted to jira@arrow.apache.org by "benwang li (Jira)" <ji...@apache.org> on 2021/02/28 13:54:00 UTC
[jira] [Updated] (ARROW-11630) [rust] sort performance
[ https://issues.apache.org/jira/browse/ARROW-11630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
benwang li updated ARROW-11630:
-------------------------------
Summary: [rust] sort performance (was: [rust] use pdqsort to improve the sort performance for value_indices)
> [rust] sort performance
> -----------------------
>
> Key: ARROW-11630
> URL: https://issues.apache.org/jira/browse/ARROW-11630
> Project: Apache Arrow
> Issue Type: Bug
> Reporter: benwang li
> Priority: Major
>
> ClickHouse already uses pdqsort for sorting; see [https://github.com/ClickHouse/ClickHouse/blob/f669a9f97ad850edb77d10e51cd0c41a4af737bf/src/Columns/ColumnVector.cpp#L188-L191]
> In my server benchmark, using pdqsort for value_indices gives better performance.
>
> {code:java}
> cargo bench --bench sort_kernel
> Gnuplot not found, using plotters backend
> sort 2^10 time: [226.90 us 227.32 us 227.78 us]
> change: [-12.867% -12.670% -12.463%] (p = 0.00 < 0.05)
> Performance has improved.
> Found 3 outliers among 100 measurements (3.00%)
> 3 (3.00%) high mild
> Benchmarking sort 2^12: Warming up for 3.0000 s
> Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.7s, enable flat sampling, or reduce sample count to 60.
> sort 2^12 time: [1.1264 ms 1.1280 ms 1.1298 ms]
> change: [-13.544% -13.131% -12.651%] (p = 0.00 < 0.05)
> Performance has improved.
> Found 8 outliers among 100 measurements (8.00%)
> 3 (3.00%) high mild
> 5 (5.00%) high severe
> sort nulls 2^10 time: [172.19 us 178.21 us 184.20 us]
> change: [-24.136% -21.848% -19.509%] (p = 0.00 < 0.05)
> Performance has improved.
> Found 3 outliers among 100 measurements (3.00%)
> 3 (3.00%) high mild
>
> sort nulls 2^12 time: [760.54 us 762.53 us 764.95 us]
> change: [-29.358% -28.483% -27.749%] (p = 0.00 < 0.05)
> Performance has improved.
>
> {code}
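
For readers unfamiliar with the idea in the description, here is a minimal sketch of sorting value indices with an unstable sort. It is not the Arrow kernel or the patch under discussion: the function name `sort_to_indices_sketch`, the `Option<i32>` input, and the nulls-last placement are all assumptions made for illustration. Rust's standard `sort_unstable` family has historically been based on pattern-defeating quicksort, though the exact algorithm depends on the toolchain version.

```rust
/// Hypothetical helper (not the Arrow API): returns the indices that would
/// sort `values` in ascending order, with nulls (`None`) placed last.
fn sort_to_indices_sketch(values: &[Option<i32>]) -> Vec<usize> {
    // Split the indices into those pointing at valid values and those
    // pointing at nulls, so only valid values take part in the comparison sort.
    let (mut valid, nulls): (Vec<usize>, Vec<usize>) =
        (0..values.len()).partition(|&i| values[i].is_some());

    // Unstable sort of the indices, keyed by the value each index refers to.
    // Equal values may end up in any relative order.
    valid.sort_unstable_by_key(|&i| values[i]);

    // Append the null indices after the sorted valid ones (assumed ordering).
    valid.extend(nulls);
    valid
}
```

Calling it on `[Some(3), None, Some(1), Some(2)]` yields the index order `[2, 3, 0, 1]`. Because the sort is unstable, inputs with duplicate values may produce different but equally valid index orders.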
--
This message was sent by Atlassian Jira
(v8.3.4#803005)