You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@arrow.apache.org by "benwang li (Jira)" <ji...@apache.org> on 2021/02/28 13:54:00 UTC

[jira] [Updated] (ARROW-11630) [rust] sort performance

     [ https://issues.apache.org/jira/browse/ARROW-11630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

benwang li updated ARROW-11630:
-------------------------------
    Summary: [rust] sort performance  (was: [rust] use pdqsort to improve the sort performance for value_indices)

> [rust] sort performance
> -----------------------
>
>                 Key: ARROW-11630
>                 URL: https://issues.apache.org/jira/browse/ARROW-11630
>             Project: Apache Arrow
>          Issue Type: Bug
>            Reporter: benwang li
>            Priority: Major
>
> This is already done in ClickHouse, [https://github.com/ClickHouse/ClickHouse/blob/f669a9f97ad850edb77d10e51cd0c41a4af737bf/src/Columns/ColumnVector.cpp#L188-L191]
> In my server benchmark, it proves out to be better performance to use pdqsort for value_indices.
>  
> {code:java}
> //代码占位符
>   
> cargo bench --bench  sort_kernel
> Gnuplot not found, using plotters backend
> sort 2^10               time:   [226.90 us 227.32 us 227.78 us]
>                         change: [-12.867% -12.670% -12.463%] (p = 0.00 < 0.05)
>                         Performance has improved.
> Found 3 outliers among 100 measurements (3.00%)
>   3 (3.00%) high mildBenchmarking sort 2^12: Warming up for 3.0000 s
> Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.7s, enable flat sampling, or reduce sample count to 60.
> sort 2^12               time:   [1.1264 ms 1.1280 ms 1.1298 ms]
>                         change: [-13.544% -13.131% -12.651%] (p = 0.00 < 0.05)
>                         Performance has improved.
> Found 8 outliers among 100 measurements (8.00%)
>   3 (3.00%) high mild
>   5 (5.00%) high severesort nulls 2^10         time:   [172.19 us 178.21 us 184.20 us]                            ^[[A^[[A^[[A^[[A^[[B^B[
>                         change: [-24.136% -21.848% -19.509%] (p = 0.00 < 0.05)
>                         Performance has improved.
> Found 3 outliers among 100 measurements (3.00%)
>   3 (3.00%) high mild
>   
> sort nulls 2^12         time:   [760.54 us 762.53 us 764.95 us]
>                         change: [-29.358% -28.483% -27.749%] (p = 0.00 < 0.05)
>                         Performance has improved.
>   
> Gnuplot not found, using plotters backend
> sort 2^10               time:   [226.90 us 227.32 us 227.78 us]
>                         change: [-12.867% -12.670% -12.463%] (p = 0.00 < 0.05)
>                         Performance has improved.
> Found 3 outliers among 100 measurements (3.00%)
>   3 (3.00%) high mildBenchmarking sort 2^12: Warming up for 3.0000 s
> Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.7s, enable flat sampling, or reduce sample count to 60.
> sort 2^12               time:   [1.1264 ms 1.1280 ms 1.1298 ms]
>                         change: [-13.544% -13.131% -12.651%] (p = 0.00 < 0.05)
>                         Performance has improved.
> Found 8 outliers among 100 measurements (8.00%)
>   3 (3.00%) high mild
>   5 (5.00%) high severesort nulls 2^10         time:   [172.19 us 178.21 us 184.20 us]
>                         change: [-24.136% -21.848% -19.509%] (p = 0.00 < 0.05)
>                         Performance has improved.
> Found 3 outliers among 100 measurements (3.00%)
>   3 (3.00%) high mild
>   
> sort nulls 2^12         time:   [760.54 us 762.53 us 764.95 us]
>                         change: [-29.358% -28.483% -27.749%] (p = 0.00 < 0.05)
>                         Performance has improved.{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)