You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@calcite.apache.org by "Rui Wang (Jira)" <ji...@apache.org> on 2020/08/05 16:24:00 UTC

[jira] [Commented] (CALCITE-4157) Use a different sort algorithm for EnumerableDefaults.orderBy(...)

    [ https://issues.apache.org/jira/browse/CALCITE-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171599#comment-17171599 ] 

Rui Wang commented on CALCITE-4157:
-----------------------------------

For running sort on sorted input, that can be partially solved by checking collation with top-down optimization enabled.

> Use a different sort algorithm for EnumerableDefaults.orderBy(...)
> ------------------------------------------------------------------
>
>                 Key: CALCITE-4157
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4157
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Thomas Rebele
>            Priority: Minor
>
> As shown by [the benchmarks|https://github.com/thomasrebele/jmh-micro-benchmarks/blob/98abccad8801532b78a0778cd7be7bd751f90da6/core-java/doc/jmh_partial_sort_jdk1.8.0_241.txt#L6793] for CALCITE-3920, the sort with a TreeMap is slower than sorting with Arrays.sort(Object[]). The latter takes about 35% less time than sorting with TreeMap over a randomized input. While the implementation is not exactly the same, it should be close enough to be able to say something about the performance of EnumerableDefaults.orderBy(...). The relevant results for this issue are the benchmarks with limit=-1 and algorithms treeMap and collectionSort.
> The speedup might be even better if the input is already sorted, as modern VMs use TimSort, which checks if the input is already sorted.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)