Posted to dev@lucene.apache.org by "Adrien Grand (JIRA)" <ji...@apache.org> on 2013/05/02 23:46:17 UTC

[jira] [Updated] (LUCENE-4946) Refactor SorterTemplate

     [ https://issues.apache.org/jira/browse/LUCENE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand updated LUCENE-4946:
---------------------------------

    Attachment: LUCENE-4946.patch

This patch contains one base class Sorter and 3 implementations:
 * IntroSorter (the improved quicksort we had before, but I think the name is better since it makes it clear that the worst-case complexity is O(n ln(n)) instead of O(n^2) as with traditional quicksort),
 * InPlaceMergeSort, the merge sort we had before.
 * TimSort, an improved version of the previous implementation that can gallop to make sorting even faster on partially-sorted data.
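
To illustrate what "galloping" means in general, here is a minimal sketch of the idea: probe at exponentially growing distances, then finish with a binary search inside the last gap. It is not the code from the patch. It works on a plain int[] instead of the Sorter primitives, and the names GallopSketch and gallopLowerBound are made up for this example.

```java
class GallopSketch {
    /**
     * Returns the first index in [from, to) whose value is >= key, or 'to' if
     * there is none. Assumes a[from..to) is sorted in ascending order.
     * Hypothetical helper for illustration only.
     */
    static int gallopLowerBound(int[] a, int from, int to, int key) {
        if (from >= to || a[from] >= key) {
            return from;
        }
        // Invariant: a[lo] < key.
        int lo = from;
        int step = 1;
        // Gallop: probe at lo + 1, lo + 2, lo + 4, ... while the probed value is still < key.
        while (step < to - lo && a[lo + step] < key) {
            lo += step;
            step <<= 1;
        }
        // The answer now lies in (lo, hi], where hi is the failed probe or the end of the range.
        int hi = (step < to - lo) ? lo + step : to;
        lo = lo + 1;
        // Binary search for the first index in [lo, hi) with a[index] >= key.
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

When the target is close to the start of the range, which is common when merging runs of partially-sorted data, this needs far fewer comparisons than stepping through the range one element at a time.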

One major difference is that the end offsets are now exclusive. I tend to find it less confusing since you would now call {{sort(0, array.length)}} instead of {{sort(0, array.length - 1)}}.
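
As a rough illustration of the exclusive end offset, here is a small sketch of a base class that only relies on compare and swap. The names SketchSorter, ExclusiveEndDemo and sortInts are hypothetical and not taken from the patch, and the insertion sort is only there to keep the example short.

```java
import java.util.Arrays;

class ExclusiveEndDemo {
    /** Hypothetical base class: subclasses provide compare and swap over index positions. */
    abstract static class SketchSorter {
        protected abstract int compare(int i, int j);
        protected abstract void swap(int i, int j);

        /** Sorts the range [from, to): 'from' is inclusive, 'to' is exclusive. */
        public void sort(int from, int to) {
            // Plain insertion sort, enough to show the offset convention.
            for (int i = from + 1; i < to; i++) {
                for (int j = i; j > from && compare(j - 1, j) > 0; j--) {
                    swap(j - 1, j);
                }
            }
        }
    }

    /** Sorts array[from..to) in place by plugging an int[] into the sketch above. */
    static void sortInts(final int[] array, int from, int to) {
        new SketchSorter() {
            @Override
            protected int compare(int i, int j) {
                return Integer.compare(array[i], array[j]);
            }

            @Override
            protected void swap(int i, int j) {
                int tmp = array[i];
                array[i] = array[j];
                array[j] = tmp;
            }
        }.sort(from, to);
    }

    public static void main(String[] args) {
        int[] array = {5, 3, 1, 4, 2};
        // Exclusive end offset: pass array.length, not array.length - 1.
        sortInts(array, 0, array.length);
        System.out.println(Arrays.toString(array)); // prints [1, 2, 3, 4, 5]
    }
}
```

With an exclusive end, an empty range is simply sort(i, i), and the number of sorted elements is to - from.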

Please let me know if you would like to review the patch!
                
> Refactor SorterTemplate
> -----------------------
>
>                 Key: LUCENE-4946
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4946
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Trivial
>         Attachments: LUCENE-4946.patch
>
>
> When working on TimSort (LUCENE-4839), I was a little frustrated at not being able to add galloping support, because it would have required adding new primitive operations in addition to compare and swap.
> I started working on a prototype that uses inheritance to allow some sorting algorithms to rely on additional primitive operations. You can have a look at https://github.com/jpountz/sorts/tree/master/src/java/net/jpountz/sorts (but beware, it is a prototype and still lacks proper documentation and good tests).
> I think it would offer several advantages:
>  - no more need to implement setPivot and comparePivot when using in-place merge sort or insertion sort,
>  - the ability to use faster stable sorting algorithms at the cost of some memory overhead (our in-place merge sort is very slow),
>  - the ability to properly implement algorithms that are useful on specific datasets but require different primitive operations (such as TimSort for partially-sorted data).
> If you are interested in comparing these implementations with Arrays.sort, there is a Benchmark class in src/examples.
> What do you think?
