Posted to dev@lucene.apache.org by "Adrien Grand (JIRA)" <ji...@apache.org> on 2018/12/11 09:39:00 UTC

[jira] [Commented] (LUCENE-8600) DocValuesFieldUpdates should use a better sort

    [ https://issues.apache.org/jira/browse/LUCENE-8600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16716669#comment-16716669 ] 

Adrien Grand commented on LUCENE-8600:
--------------------------------------

Here is a patch. I initially proposed TimSorter, but the impact on the API is not nice, since you need to add at least one additional API to DocValuesFieldUpdates and its subclasses to support copying from one slot to another (rather than swapping). So I instead went with IntroSorter (a quicksort variant) and recorded the ord of each update to guarantee stability. The speedup is less than what I got with TimSort, but I like that it is better contained. When running the benchmark that Simon shared, updates on random longs run about 2.3x faster.
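
To illustrate the idea of recording ords to get stability out of a swap-only sort, here is a minimal sketch. It is not the code from the patch: the class name StableUpdateSort, the plain int[]/long[] arrays, and the simple quicksort (standing in for IntroSorter, without its heapsort fallback) are all assumptions made for illustration. Each update gets an ordinal equal to its arrival position, and ties on the doc ID are broken by that ordinal, so updates to the same doc keep their original order even though quicksort on its own is not stable.

```java
/** Hypothetical sketch, not Lucene code: a swap-only quicksort made stable with ordinals. */
final class StableUpdateSort {

    /** Sorts updates by doc ID; updates to the same doc keep their arrival order. */
    static void sort(int[] docs, long[] values) {
        int[] ords = new int[docs.length];
        for (int i = 0; i < ords.length; i++) {
            ords[i] = i; // arrival position of each update, used as a tie-breaker
        }
        quicksort(docs, values, ords, 0, docs.length - 1);
    }

    private static void quicksort(int[] docs, long[] values, int[] ords, int lo, int hi) {
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            int pivotDoc = docs[mid]; // copy the pivot key so swaps cannot move it
            int pivotOrd = ords[mid];
            int i = lo;
            int j = hi;
            while (i <= j) {
                while (compare(docs[i], ords[i], pivotDoc, pivotOrd) < 0) i++;
                while (compare(docs[j], ords[j], pivotDoc, pivotOrd) > 0) j--;
                if (i <= j) {
                    swap(docs, values, ords, i, j);
                    i++;
                    j--;
                }
            }
            // Recurse into the smaller half and loop on the larger one to bound stack depth.
            if (j - lo < hi - i) {
                quicksort(docs, values, ords, lo, j);
                lo = i;
            } else {
                quicksort(docs, values, ords, i, hi);
                hi = j;
            }
        }
    }

    /** Orders by doc ID first, then by ordinal, so no two slots ever compare equal. */
    private static int compare(int docA, int ordA, int docB, int ordB) {
        int cmp = Integer.compare(docA, docB);
        return cmp != 0 ? cmp : Integer.compare(ordA, ordB);
    }

    /** The only mutation the sort needs: swap two slots in all parallel arrays. */
    private static void swap(int[] docs, long[] values, int[] ords, int i, int j) {
        int d = docs[i]; docs[i] = docs[j]; docs[j] = d;
        long v = values[i]; values[i] = values[j]; values[j] = v;
        int o = ords[i]; ords[i] = ords[j]; ords[j] = o;
    }
}
```

With docs {5, 3, 5, 1, 3} and values {10, 20, 30, 40, 50}, calling StableUpdateSort.sort(docs, values) leaves docs as {1, 3, 3, 5, 5} and values as {40, 20, 50, 10, 30}: the two updates to doc 3 and the two updates to doc 5 stay in arrival order. The cost of this approach is one extra int per update plus a second comparison on ties, which fits the trade-off described above: no new copy operation is needed, only compare and swap.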

> DocValuesFieldUpdates should use a better sort
> ----------------------------------------------
>
>                 Key: LUCENE-8600
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8600
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-8600.patch
>
>
> This is a follow-up to LUCENE-8598: Simon identified that swaps are a bottleneck to applying doc-value updates, in particular due to the overhead of packed ints. It turns out that InPlaceMergeSorter does LOTS of swaps in order to sort in place. Replacing it with a more efficient sort should help.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
