Posted to dev@lucene.apache.org by "Adrien Grand (JIRA)" <ji...@apache.org> on 2015/09/25 14:48:04 UTC

[jira] [Updated] (LUCENE-6766) Make index sorting a first-class citizen

     [ https://issues.apache.org/jira/browse/LUCENE-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand updated LUCENE-6766:
---------------------------------
    Attachment: LUCENE-6766.patch

Here is a first prototype that:
 - moves sorting logic from misc to core
 - removes SortingMergePolicy
 - adds an "indexSort" parameter to IndexWriterConfig and SegmentInfo, with null meaning that the index order is unspecified
 - SimpleTextCodec (de)serializes this indexSort parameter; other codecs ignore it for now
 - slightly refactors the doc ID remapping logic in IndexWriter that handles deletions made while some segments were being merged
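
To make the doc ID remapping concrete, here is a minimal sketch in plain Java. It is not the code from the patch: the class and method names (DocMapSketch, sortByValue) are hypothetical, and a single long value per document stands in for whatever the configured sort would read. It only shows the general idea of sorting a segment's documents and keeping maps between old and new doc IDs.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration; not taken from LUCENE-6766.patch.
final class DocMapSketch {
  final int[] newToOld; // newToOld[newDocID] = doc ID before sorting
  final int[] oldToNew; // oldToNew[oldDocID] = doc ID after sorting

  private DocMapSketch(int[] newToOld, int[] oldToNew) {
    this.newToOld = newToOld;
    this.oldToNew = oldToNew;
  }

  /** Returns null when the documents are already in sorted order. */
  static DocMapSketch sortByValue(final long[] values) {
    final int maxDoc = values.length;
    boolean sorted = true;
    for (int i = 1; i < maxDoc && sorted; i++) {
      sorted = values[i - 1] <= values[i];
    }
    if (sorted) {
      return null; // nothing to remap
    }

    Integer[] docs = new Integer[maxDoc];
    for (int i = 0; i < maxDoc; i++) {
      docs[i] = i;
    }
    // Arrays.sort on objects is stable, so ties keep their original doc ID order.
    Arrays.sort(docs, Comparator.comparingLong((Integer d) -> values[d]));

    int[] newToOld = new int[maxDoc];
    int[] oldToNew = new int[maxDoc];
    for (int newDoc = 0; newDoc < maxDoc; newDoc++) {
      newToOld[newDoc] = docs[newDoc];
      oldToNew[docs[newDoc]] = newDoc;
    }
    return new DocMapSketch(newToOld, oldToNew);
  }
}
```

For the values {30, 10, 20, 10}, newToOld comes out as {1, 3, 2, 0}. Anything recorded against an old doc ID, such as a deletion that arrived during a merge, has to be translated through oldToNew before it can be applied to the sorted segment.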

Open question: how should we serialize the SortField objects? Should we have a fixed list of supported SortField parameters or should we allow SortField parameters to serialize themselves?
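
As a rough illustration of the first option (a fixed list of supported parameters), the sketch below encodes a sort as one text line per field. Everything in it is an assumption made for the example: the class SortFieldCodecSketch, its Field and Type types, and the line format are hypothetical and are not what the patch or SimpleTextCodec does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a "fixed list" encoding; not from the patch.
final class SortFieldCodecSketch {
  // The fixed list: anything outside it cannot be serialized.
  enum Type { STRING, INT, LONG, FLOAT, DOUBLE }

  static final class Field {
    final String name;
    final Type type;
    final boolean reverse;

    Field(String name, Type type, boolean reverse) {
      this.name = name;
      this.type = type;
      this.reverse = reverse;
    }
  }

  /** A null sort (index order unspecified) is encoded as the empty string. */
  static String encode(List<Field> sort) {
    if (sort == null) {
      return "";
    }
    StringBuilder sb = new StringBuilder();
    for (Field f : sort) {
      // The name goes last so it may contain spaces (but not newlines).
      sb.append(f.type.name()).append(' ')
        .append(f.reverse).append(' ')
        .append(f.name).append('\n');
    }
    return sb.toString();
  }

  /** Throws IllegalArgumentException for a type outside the fixed list. */
  static List<Field> decode(String encoded) {
    if (encoded.isEmpty()) {
      return null;
    }
    List<Field> sort = new ArrayList<>();
    for (String line : encoded.split("\n")) {
      String[] parts = line.split(" ", 3);
      sort.add(new Field(parts[2], Type.valueOf(parts[0]), Boolean.parseBoolean(parts[1])));
    }
    return sort;
  }
}
```

The trade-off this makes visible: a fixed list keeps the on-disk format simple and easy for every codec to read, but a custom sort cannot be recorded unless the list is extended. Letting each object serialize itself removes that limit, at the cost of the codec having to locate the right deserializer at read time.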

There are lots of things we could do on the search side, but for now I'd like to focus on the indexing side and making sure the sort order of segments is easily accessible.

> Make index sorting a first-class citizen
> ----------------------------------------
>
>                 Key: LUCENE-6766
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6766
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-6766.patch
>
>
> Today index sorting is a very expert feature. You need to use a custom merge policy, custom collectors, etc. I would like to explore making it a first-class citizen so that:
>  - the sort order could be configured on IndexWriterConfig
>  - segments would record the sort order that was used to write them
>  - IndexSearcher could automatically early terminate when computing top docs on a sort order that is a prefix of the sort order of a segment (and if the user is not interested in totalHits).
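
The early termination described in the last item can be sketched in plain Java as follows. This is only an illustration under simplifying assumptions: every segment is sorted ascending on a single long key, the query sort is that same key, and all documents match. The class and method names (EarlyTerminationSketch, topN) are hypothetical and this is not IndexSearcher code.

```java
import java.util.PriorityQueue;

// Hypothetical illustration; not IndexSearcher or collector code.
final class EarlyTerminationSketch {
  /**
   * segments[i] holds the sort value of each doc in doc ID order and must be
   * ascending. Returns the n smallest values overall, in ascending order.
   */
  static long[] topN(long[][] segments, int n) {
    if (n <= 0) {
      return new long[0];
    }
    // Max-heap holding the n smallest values seen so far.
    PriorityQueue<Long> heap = new PriorityQueue<>((a, b) -> Long.compare(b, a));
    for (long[] segment : segments) {
      for (long value : segment) {
        if (heap.size() == n && value >= heap.peek()) {
          // The segment is sorted, so no later doc in it can be competitive.
          break;
        }
        heap.add(value);
        if (heap.size() > n) {
          heap.poll(); // drop the current largest
        }
      }
    }
    long[] top = new long[heap.size()];
    for (int i = top.length - 1; i >= 0; i--) {
      top[i] = heap.poll();
    }
    return top;
  }
}
```

Because the inner loop stops at the first non-competitive document, the remaining documents of the segment are never visited and so are never counted. That is why this shortcut only applies when the caller does not need totalHits.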


