Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2015/11/03 15:17:27 UTC

[jira] [Commented] (LUCENE-6880) Add document oriented collector for NRTSuggester

    [ https://issues.apache.org/jira/browse/LUCENE-6880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14987327#comment-14987327 ] 

Michael McCandless commented on LUCENE-6880:
--------------------------------------------

The javadoc for TopSuggestDocs.SuggestScoreDoc needs to be fixed (it implies there's only one key now).

When a SuggestScoreDoc has multiple keys/contexts/scores, is any order implied?  Are they always sorted "best to worst" by score?

I wonder if instead of 3 parallel lists, we should just have a list of SuggestScoreDoc (as it is in trunk today) for each doc hit?  In fact, this is really like grouping?  Maybe it should be a TopGroups?

It should be "fewer" not "less" in here :) : {{// This can still lead to collecting less paths then needed...}}


> Add document oriented collector for NRTSuggester
> ------------------------------------------------
>
>                 Key: LUCENE-6880
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6880
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Areek Zillur
>            Assignee: Areek Zillur
>             Fix For: Trunk, 5.4
>
>         Attachments: LUCENE-6880.patch
>
>
> Currently NRTSuggester collects completions iteratively as they are accepted by the TopNSearcher, so a document can be collected more than once. When a completion is indexed with multiple context values, it produces {{num_context}} paths in the underlying FST for the same docId and is collected {{num_context}} times when a query matches all of its contexts.
> Ideally, a document-oriented collector would collect the top N documents instead of the top N completions by deduplicating docIds while collecting the completions. This could be used to collect N unique documents that match a completion query.
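
The docId deduplication described above could be sketched roughly as follows. This is a minimal illustration, not the attached patch and not Lucene API: Completion and DocOrientedCollector are hypothetical names, and the sketch assumes completions are offered to the collector best-first, so each document's hits stay in "best to worst" order and the first N distinct docIds are the top N documents.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one accepted completion path; not a Lucene class.
final class Completion {
    final int docId;
    final String key;
    final String context;
    final float score;

    Completion(int docId, String key, String context, float score) {
        this.docId = docId;
        this.key = key;
        this.context = context;
        this.score = score;
    }
}

// Hypothetical collector: keeps at most topN distinct documents and groups
// every completion seen for a kept document under that document's id.
final class DocOrientedCollector {
    private final int topN;
    // LinkedHashMap preserves the order in which documents were first seen.
    private final Map<Integer, List<Completion>> hitsByDoc = new LinkedHashMap<>();

    DocOrientedCollector(int topN) {
        this.topN = topN;
    }

    // Returns false when the completion belongs to a new document but
    // topN documents have already been collected.
    boolean collect(Completion c) {
        List<Completion> hits = hitsByDoc.get(c.docId);
        if (hits == null) {
            if (hitsByDoc.size() >= topN) {
                return false;
            }
            hits = new ArrayList<>();
            hitsByDoc.put(c.docId, hits);
        }
        hits.add(c);
        return true;
    }

    // Distinct docIds in first-seen order.
    List<Integer> docIds() {
        return new ArrayList<>(hitsByDoc.keySet());
    }

    // All completions collected for one document, in arrival order.
    List<Completion> hitsFor(int docId) {
        List<Completion> hits = hitsByDoc.get(docId);
        return hits == null ? Collections.<Completion>emptyList() : hits;
    }
}
```

With this shape, a completion indexed under several contexts occupies one document slot rather than {{num_context}} slots, and each document carries its own list of hits instead of three parallel lists.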


