Posted to issues@lucene.apache.org by "Michael McCandless (Jira)" <ji...@apache.org> on 2020/09/22 15:18:00 UTC
[jira] [Commented] (LUCENE-9537) Add Indri Search Engine Functionality to Lucene
[ https://issues.apache.org/jira/browse/LUCENE-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200144#comment-17200144 ]
Michael McCandless commented on LUCENE-9537:
--------------------------------------------
Whoa, cool! Thanks [~cvandenberg]
It looks like you added an Indri-specific {{Similarity}}, {{TermQuery}}, and {{AndQuery}}, as well as {{IndriScorer}} and {{IndriWeight}}.
I wonder if we could do a tighter integration, e.g. somehow fold this {{smoothingScore}} into Lucene's {{Similarity}} maybe? Would it allow less forking of these complex Lucene classes?
Why does {{IndriScorer}} need to implement {{public int docID()}}? Shouldn't callers get that from the {{DocIdSetIterator}}?
Also, Lucene has factored out {{float boost}} into a dedicated {{BoostQuery}}.
Could you add some simple class-level javadocs explaining how these queries differ from the corresponding Lucene queries? Maybe we should put the classes under a {{.indri.}} sub-package, or put them in Lucene's {{sandbox}} module for starters?
> Add Indri Search Engine Functionality to Lucene
> -----------------------------------------------
>
> Key: LUCENE-9537
> URL: https://issues.apache.org/jira/browse/LUCENE-9537
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Reporter: Cameron VandenBerg
> Priority: Major
> Labels: patch
> Attachments: LUCENE-INDRI.patch
>
>
> Indri ([http://lemurproject.org/indri.php]) is an academic search engine developed by the University of Massachusetts and Carnegie Mellon University. The major difference between Lucene and Indri is that Indri gives a "smoothing score" to a document that does not contain the search term, which has improved the search ranking accuracy in our experiments. I have created an Indri patch, which adds the search code needed to implement the Indri AND logic as well as Indri's implementation of Dirichlet Smoothing.
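
To make the "smoothing score" idea concrete, here is a minimal, dependency-free sketch. It assumes the textbook Dirichlet-prior language model, log((tf + mu * P(t|C)) / (docLength + mu)), and is not taken from the attached patch. The class name, method names, and the sample values for mu and the collection probability are all hypothetical, chosen only for illustration.

```java
/**
 * Illustrative sketch only: textbook Dirichlet-prior smoothing, not the
 * code from LUCENE-INDRI.patch. All names and sample values are hypothetical.
 */
public final class DirichletSmoothingSketch {
    private DirichletSmoothingSketch() {}

    /**
     * Log of the Dirichlet-smoothed probability of a term in a document:
     * log((termFreq + mu * collectionProb) / (docLength + mu)).
     *
     * @param termFreq       occurrences of the term in the document
     * @param docLength      total number of terms in the document
     * @param collectionProb probability of the term in the whole collection (must be > 0)
     * @param mu             Dirichlet prior strength (must be > 0)
     */
    public static double score(long termFreq, long docLength, double collectionProb, double mu) {
        return Math.log((termFreq + mu * collectionProb) / (docLength + mu));
    }

    /**
     * Score assigned to a document that does not contain the term at all.
     * This is the same formula with termFreq = 0, so the result is finite
     * and depends only on document length and collection statistics.
     */
    public static double smoothingScore(long docLength, double collectionProb, double mu) {
        return score(0, docLength, collectionProb, mu);
    }

    public static void main(String[] args) {
        double mu = 2000.0;            // hypothetical prior strength
        double collectionProb = 0.001; // hypothetical collection-level term probability
        long docLength = 500;

        System.out.println("matching doc:     " + score(3, docLength, collectionProb, mu));
        System.out.println("non-matching doc: " + smoothingScore(docLength, collectionProb, mu));
    }
}
```

Under these assumptions a non-matching document still receives a finite score, lower than that of a matching document of the same length, so it can take part in ranking when several term scores are combined.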