Posted to dev@lucene.apache.org by "Jan Høydahl (JIRA)" <ji...@apache.org> on 2018/12/07 13:10:00 UTC

[jira] [Commented] (SOLR-13025) SchemaSimilarityFactory fallback to LegacyBM25Similarity

    [ https://issues.apache.org/jira/browse/SOLR-13025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16712836#comment-16712836 ] 

Jan Høydahl commented on SOLR-13025:
------------------------------------

I changed my implementation so that it no longer tries to be clever when people have explicitly chosen {{BM25SimilarityFactory}} in the schema. Please review the changes in [GitHub Pull Request #518|https://github.com/apache/lucene-solr/pull/518]:
 * {{BM25SimilarityFactory}} always creates instances of the new {{BM25Similarity}}
 * New {{LegacyBM25SimilarityFactory}} to be able to explicitly fall back
 * {{SchemaSimilarityFactory}} creates {{BM25Similarity}} for {{luceneMatchVersion}} >= 8.0, else {{LegacyBM25Similarity}}
 * Update tests relying on exact score
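
To make the selection rules in the list above concrete, here is a minimal sketch in plain Java. It is not the code from the pull request: the class, enum, and method names are hypothetical, the version parsing is simplified to the major version only, and it covers just the three factories discussed in this issue.

```java
// Illustrative sketch only. All names here are hypothetical and are NOT Solr's API.
final class SimilarityFallbackSketch {

    enum Choice { BM25, LEGACY_BM25 }

    /** Extracts the major version from a string such as "7.5.0" or "8.0". */
    static int majorVersion(String luceneMatchVersion) {
        int dot = luceneMatchVersion.indexOf('.');
        String major = dot < 0 ? luceneMatchVersion : luceneMatchVersion.substring(0, dot);
        return Integer.parseInt(major.trim());
    }

    /**
     * @param factoryName similarity factory named in the schema, or null if none is defined
     * @param luceneMatchVersion the configured luceneMatchVersion, e.g. "7.5.0"
     */
    static Choice choose(String factoryName, String luceneMatchVersion) {
        // An explicit choice is always honoured, regardless of version.
        if ("BM25SimilarityFactory".equals(factoryName)) {
            return Choice.BM25;
        }
        if ("LegacyBM25SimilarityFactory".equals(factoryName)) {
            return Choice.LEGACY_BM25;
        }
        // No similarity defined, or the default SchemaSimilarityFactory:
        // pick by luceneMatchVersion for back-compat.
        if (factoryName == null || "SchemaSimilarityFactory".equals(factoryName)) {
            return majorVersion(luceneMatchVersion) >= 8 ? Choice.BM25 : Choice.LEGACY_BM25;
        }
        throw new IllegalArgumentException("Not covered by this sketch: " + factoryName);
    }

    public static void main(String[] args) {
        System.out.println(choose(null, "7.5.0"));
        System.out.println(choose("SchemaSimilarityFactory", "8.0.0"));
        System.out.println(choose("BM25SimilarityFactory", "7.5.0"));
    }
}
```

The point of the sketch is that only the implicit/default path looks at {{luceneMatchVersion}}; an explicitly named factory never does.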

The upgrade note reads:
{noformat}
* If you explicitly use BM25SimilarityFactory in your schema the absolute scoring will be lower, see SOLR-13025.
 But ordering of documents will not change in the normal case. Use LegacyBM25SimilarityFactory if you need to force
 the old 6.x/7.x scoring. Note that if you have not specified any similarity in schema or use the default
 SchemaSimilarityFactory, then LegacyBM25Similarity is automatically selected for 'luceneMatchVersion' < 8.0.0.
 See also explanation in Reference Guide chapter "Other Schema Elements".{noformat}
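
As a rough illustration of why the absolute scores drop while the ordering stays the same, the sketch below uses the textbook single-term BM25 formula, with the legacy variant carrying the extra (k1 + 1) factor that LUCENE-8563 removed. This is not Lucene's actual implementation; the class and method names are hypothetical, and the idf, term frequency, and length values are made up for the example. k1 = 1.2 and b = 0.75 are the usual defaults.

```java
// Illustrative sketch only: textbook BM25 for a single term, not Lucene's code.
final class Bm25ScaleSketch {

    /** Length-normalisation part of the denominator: k1 * (1 - b + b * dl / avgdl). */
    static double norm(double k1, double b, double docLen, double avgDocLen) {
        return k1 * (1 - b + b * docLen / avgDocLen);
    }

    /** New-style score: no (k1 + 1) factor in the numerator. */
    static double newScore(double idf, double tf, double k1, double b,
                           double docLen, double avgDocLen) {
        return idf * tf / (tf + norm(k1, b, docLen, avgDocLen));
    }

    /** Legacy (6.x/7.x) score: the same value multiplied by the constant (k1 + 1). */
    static double legacyScore(double idf, double tf, double k1, double b,
                              double docLen, double avgDocLen) {
        return (k1 + 1) * newScore(idf, tf, k1, b, docLen, avgDocLen);
    }

    public static void main(String[] args) {
        double k1 = 1.2, b = 0.75, idf = 2.0, avgDocLen = 120;
        // Two made-up documents: (term frequency, document length).
        double[][] docs = { {3, 100}, {1, 80} };
        for (double[] d : docs) {
            double n = newScore(idf, d[0], k1, b, d[1], avgDocLen);
            double l = legacyScore(idf, d[0], k1, b, d[1], avgDocLen);
            System.out.printf("tf=%.0f dl=%.0f new=%.4f legacy=%.4f ratio=%.2f%n",
                    d[0], d[1], n, l, l / n);
        }
    }
}
```

Because (k1 + 1) is a positive constant applied to every document, it rescales all scores by the same factor and cannot change their relative order. It only matters where absolute values are used, for example in score thresholds or when combining with other score components.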
Precommit passes, as does the Solr test suite (incredible!)

Reviews welcome. Plan to commit on Wednesday.

> SchemaSimilarityFactory fallback to LegacyBM25Similarity
> --------------------------------------------------------
>
>                 Key: SOLR-13025
>                 URL: https://issues.apache.org/jira/browse/SOLR-13025
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: search
>    Affects Versions: master (8.0)
>            Reporter: Adrien Grand
>            Assignee: Jan Høydahl
>            Priority: Blocker
>             Fix For: master (8.0)
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is a follow-up of LUCENE-8563: Lucene changed its BM25Similarity implementation to no longer multiply all scores by (k1 + 1). Solr was left unchanged by replacing uses of BM25Similarity with LegacyBM25Similarity which returns the same scores as in 7.x.
> This Jira makes the default similarity depend on {{luceneMatchVersion}} for back-compat if the schema either does not define a similarity or defines {{SchemaSimilarityFactory}}. If a user has explicitly defined {{BM25SimilarityFactory}} then the new implementation will be used, and she will need to replace it with {{LegacyBM25SimilarityFactory}} if she wants to keep the old absolute scores (most often not necessary).
> This change is also described in RefGuide and CHANGES.


