Posted to solr-user@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2011/06/16 02:14:11 UTC

Re: Explain the difference in similarity and similarityProvider

Hey Brian,

Catching up on my email from vacation, I noticed a bunch of questions from
you about similarity, per-field similarity, and the new
SimilarityProvider stuff that don't look like they were ever really
resolved.

A little background...

Once upon a time, "Similarity" was a global sort of thing in Lucene, where
a single instance was expected to be used in your entire app, and some
methods in that API took a field name and some didn't.

LUCENE-2236 changed that by introducing the "SimilarityProvider" API,
which is the new "there should be one of these for your app" class that
handles "global" type things, and it can return "Similarity" objects on a
per-field basis as needed...

http://search-lucene.com/jd/lucene/org/apache/lucene/search/SimilarityProvider.html

SOLR-2338 then added the ability to configure (in schema.xml)
<similarity/> instances per fieldType, using a SolrSimilarityProvider that
is managed by Solr's internal "IndexSchema".  It also added a new
top-level <similarityProvider/> declaration to allow users to define a
SimilarityProviderFactory for specifying a complete SolrSimilarityProvider
to handle the other non-field-specific methods.

The reason SolrSimilarityProvider exists (and has a "final" impl of
get(String field):Similarity) is to ensure that fieldType-specific
<similarity/> declarations will be respected.  The "global" <similarity/>
(factory) declaration from older versions of Solr is also still supported
for people who want to change the Similarity for all fields, but don't
want to deal with writing a SimilarityProviderFactory.
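
To make the "final get(field)" idea concrete, here is a rough,
self-contained Java sketch of the pattern.  Everything in it is a
hypothetical stand-in written for illustration (the Similarity interface,
SchemaBackedProvider, FlatCoordProvider, register(), and the coord()
signature are assumptions); it does not use the real Lucene or Solr
classes.  It only shows why a final per-field lookup keeps fieldType
settings intact while a subclass is still free to change the
non-field-specific behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-ins only: these are NOT the real Lucene/Solr classes.
class PerFieldSimilaritySketch {

    /** Hypothetical per-field scoring policy (stand-in for a Similarity). */
    interface Similarity {
        String name();
    }

    /** Hypothetical app-wide provider (stand-in for SolrSimilarityProvider). */
    static class SchemaBackedProvider {
        private final Map<String, Similarity> perField = new HashMap<>();
        private final Similarity fallback;

        SchemaBackedProvider(Similarity fallback) {
            this.fallback = fallback;
        }

        /** Mimics a <similarity/> declared for one field's type in the schema. */
        void register(String field, Similarity similarity) {
            perField.put(field, similarity);
        }

        /** final: subclasses cannot bypass the schema's per-field choices. */
        public final Similarity get(String field) {
            Similarity configured = perField.get(field);
            return configured != null ? configured : fallback;
        }

        /** A "global", non-field-specific hook that subclasses may override. */
        public float coord(int overlap, int maxOverlap) {
            return overlap / (float) maxOverlap;
        }
    }

    /** What a custom provider factory might hand back: global tweaks only. */
    static class FlatCoordProvider extends SchemaBackedProvider {
        FlatCoordProvider(Similarity fallback) {
            super(fallback);
        }

        @Override
        public float coord(int overlap, int maxOverlap) {
            return 1.0f; // ignore how many query terms matched
        }
    }

    public static void main(String[] args) {
        SchemaBackedProvider provider = new FlatCoordProvider(() -> "default");
        provider.register("title", () -> "custom-for-title");

        // Per-field lookup honors the registration, else falls back.
        System.out.println(provider.get("title").name());
        System.out.println(provider.get("body").name());
        // The global hook comes from the subclass.
        System.out.println(provider.coord(1, 3));
    }
}
```

In this sketch the subclass can only change coord(); the per-field lookup
stays under the control of whatever was registered, which is the same
division of labor described above.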

This is all really new stuff on trunk, so it isn't fully documented yet, 
but you can see configs and tests demonstrating it if you look at the 
patch, or the commit info for the issue...

http://svn.apache.org/viewvc/lucene/dev/trunk/solr/src/test-files/solr/conf/schema.xml?p2=%2Flucene%2Fdev%2Ftrunk%2Fsolr%2Fsrc%2Ftest-files%2Fsolr%2Fconf%2Fschema.xml&p1=%2Flucene%2Fdev%2Ftrunk%2Fsolr%2Fsrc%2Ftest-files%2Fsolr%2Fconf%2Fschema.xml&r1=1087430&r2=1087429&view=diff&pathrev=1087430
http://svn.apache.org/viewvc/lucene/dev/trunk/solr/src/test/org/apache/solr/schema/TestPerFieldSimilarity.java?view=markup&pathrev=1087430

Does that all make sense?

If you're still having trouble getting per-field similarity stuff to work,
the best "next step" would probably be to open a Jira issue and post a
patch showing some simple changes that demonstrate your problem -- either
against the existing tests with new assertions that fail, or against the
example configs with a description of what URLs you hit, what you got, and
what you expected to get.



-Hoss