Posted to solr-user@lucene.apache.org by Blargy <zm...@hotmail.com> on 2010/06/25 19:03:44 UTC

SweetSpotSimilarity

Would someone mind explaining how this differs from the DefaultSimilarity?
Also how would one replace the use of the DefaultSimilarity class with this
one? I can't seem to find any such configuration in solrconfig.xml.

Thanks
-- 
View this message in context: http://lucene.472066.n3.nabble.com/SweetSpotSimilarity-tp922546p922546.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SweetSpotSimilarity

Posted by Chris Hostetter <ho...@fucit.org>.
: Side question. How would I know if a configuration option can also take a
: factory class.. like in this instance?

by reading the example schema.xml...

 <!-- Similarity is the scoring routine for each document vs. a query.
      A custom similarity may be specified here, but the default is fine
      for most applications.  -->
 <!-- <similarity class="org.apache.lucene.search.DefaultSimilarity"/> -->
 <!-- ... OR ...
      Specify a SimilarityFactory class name implementation
      allowing parameters to be used.
 -->
 <!--
 <similarity class="com.example.solr.CustomSimilarityFactory">
   <str name="paramkey">param value</str>
 </similarity>
 -->





-Hoss


Re: SweetSpotSimilarity

Posted by Blargy <zm...@hotmail.com>.

iorixxx wrote:
> 
> CustomSimilarityFactory that extends
> org.apache.solr.schema.SimilarityFactory should do it. There is an example
> CustomSimilarityFactory.java under src/test/org...
> 

This is exactly what I was looking for... this is very similar ( no pun
intended ;) ) to the updateProcessorFactory configuration in
solrconfig.xml. The wiki should probably include this information.

Side question. How would I know if a configuration option can also take a
factory class.. like in this instance?

Re: SweetSpotSimilarity

Posted by Ahmet Arslan <io...@yahoo.com>.

> How would you configure the tfBaselineTfFactors and
> LengthNormFactors when
> configuring via schema.xml? 

CustomSimilarityFactory that extends org.apache.solr.schema.SimilarityFactory should do it. There is an example CustomSimilarityFactory.java under src/test/org...


      

Re: SweetSpotSimilarity

Posted by Blargy <zm...@hotmail.com>.

iorixxx wrote:
> 
> it is in schema.xml:
> 
> <similarity class="org.apache.lucene.misc.SweetSpotSimilarity"/>
> 

How would you configure the tfBaselineTfFactors and LengthNormFactors when
configuring via schema.xml? Do I have to create a subclass that hardcodes
these values?

Re: SweetSpotSimilarity

Posted by Ahmet Arslan <io...@yahoo.com>.
> Thanks. I'm guessing this is all or nothing, i.e. you can't
> use one similarity
> class for one request handler and another for a separate
> request handler. Is
> that correct?

Correct. A re-index is also required, because length norms are calculated and stored at index time.


      

Re: SweetSpotSimilarity

Posted by Blargy <zm...@hotmail.com>.

iorixxx wrote:
> 
> it is in schema.xml:
> 
> <similarity class="org.apache.lucene.misc.SweetSpotSimilarity"/>
> 

Thanks. I'm guessing this is all or nothing, i.e. you can't use one similarity
class for one request handler and another for a separate request handler. Is
that correct?




Re: SweetSpotSimilarity

Posted by Ahmet Arslan <io...@yahoo.com>.
> Would someone mind explaining how this differs from the
> DefaultSimilarity?

The difference is length normalization. DefaultSimilarity punishes long documents: its norm keeps shrinking as the field gets longer.

"Sweet one computes to a constant norm for all lengths in
the [min,max] range (the "sweet spot"), and smaller norm
values for lengths out of this range. Documents shorter or
longer than the sweet spot range are "punished""

Section 4.1 http://trec.nist.gov/pubs/trec16/papers/ibm-haifa.mq.final.pdf
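
To make the quoted description concrete, here is a rough standalone sketch (no Lucene dependency) comparing the two length norms. It assumes the default norm is 1/sqrt(numTerms) and that the sweet-spot norm follows the formula given in the SweetSpotSimilarity javadoc, 1/sqrt(steepness * (|x-min| + |x-max| - (max-min)) + 1). The class name, method names and the min/max/steepness values are made up for illustration, and the single-byte encoding Lucene applies when it stores norms is ignored.

```java
public class SweetSpotNormSketch {

    // Default-style norm (assumed 1/sqrt(numTerms)): every extra term lowers the norm.
    static double defaultNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Sweet-spot norm: 1.0 for lengths inside [min, max], smaller outside.
    // min, max and steepness are hypothetical tuning values, not Solr defaults.
    static double sweetSpotNorm(int numTerms, int min, int max, double steepness) {
        // Zero inside the plateau, grows with the distance from it outside.
        double outside = Math.abs(numTerms - min) + Math.abs(numTerms - max) - (max - min);
        return 1.0 / Math.sqrt(steepness * outside + 1.0);
    }

    public static void main(String[] args) {
        int min = 3;
        int max = 10;
        double steepness = 0.5;
        for (int len : new int[] {1, 3, 5, 10, 20, 100}) {
            System.out.printf("len=%3d  default=%.3f  sweetSpot=%.3f%n",
                    len, defaultNorm(len), sweetSpotNorm(len, min, max, steepness));
        }
    }
}
```

With min=3 and max=10, lengths 3, 5 and 10 all get the same sweet-spot norm, while the default norm already differs between them. With min = max = 1 and steepness = 0.5 the sweet-spot formula reduces to 1/sqrt(x), i.e. the default behaviour.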

> Also how would one replace the use of the DefaultSimilarity
> class with this
> one? I can't seem to find any such configuration in
> solrconfig.xml.


it is in schema.xml:

<similarity class="org.apache.lucene.misc.SweetSpotSimilarity"/>