Posted to solr-user@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2010/07/09 22:59:40 UTC

Re: Custom PhraseQuery

: Query: "foo bar"
: Doc1: "foo bar baz"
: Doc2: "foo bar foo bar"
: 
: These two documents should be scored exactly the same. I accomplished the
: above in the "normal" query use-case by using the SweetSpotSimilarity class.

You can change this by subclassing SweetSpotSimilarity (or any Similarity 
class) and overriding the tf(float) function.  

tf(int) is called for terms, while tf(float) is called for phrases 
-- the float value is lower for phrases with a lot of slop, and higher for 
exact matches.

unfortunately, the input to tf(float) is lossy in accounting for docs 
that match the phrase multiple times ... the value of "1.0f" 
might mean it matches the phrase once exactly, or it might mean that it 
matches many times in a sloppy manner.

in your case, it sounds like you just want it to return "1" for any input 
except "0.0f"
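
A rough sketch of that mapping, written as a standalone class so it runs 
without Lucene on the classpath. The class name FlatPhraseTf is made up for 
illustration; in a real setup the same body would presumably go into a 
tf(float) override in your Similarity subclass.

```java
// Hypothetical stand-in: this is NOT a Lucene class. It only shows the
// mapping suggested above, which would live in a tf(float) override.
class FlatPhraseTf {

    /** Returns 0 when the phrase did not match at all, and 1 for any match. */
    static float tf(float freq) {
        return freq > 0.0f ? 1.0f : 0.0f;
    }

    public static void main(String[] args) {
        // One exact match, two exact matches, and a sloppy match all
        // collapse to the same tf value.
        System.out.println(tf(1.0f));
        System.out.println(tf(2.0f));
        System.out.println(tf(0.3f));
        // A phrase that never matched still contributes nothing.
        System.out.println(tf(0.0f));
    }
}
```

With this mapping, Doc1 and Doc2 get the same tf for the phrase no matter 
how many times it matches.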



-Hoss

Re: Custom PhraseQuery

Posted by Chris Hostetter <ho...@fucit.org>.

: It sounds like all I need to do is actually override tf(float) in the
: SweetSpotSimilarity class to delegate to baselineTF just like tf(int) does.
: Is this correct?

you have to decide how you want to map the float->int (ie: round, 
truncate, etc...) but otherwise: yes that should work fine.
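
One possible shape for that, again as a standalone sketch rather than real 
Lucene code. The class RoundedPhraseTf, the baselineTf stand-in and the 
BASE/MIN values are all assumptions made up for illustration; the stand-in 
only approximates a "flat up to a minimum, then sqrt" curve and is not 
copied from SweetSpotSimilarity. The sketch rounds, but clamps any nonzero 
input to at least 1 so a sloppy match does not round down to "no match".

```java
// Hypothetical stand-in: NOT a Lucene class. In practice tf(float) would be
// an override in a SweetSpotSimilarity subclass and would call the real
// baseline tf method instead of the simplified copy below.
class RoundedPhraseTf {

    // Illustrative values only; pick whatever your configuration uses.
    static final float BASE = 1.0f;
    static final float MIN = 5.0f;

    /** Simplified baseline curve: flat at BASE up to MIN, then sqrt growth. */
    static float baselineTf(float freq) {
        if (freq == 0.0f) {
            return 0.0f;
        }
        return (freq <= MIN) ? BASE : (float) Math.sqrt(freq + (BASE * BASE) - MIN);
    }

    /** Maps the phrase frequency to an int, then delegates like tf(int). */
    static float tf(float freq) {
        if (freq <= 0.0f) {
            return 0.0f;
        }
        // Round to the nearest int, but never let a real match become 0.
        int rounded = Math.max(1, Math.round(freq));
        return baselineTf(rounded);
    }

    public static void main(String[] args) {
        System.out.println(tf(1.0f)); // phrase matched once
        System.out.println(tf(2.0f)); // phrase matched twice
        System.out.println(tf(0.3f)); // one sloppy match
    }
}
```

Truncating instead of rounding would only change which fractional values 
fall into which bucket; the clamp matters more, since without it small 
sloppy frequencies would score as zero.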



-Hoss

Re: Custom PhraseQuery

Posted by Blargy <zm...@hotmail.com>.

Oh.. I didn't know about the different signatures of tf. Thanks for that
clarification.

It sounds like all I need to do is actually override tf(float) in the
SweetSpotSimilarity class to delegate to baselineTF just like tf(int) does.
Is this correct?

Thanks