Posted to java-user@lucene.apache.org by Ken Krugler <kk...@transpac.com> on 2007/10/18 03:44:35 UTC

Re: alternative scoring algorithm for PhraseQuery

Hi Philipp,

At 10:49 pm +0100 3/7/07, Paul Elschot wrote:
>On Wednesday 07 March 2007 18:12, Philipp Nanz wrote:
>>  Thanks for your answers. Your input is really appreciated :-)
>>
>>  @Paul Elschot:
>>  Thanks for the hint. I guess I could use coord() to penalize missing
>>  terms like this:
>>
>>  Query: a b c d
>>  Doc A: a b c d => sloppyFreq(0) * coord(4, 4) = 1
>>  Doc B: a b c => sloppyFreq(0) * coord(3, 4) = 0,75
>>
>>  Doc A would score higher. I guess that might be a valid solution.
>>
>>  There is a drawback though, i.e. sloppyFreq(1) * coord(4, 4) = 0,5
>>
>>  So a perfect match with one insertion would score less than a 3 of 4
>>  match with no slop.
>
>Your examples are based on DefaultSimilarity.
>With a Similarity in your Scorer you can leave the tradeoff between these
>factors to the user of your query by letting them provide the Similarity
>at query time.

[snip]
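
For anyone following along, here is a minimal standalone sketch of the
arithmetic quoted above. It does not use Lucene itself: the class and
interface names (PhraseScoreSketch, Sim, DefaultLike, GentleSlop) are
hypothetical, and the two formulas in DefaultLike are assumed to mirror
DefaultSimilarity's sloppyFreq() and coord(). GentleSlop is only an
illustration of the kind of tradeoff a caller-supplied Similarity could
make; its 0.1 weight is an arbitrary choice, not something from the thread.

```java
// Standalone sketch with no Lucene dependency. All names here are
// hypothetical and exist only to illustrate the quoted arithmetic.
class PhraseScoreSketch {

    // Hypothetical stand-in for the two Similarity factors discussed above.
    interface Sim {
        float sloppyFreq(int distance);
        float coord(int overlap, int maxOverlap);
    }

    // Assumed to mirror DefaultSimilarity:
    //   sloppyFreq(distance) = 1 / (distance + 1)
    //   coord(overlap, max)  = overlap / max
    static class DefaultLike implements Sim {
        public float sloppyFreq(int distance) {
            return 1.0f / (distance + 1);
        }
        public float coord(int overlap, int maxOverlap) {
            return overlap / (float) maxOverlap;
        }
    }

    // Illustrative alternative: penalizes slop far less than a missing term.
    // The 0.1 weight is an arbitrary assumption.
    static class GentleSlop extends DefaultLike {
        @Override
        public float sloppyFreq(int distance) {
            return 1.0f / (1.0f + 0.1f * distance);
        }
    }

    // Product of the two factors, as in the quoted examples.
    static float score(Sim sim, int distance, int overlap, int maxOverlap) {
        return sim.sloppyFreq(distance) * sim.coord(overlap, maxOverlap);
    }

    public static void main(String[] args) {
        Sim def = new DefaultLike();
        Sim gentle = new GentleSlop();

        // Query: a b c d
        System.out.println("default, Doc A (4 of 4, slop 0): " + score(def, 0, 4, 4));
        System.out.println("default, Doc B (3 of 4, slop 0): " + score(def, 0, 3, 4));
        System.out.println("default, 4 of 4 with slop 1:     " + score(def, 1, 4, 4));
        System.out.println("gentle,  4 of 4 with slop 1:     " + score(gentle, 1, 4, 4));
    }
}
```

With the default-like factors, the full match with one insertion (0.5)
ranks below the 3 of 4 match (0.75), which is the drawback Philipp
describes. With the gentler slop penalty the order flips, which is the
sort of choice Paul suggests leaving to whoever supplies the Similarity.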

I'm curious if Paul's input here helped you finish your 
FuzzyPhraseQuery (or FuzzySpanQuery) addition to Lucene.

Thanks,

-- Ken
-- 
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"If you can't find it, you can't fix it"
