Posted to java-user@lucene.apache.org by Ken Krugler <kk...@transpac.com> on 2007/10/18 03:44:35 UTC
Re: alternative scoring algorithm for PhraseQuery
Hi Philipp,
At 10:49 pm +0100 3/7/07, Paul Elschot wrote:
>On Wednesday 07 March 2007 18:12, Philipp Nanz wrote:
>> Thanks for your answers. Your input is really appreciated :-)
>>
>> @Paul Elschot:
>> Thanks for the hint. I guess I could use coord() to penalize missing
>> terms like this:
>>
>> Query: a b c d
>> Doc A: a b c d => sloppyFreq(0) * coord(4, 4) = 1
>> Doc B: a b c => sloppyFreq(0) * coord(3, 4) = 0.75
>>
>> Doc A would score higher. I guess that might be a valid solution.
>>
>> There is a drawback though, i.e. sloppyFreq(1) * coord(4, 4) = 0.5
>>
>> So a perfect match with one insertion would score less than a 3 of 4
>> match with no slop.
>
>Your examples are based on DefaultSimilarity.
>With a Similarity in your Scorer you can leave the tradeoff between these
>factors to the user of your query by letting them provide the Similarity
>at query time.
[snip]
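
For anyone following the arithmetic in the quoted examples, here is a
minimal standalone sketch. It does not use Lucene itself. It assumes
DefaultSimilarity computes sloppyFreq(distance) as 1 / (distance + 1)
and coord(overlap, maxOverlap) as overlap / maxOverlap, which matches
the numbers quoted above. The Factors interface, the
LenientSlopFactors class, and the 0.25 weight are hypothetical names
and values made up for illustration; they only show how a different
Similarity supplied at query time could shift the tradeoff.

```java
public class PhraseScoreSketch {

    /** Hypothetical stand-in for the two Similarity factors discussed above. */
    interface Factors {
        float sloppyFreq(int distance);
        float coord(int overlap, int maxOverlap);
    }

    /** Formulas assumed to match DefaultSimilarity (see the lead-in). */
    static class DefaultFactors implements Factors {
        public float sloppyFreq(int distance) {
            return 1.0f / (distance + 1);
        }
        public float coord(int overlap, int maxOverlap) {
            return overlap / (float) maxOverlap;
        }
    }

    /** Hypothetical alternative that penalizes slop more gently. */
    static class LenientSlopFactors extends DefaultFactors {
        @Override
        public float sloppyFreq(int distance) {
            // 0.25f is an arbitrary illustrative weight, not a Lucene constant.
            return 1.0f / (1.0f + 0.25f * distance);
        }
    }

    /** Product of the two factors, as in the quoted examples. */
    static float score(Factors f, int distance, int overlap, int maxOverlap) {
        return f.sloppyFreq(distance) * f.coord(overlap, maxOverlap);
    }

    public static void main(String[] args) {
        Factors def = new DefaultFactors();
        Factors lenient = new LenientSlopFactors();

        // Doc A: all four terms, no slop.
        System.out.println("default, 4 of 4, slop 0: " + score(def, 0, 4, 4));
        // Doc B: three of four terms, no slop.
        System.out.println("default, 3 of 4, slop 0: " + score(def, 0, 3, 4));
        // All four terms with one insertion.
        System.out.println("default, 4 of 4, slop 1: " + score(def, 1, 4, 4));
        // Same document under the hypothetical lenient factors.
        System.out.println("lenient, 4 of 4, slop 1: " + score(lenient, 1, 4, 4));
    }
}
```

With the assumed default formulas, the full match with one insertion
ranks below the 3-of-4 match, which is the drawback Philipp describes.
With the gentler slop penalty the order flips, which is the kind of
choice Paul suggests leaving to whoever supplies the Similarity.
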
I'm curious if Paul's input here helped you finish your
FuzzyPhraseQuery (or FuzzySpanQuery) addition to Lucene.
Thanks,
-- Ken
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"If you can't find it, you can't fix it"