Posted to dev@lucene.apache.org by Timo Nentwig <lu...@nitwit.de> on 2007/12/31 16:01:11 UTC
Fuzzy makes no sense for short tokens
Hi!
It generally makes no sense to run a fuzzy search on short tokens, because
changing even a single character already produces a large edit distance
relative to the token's length. So fuzzy search only makes sense in this case:
if( token.length() > 1f / (1f - minSimilarity) )
E.g. changing one character in a 3-letter token (foo) leaves a similarity of
about 0.67 (1 - 1/3). If minSimilarity (which is by default: 0.5 :-) is higher
than that, only the exact term can still match, so we can save all the
expensive rewrite() logic.
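
A minimal sketch of the length check above, assuming similarity is computed as 1 - edits / tokenLength with no fixed prefix. The class and method names (FuzzyLengthCheck, fuzzyIsWorthwhile, similarityAfterEdits) are hypothetical and not part of the Lucene API; they only illustrate the arithmetic.

```java
// Hypothetical helper class for illustration only; not Lucene code.
final class FuzzyLengthCheck {

    // True if a token of this length can absorb at least one edit and still
    // score strictly above minSimilarity. This is the condition from the post:
    // token.length() > 1f / (1f - minSimilarity).
    // For minSimilarity == 1 the right-hand side is Infinity, so this returns false.
    public static boolean fuzzyIsWorthwhile(int tokenLength, float minSimilarity) {
        return tokenLength > 1f / (1f - minSimilarity);
    }

    // Assumed similarity model: 1 - edits / tokenLength.
    public static float similarityAfterEdits(int tokenLength, int edits) {
        return 1f - (float) edits / tokenLength;
    }

    public static void main(String[] args) {
        float minSimilarity = 0.5f; // the default mentioned in the post
        for (int len = 1; len <= 5; len++) {
            System.out.println("length=" + len
                    + " similarityAfterOneEdit=" + similarityAfterEdits(len, 1)
                    + " fuzzyIsWorthwhile=" + fuzzyIsWorthwhile(len, minSimilarity));
        }
    }
}
```

A caller could run this check before building the fuzzy query and fall back to a plain term query when it returns false.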