Posted to dev@lucene.apache.org by "Karl Wettin (JIRA)" <ji...@apache.org> on 2007/01/26 15:59:49 UTC
[jira] Updated: (LUCENE-786) Extended javadocs in spellchecker
[ https://issues.apache.org/jira/browse/LUCENE-786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karl Wettin updated LUCENE-786:
-------------------------------
Attachment: spellcheck_javadocs.diff
patch root is trunk/contrib/spellcheck
> Extended javadocs in spellchecker
> ---------------------------------
>
> Key: LUCENE-786
> URL: https://issues.apache.org/jira/browse/LUCENE-786
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Javadocs
> Affects Versions: 2.0.0
> Reporter: Karl Wettin
> Priority: Trivial
> Attachments: spellcheck_javadocs.diff
>
>
> Added some javadocs that explain why the spellchecker does not work as one might expect it to.
> http://www.nabble.com/SpellChecker%3A%3AsuggestSimilar%28%29-Question-tf3118660.html#a8640395
> > Without having looked at the code for a long time, I think the problem is what the
> > Lucene scoring considers to be best. First the grams are searched, resulting in a number
> > of hits. Then the edit-distance is calculated on each hit. "Genetics" is apparently the
> > third most similar hit according to Lucene, but the best according to Levenshtein.
> >
> > I.e. Lucene does not use edit-distance as similarity. You need to get a bunch of best hits
> > in order to find the one with the smallest edit-distance.
> I took a look at the code, and my assessment seems to be right.
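
The behaviour described above can be illustrated with a small standalone sketch. It does not use the Lucene API or the spellchecker's actual code; the class name, the misspelled word "genetcs" and the hit list are all hypothetical, chosen only so that "genetics" is third in search-score order, as in the quoted thread. It shows why asking for only the single best hit can miss the word with the smallest edit distance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration, not the contrib/spellcheck implementation.
public class EditDistanceRerank {

    /** Plain dynamic-programming Levenshtein distance using two rolling rows. */
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Keeps only the first topN hits (assumed to be in search-score order),
     * then returns the one with the smallest edit distance to the word.
     * The sort is stable, so ties keep their search-score order.
     */
    public static String best(String word, List<String> hitsInScoreOrder, int topN) {
        int n = Math.min(topN, hitsInScoreOrder.size());
        List<String> top = new ArrayList<>(hitsInScoreOrder.subList(0, n));
        top.sort(Comparator.comparingInt(hit -> levenshtein(word, hit)));
        return top.isEmpty() ? null : top.get(0);
    }

    public static void main(String[] args) {
        // Made-up hit list: the order stands in for the n-gram search score.
        List<String> hits = Arrays.asList("generics", "kinetics", "genetics");
        System.out.println(best("genetcs", hits, 1));
        System.out.println(best("genetcs", hits, 3));
    }
}
```

With one hit requested, the edit-distance step has nothing to choose from and returns whatever the search ranked first. With three hits requested, "genetics" is among the candidates and wins on edit distance.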
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org