Posted to dev@lucenenet.apache.org by "Ben West (JIRA)" <ji...@apache.org> on 2010/05/03 20:25:55 UTC
[jira] Issue Comment Edited: (LUCENENET-366) Spellchecker issues
[ https://issues.apache.org/jira/browse/LUCENENET-366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863448#action_12863448 ]
Ben West edited comment on LUCENENET-366 at 5/3/10 2:24 PM:
------------------------------------------------------------
Hey DIGY,
Java lucene doesn't have the duplicate checking - should I submit a bug to them?
EDIT: Bah, I take it back. It does. Will work on porting.
was (Author: xodarap):
Hey DIGY,
Java lucene doesn't have the duplicate checking - should I submit a bug to them?
> Spellchecker issues
> -------------------
>
> Key: LUCENENET-366
> URL: https://issues.apache.org/jira/browse/LUCENENET-366
> Project: Lucene.Net
> Issue Type: Bug
> Reporter: Ben West
> Priority: Minor
> Attachments: LuceneNet-SpellcheckFixes.patch
>
>
> There are several issues with the spellchecker:
> - It doesn't do duplicate checking across updates (so the same word is often indexed many, many times)
> - The n-gram fields are stored as well as indexed, which increases the size of the index by several orders of magnitude and provides no benefit
> - Some deprecated functions are used, which slows it down
> - Some methods aren't commented fully
> I will attach a patch that fixes these.
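
For illustration only, here is a minimal sketch of the duplicate checking described in the first item above: before a word is indexed, check whether it is already present, so that repeated updates do not add it again. It deliberately uses no Lucene or Lucene.Net APIs. The class DedupingWordIndex and its methods are hypothetical names, and an in-memory set stands in for the real spellchecker index. It is not the attached patch.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a spellchecker index; not a Lucene class.
class DedupingWordIndex {
    // Words already indexed. A real index would be queried instead.
    private final Set<String> indexed = new HashSet<>();

    // True if the word has already been indexed.
    boolean exists(String word) {
        return indexed.contains(word);
    }

    // Indexes every word not yet present and returns how many were added.
    int indexDictionary(List<String> words) {
        int added = 0;
        for (String word : words) {
            if (word == null || word.isEmpty()) {
                continue; // nothing to index
            }
            if (exists(word)) {
                continue; // duplicate from this or an earlier update
            }
            indexed.add(word);
            added++;
        }
        return added;
    }

    // Number of distinct words indexed so far.
    int size() {
        return indexed.size();
    }
}
```

Calling indexDictionary twice with overlapping word lists adds each word only once, which is the behavior the issue says is missing across updates.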