Posted to dev@lucene.apache.org by "Grant Ingersoll (JIRA)" <ji...@apache.org> on 2010/08/17 22:31:18 UTC

[jira] Updated: (LUCENE-2479) need the ability to also sort SpellCheck results by freq, instead of just by Edit Distance+freq

     [ https://issues.apache.org/jira/browse/LUCENE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated LUCENE-2479:
------------------------------------

    Attachment: LUCENE-2479.patch

Patch that implements the comparator approach.  I didn't incorporate the freq into the scoring because that would mean looking up the freq for every suggestion, which I think would hurt performance badly.
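
For readers who want to see what the comparator approach amounts to, here is a minimal, self-contained sketch.  It is not the code from the attached patch: the Suggestion class, the two comparator constants and the rank() helper are hypothetical names made up for illustration, and "score" is assumed to be a similarity value where higher means closer to the misspelled term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SuggestionSortSketch {

    // Hypothetical stand-in for a spelling suggestion (not a class from the patch).
    static final class Suggestion {
        final String text;
        final float score; // assumed: similarity to the misspelled term, higher is closer
        final int freq;    // assumed: how often the suggested term occurs in the index

        Suggestion(String text, float score, int freq) {
            this.text = text;
            this.score = score;
            this.freq = freq;
        }
    }

    // Ordering described in the issue as the current one:
    // best edit-distance score first, freq only breaks ties.
    static final Comparator<Suggestion> BY_SCORE_THEN_FREQ = (a, b) -> {
        int c = Float.compare(b.score, a.score);
        return c != 0 ? c : Integer.compare(b.freq, a.freq);
    };

    // Ordering requested in the issue: most frequent first, score only breaks ties.
    static final Comparator<Suggestion> BY_FREQ_THEN_SCORE = (a, b) -> {
        int c = Integer.compare(b.freq, a.freq);
        return c != 0 ? c : Float.compare(b.score, a.score);
    };

    // Sorts a copy of the input with the chosen comparator and returns the terms in order.
    static List<String> rank(List<Suggestion> suggestions, Comparator<Suggestion> cmp) {
        List<Suggestion> copy = new ArrayList<>(suggestions);
        copy.sort(cmp);
        List<String> out = new ArrayList<>();
        for (Suggestion s : copy) {
            out.add(s.text);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Suggestion> in = Arrays.asList(
            new Suggestion("lucene", 0.8f, 5),
            new Suggestion("lucent", 0.8f, 50),
            new Suggestion("lucerne", 0.6f, 500));
        System.out.println(rank(in, BY_SCORE_THEN_FREQ)); // [lucent, lucene, lucerne]
        System.out.println(rank(in, BY_FREQ_THEN_SCORE)); // [lucerne, lucent, lucene]
    }
}
```

The point of the sketch is that the caller picks the ordering by passing a comparator, so the freq only has to be known for the suggestions being compared and the distance function itself stays untouched.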

I also refactored the Solr SpellCheckComponent a little so that it no longer carries copy-and-pasted versions of the SuggestWord* classes.  I intend to commit today or tomorrow.  All tests pass and the change is backward compatible.  I will also backport it to 3.x.

> need the ability to also sort SpellCheck results by freq, instead of just by Edit Distance+freq
> -----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2479
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2479
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/spellchecker
>         Environment: all
>            Reporter: Gerald DeConto
>            Assignee: Grant Ingersoll
>         Attachments: LUCENE-2479.patch
>
>
> This issue was first noticed and reported in this Solr thread: http://lucene.472066.n3.nabble.com/spellcheck-issues-td489776.html#a489788
> Basically, there are situations where it would be useful to sort by freq first, instead of the current "sort by edit distance, and then subsort by freq if edit distance is equal".
> The author of the thread suggested: "What I think would work even better than allowing a custom compareTo function would be to incorporate the frequency directly into the distance function.  This would allow for greater control over the trade-off between frequency and edit distance."
> However, custom compareTo functions are not always possible (e.g. if a certain version of Lucene must be used because it was released with Solr), and incorporating freq directly into the distance function may be overkill, depending on the implementation.
> It is suggested that we make a simple modification to the existing compareTo function in Lucene so that users can specify whether they want the existing sort method or a sort by freq.
