Posted to solr-dev@lucene.apache.org by "Mike Klaas (JIRA)" <ji...@apache.org> on 2007/11/01 23:42:50 UTC

[jira] Closed: (SOLR-375) SpellCheckerRequestHandler improvements to handle multiWords and identify if a word is spelled correctly

     [ https://issues.apache.org/jira/browse/SOLR-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Klaas closed SOLR-375.
---------------------------

    Resolution: Invalid
      Assignee: Mike Klaas

Scott, I worked with Mike to produce a patch that integrates all the new features (frequency, multiwords, thresholding, etc.) into a single patch in SOLR-395.

> SpellCheckerRequestHandler improvements to handle multiWords and identify if a word is spelled correctly
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-375
>                 URL: https://issues.apache.org/jira/browse/SOLR-375
>             Project: Solr
>          Issue Type: Improvement
>          Components: spellchecker
>    Affects Versions: 1.2
>         Environment: Tested using: Windows XP, Apache Tomcat v5.5.23, Java JDK 1.5.0_12, Solr v1.2
>            Reporter: Scott Tabar
>            Assignee: Mike Klaas
>             Fix For: 1.3
>
>         Attachments: JIRA_SOLR-375.diff
>
>
> The current implementation of SpellCheckerRequestHandler has some limitations:
> 1. It does not indicate whether a word is spelled correctly (that is, whether it matches a term in its index)
>   a. If a word is spelled correctly, the correct spelling is not included in the suggestion list, so the suggestions cannot be used to deduce whether the word is correct
>   b. If the word does not exist in the index and there are no suggestions, the suggestion list is empty
> 2. It does not support multiple words
> I have made some changes to this class that address these limitations:
> 1. The key/value pair exists=true/false has been added to state clearly whether the word is in the index.
> 2. The key/value pair words=_words_to_be_checked_ has been added to identify the original word(s) that were checked and which word(s) the suggestion list applies to.  This becomes more important with support for multiple words.
> 3. If the query string contains the parameter multiWords=true, support for multiple words is enabled.
>   a. Multiple words are defined by the value of q and are separated by either a space or +
>   b. Each word has its own entry in a NamedList object, which groups all result attributes for that word: words=, exist=, and suggestions=
>  
> My goal is that these changes should not affect existing implementations of the spell checker within Solr.
> The format of the multiWords output should be easy to consume from Prototype when the output type is JSON.
> I have made the changes.  I still need to do some basic testing to ensure everything works as intended, then I will commit to SVN (within 24 hours?).  When I commit, I will also add more JavaDocs to the class, and also try to attach more comments to this JIRA.
>  
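
For readers who want a concrete picture of the multiWords behaviour described in the quoted issue, here is a minimal sketch in plain Java. It is not the code from JIRA_SOLR-375.diff and it does not use any Solr classes: the class name MultiWordSpellCheckSketch, the check method, the in-memory Set standing in for the spelling index, and the caller-supplied suggester function are all assumptions made for illustration. The key names words, exist and suggestions follow the per-word list in the description.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Illustrative sketch only; hypothetical names, not the patch attached to SOLR-375.
class MultiWordSpellCheckSketch {

    // Builds one entry per checked word, grouping words, exist and suggestions.
    // 'index' is a stand-in for the spelling index; 'suggester' is a stand-in
    // for whatever produces suggestions for a single word.
    static List<Map<String, Object>> check(String q, boolean multiWords,
                                           Set<String> index,
                                           Function<String, List<String>> suggester) {
        // With multiWords enabled, split q on runs of spaces or '+';
        // otherwise treat the whole value of q as a single word.
        String[] words = multiWords
                ? q.trim().split("[ +]+")
                : new String[] { q.trim() };

        List<Map<String, Object>> results = new ArrayList<>();
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // a leading separator yields an empty first token
            }
            // LinkedHashMap keeps the keys in insertion order.
            Map<String, Object> entry = new LinkedHashMap<>();
            entry.put("words", word);
            entry.put("exist", index.contains(word));
            entry.put("suggestions", suggester.apply(word));
            results.add(entry);
        }
        return results;
    }

    public static void main(String[] args) {
        Set<String> index = Set.of("solr", "lucene");
        // Toy suggester: knows exactly one misspelling.
        Function<String, List<String>> suggester =
                w -> w.equals("lucen") ? List.of("lucene") : List.<String>of();

        for (Map<String, Object> entry : check("solr+lucen", true, index, suggester)) {
            System.out.println(entry);
        }
    }
}
```

Keeping one entry per word means a client can read exist directly instead of inferring correctness from an empty suggestion list, which is the ambiguity the issue describes.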
