Posted to solr-dev@lucene.apache.org by "Mike Klaas (JIRA)" <ji...@apache.org> on 2007/11/01 23:42:50 UTC
[jira] Closed: (SOLR-375) SpellCheckerRequestHandler improvements
to handle multiWords and identify if a word is spelled correctly
[ https://issues.apache.org/jira/browse/SOLR-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mike Klaas closed SOLR-375.
---------------------------
Resolution: Invalid
Assignee: Mike Klaas
Scott, I worked with Mike to produce a patch that integrates all the new features (frequency, multiwords, thresholding, etc.) into a single patch in SOLR-395.
> SpellCheckerRequestHandler improvements to handle multiWords and identify if a word is spelled correctly
> --------------------------------------------------------------------------------------------------------
>
> Key: SOLR-375
> URL: https://issues.apache.org/jira/browse/SOLR-375
> Project: Solr
> Issue Type: Improvement
> Components: spellchecker
> Affects Versions: 1.2
> Environment: Tested using: Windows XP, Apache Tomcat v5.5.23, Java JDK 1.5.0_12, Solr v1.2
> Reporter: Scott Tabar
> Assignee: Mike Klaas
> Fix For: 1.3
>
> Attachments: JIRA_SOLR-375.diff
>
>
> The current implementation of SpellCheckerRequestHandler has some limitations:
> 1. It does not identify if a word is spelled correctly (a match to its index)
> a. If a word is spelled correctly, the correct spelling is not included in the suggested list, so the suggestions cannot be used to deduce if the word is correct
> b. If the word does not exist in the index and there are no suggestions, the suggestion list is empty
> 2. No support for multiple words
> I have made some changes to this class that address these limitations:
> 1. The key-value pair exists=true/false has been added to make it clear whether the word is in the index.
> 2. The key-value pair words=_words_to_be_checked_ identifies the original word(s) that were checked and that the suggestion list applies to. This becomes more important with support for multiple words.
> 3. If the query string contains the parameter multiWords=true, support for multiple words is enabled.
> a. Multiple words are defined by the value of q and are separated by either a space or +
> b. Each word has its own entry in a NamedList object so as to group all result attributes back to that word: words=, exist=, and suggestions=
>
> My intended goal is that these changes should not affect existing implementations of the spell checker within Solr.
> The format of the multiWords support should be easy to consume from Prototype if the output type is JSON.
> I have made the changes. I still need to do some basic testing to ensure all is working as it is intended, then I will commit to SVN (within 24 hours?). When I commit, I will also add more JavaDocs to the class, and also try to attach more comments to this JIRA.
>
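For illustration only, the multiWords behaviour described in the quoted text could look roughly like the sketch below. It is not the code from the attached patch. It uses plain Java collections in place of Solr's NamedList, the class and method names are hypothetical, and the suggest() method is a stand-in (it simply returns index terms that share the word's first letter), not the real spell checker. The key names words, exist and suggestions follow item 3b of the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; not part of Solr or of the SOLR-375 patch.
public class MultiWordSpellCheckSketch {

    /** Splits the q value on spaces or '+' characters, dropping empty tokens. */
    static List<String> splitWords(String q) {
        List<String> words = new ArrayList<>();
        for (String token : q.split("[ +]+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    /** Placeholder suggester: other index terms that start with the same letter. */
    static List<String> suggest(String word, Set<String> index) {
        List<String> out = new ArrayList<>();
        for (String term : index) {
            if (!term.isEmpty() && !term.equals(word) && term.charAt(0) == word.charAt(0)) {
                out.add(term);
            }
        }
        return out;
    }

    /** Builds one entry per word, grouping words, exist and suggestions together. */
    static List<Map<String, Object>> check(String q, Set<String> index) {
        List<Map<String, Object>> results = new ArrayList<>();
        for (String word : splitWords(q)) {
            Map<String, Object> entry = new LinkedHashMap<>();
            entry.put("words", word);
            entry.put("exist", index.contains(word));
            entry.put("suggestions", suggest(word, index));
            results.add(entry);
        }
        return results;
    }

    public static void main(String[] args) {
        // A tiny in-memory stand-in for the spelling index.
        Set<String> index = new TreeSet<>(Arrays.asList("solr", "search", "lucene"));
        for (Map<String, Object> entry : check("solr+serch lucen", index)) {
            System.out.println(entry);
        }
    }
}
```

Keeping one entry per word means a client can read exist for each word on its own, without having to infer correctness from an empty or non-empty suggestion list.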