Posted to dev@nutch.apache.org by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:37:06 UTC
[jira] [Closed] (NUTCH-72) Query basic filter with correction feature
[ https://issues.apache.org/jira/browse/NUTCH-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Markus Jelsma closed NUTCH-72.
------------------------------
Resolution: Won't Fix
> Query basic filter with correction feature
> ------------------------------------------
>
> Key: NUTCH-72
> URL: https://issues.apache.org/jira/browse/NUTCH-72
> Project: Nutch
> Issue Type: New Feature
> Components: searcher
> Environment: lucene
> Reporter: Christophe Noel
> Attachments: querycorrectionplugin.zip
>
>
> This plugin extends the query-basic plugin with a correction feature.
> Lucene includes a FuzzyQuery feature, which searches not only for exact term matches but also for very similar terms.
> This plugin should be used instead of query-basic by people looking for an easy way to correct users' queries.
> The Correction Query Plugin can be used in two ways:
> Solution 1: To search for very similar terms, add autocorrectionmod as the first term of the query (example: 'nutch engine' -> 'autocorrectionmod nutch engine').
> Solution 2: Create a new search.jsp page that includes a "correction" checkbox (<input type="checkbox" name="autocorrection" value="true"> can automatically add 'autocorrectionmod' as the first term of the query).
> FuzzyQuery has one big problem: it is very slow on a large index!
> So the Correction Query Plugin works under the following constraints:
> - it is not suitable for big indexes
> - it only works for words of 5 characters or more
> - it only looks for words whose first 2 characters match (to improve performance this should be set to 3/4)
> - it only accepts suffixes with at least a 65% match (the algorithm is Levenshtein)
> Please give your opinion about it.
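
The rules listed in the description (marker term, minimum word length, shared prefix, 65% suffix match) can be illustrated with a small standalone sketch. This is not the code from querycorrectionplugin.zip, which is not shown here, and it does not call Lucene. The class name `CorrectionSketch`, the method names, the inclusive threshold, and the choice to normalize the Levenshtein distance by the longer suffix are all assumptions made for illustration.

```java
public class CorrectionSketch {
    // Values taken from the issue description.
    static final String MARKER = "autocorrectionmod";
    static final int MIN_WORD_LENGTH = 5;
    static final int PREFIX_LENGTH = 2;
    static final double MIN_SIMILARITY = 0.65;

    /** True if the query starts with the marker term (hypothetical helper). */
    static boolean hasMarker(String query) {
        String q = query.trim();
        return q.equals(MARKER) || q.startsWith(MARKER + " ");
    }

    /** Returns the query without the leading marker term, if present. */
    static String stripMarker(String query) {
        return hasMarker(query)
                ? query.trim().substring(MARKER.length()).trim()
                : query;
    }

    /** Levenshtein distance, computed with two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Applies the rules from the description to one index term.
     * Assumption: similarity = 1 - distance / length of the longer suffix.
     */
    static boolean isCandidate(String queryTerm, String indexTerm) {
        if (queryTerm.length() < MIN_WORD_LENGTH) {
            return false; // short words are never corrected
        }
        if (indexTerm.length() < PREFIX_LENGTH
                || !indexTerm.startsWith(queryTerm.substring(0, PREFIX_LENGTH))) {
            return false; // the first characters must match exactly
        }
        String querySuffix = queryTerm.substring(PREFIX_LENGTH);
        String indexSuffix = indexTerm.substring(PREFIX_LENGTH);
        int longest = Math.max(querySuffix.length(), indexSuffix.length());
        double similarity =
                1.0 - (double) levenshtein(querySuffix, indexSuffix) / longest;
        return similarity >= MIN_SIMILARITY;
    }

    public static void main(String[] args) {
        String query = "autocorrectionmod nutch engine";
        System.out.println(hasMarker(query));
        System.out.println(stripMarker(query));
        System.out.println(isCandidate("engine", "engines"));
        System.out.println(isCandidate("nutch", "notch"));
    }
}
```

Requiring an exact prefix match is what keeps the number of index terms to compare small; a longer prefix shrinks that set further, at the cost of missing typos near the start of a word.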
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira