Posted to java-user@lucene.apache.org by Christian Aschoff <ch...@uni-ulm.de> on 2007/10/12 15:48:23 UTC
Problems with stemming/SpellChecker
Hi,
I tried to implement a 'did you mean' function (and succeeded, in a
way). But the hints from the SpellChecker are the stemmed versions of
the keywords.
For example, a search for the misspelled word 'wasseraalfingen' yields
the hint 'wasseralfing', but it should be 'wasseralfingen'. My first
attempt was to use a field with the option 'NO_NORMS', but that did
not work.
[...]
indexWriter = new IndexWriter(MiscConstants.luceneDir,
    new GermanAnalyzer(), create);
[...]
Field didyoumean = new Field(LuceneFieldNames.didyoumean,
    content.toString(), Field.Store.YES,
    Field.Index.NO_NORMS, Field.TermVector.NO);
document.add(didyoumean);
[...]
IndexReader indexReader = IndexReader.open(MiscConstants.luceneDir);
SpellChecker spellChecker = new SpellChecker(
    FSDirectory.getDirectory(MiscConstants.luceneDYMDir));
spellChecker.indexDictionary(
    new LuceneDictionary(indexReader, LuceneFieldNames.didyoumean));
indexReader.close();
[...]
Does anyone have a suggestion for how I can get 'unstemmed' hints from
SpellChecker? Do I have to create some kind of 'unstemmed' index just
for building the SpellChecker's index?
Regards,
Christian Aschoff
---
Dipl. Ing. (FH) Christian Aschoff
Office:
Universität Ulm
Kommunikations- und Informationszentrum
Abt. Informationssysteme
Raum O26/5403
Albert-Einstein-Allee 11
89081 Ulm
Tel. 0731 50-22432
Fax. 0731 50-22471
christian.aschoff@uni-ulm.de
Private:
Fabristr. 13
89075 Ulm
Deutschland/Old Europe
Tel. 0731 602 803 60
Fax. 0731 602 803 61
Mob. 0171 272 03 04
caschoff@mac.com
Help out: www.meyers-konversationslexikon.de
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Problems with stemming/SpellChecker
Posted by Daniel Naber <lu...@danielnaber.de>.
On Saturday 13 October 2007 07:57, Christian Aschoff wrote:
> But as far as I can see (in the API docs), the GermanAnalyzer is attached
> to the IndexWriter; I can't find a way to attach an analyzer to a
> single field... Or am I missing something?
See PerFieldAnalyzerWrapper.
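The idea behind it can be sketched without Lucene on the classpath. In Lucene itself, PerFieldAnalyzerWrapper is constructed with a default analyzer and given per-field overrides via addAnalyzer(fieldName, analyzer). The following is only a toy model of that dispatch, not Lucene code: the class PerFieldAnalysisSketch, its nested PerFieldAnalyzer, and both analyze functions are hypothetical, and the "stemmer" merely strips a trailing "en" as a stand-in for what GermanAnalyzer really does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

public class PerFieldAnalysisSketch {

    /** Lower-cases and splits on non-letters, no stemming (rough stand-in for StandardAnalyzer). */
    static List<String> plainAnalyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Toy stemmer: strips a trailing "en". Assumption: only a stand-in for GermanAnalyzer. */
    static List<String> stemmingAnalyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : plainAnalyze(text)) {
            boolean strip = token.length() > 4 && token.endsWith("en");
            tokens.add(strip ? token.substring(0, token.length() - 2) : token);
        }
        return tokens;
    }

    /** Picks an analyzer by field name and falls back to a default for all other fields. */
    static class PerFieldAnalyzer {
        private final Function<String, List<String>> defaultAnalyzer;
        private final Map<String, Function<String, List<String>>> overrides = new HashMap<>();

        PerFieldAnalyzer(Function<String, List<String>> defaultAnalyzer) {
            this.defaultAnalyzer = defaultAnalyzer;
        }

        void addAnalyzer(String field, Function<String, List<String>> analyzer) {
            overrides.put(field, analyzer);
        }

        List<String> analyze(String field, String text) {
            return overrides.getOrDefault(field, defaultAnalyzer).apply(text);
        }
    }

    public static void main(String[] args) {
        // Stemming stays the default; only the suggestion field is left unstemmed.
        PerFieldAnalyzer analyzer = new PerFieldAnalyzer(PerFieldAnalysisSketch::stemmingAnalyze);
        analyzer.addAnalyzer("didyoumean", PerFieldAnalysisSketch::plainAnalyze);

        System.out.println(analyzer.analyze("content", "Wasseralfingen"));    // [wasseralfing]
        System.out.println(analyzer.analyze("didyoumean", "Wasseralfingen")); // [wasseralfingen]
    }
}
```

All other fields keep the stemming default, so nothing changes for them. One caveat, as far as the Lucene 2.x javadoc describes it: Field.Index.NO_NORMS indexes the value without running any analyzer, so the suggestion field would probably need Field.Index.TOKENIZED for a per-field analyzer to take effect.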
Regards
Daniel
--
http://www.danielnaber.de
Re: Problems with stemming/SpellChecker
Posted by Christian Aschoff <ch...@uni-ulm.de>.
But as far as I can see (in the API docs), the GermanAnalyzer is attached
to the IndexWriter; I can't find a way to attach an analyzer to a
single field... Or am I missing something? (There are tons of other
fields in the index where GermanAnalyzer fits perfectly.)
On 12.10.2007 at 19:01, Daniel Naber wrote:
> On Friday 12 October 2007 15:48, Christian Aschoff wrote:
>
>> indexWriter = new IndexWriter(MiscConstants.luceneDir,
>> new GermanAnalyzer(), create);
>> [...]
>
> The problem is not NO_NORMS but GermanAnalyzer. Try
> StandardAnalyzer on the
> field you get the suggestions from.
>
> Regards
> Daniel
>
> --
> http://www.danielnaber.de
Re: Problems with stemming/SpellChecker
Posted by Daniel Naber <lu...@danielnaber.de>.
On Friday 12 October 2007 15:48, Christian Aschoff wrote:
> indexWriter = new IndexWriter(MiscConstants.luceneDir,
> new GermanAnalyzer(), create);
> [...]
The problem is not NO_NORMS but GermanAnalyzer. Try StandardAnalyzer on the
field you get the suggestions from.
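The reason is that LuceneDictionary reads the terms exactly as they were written to the index, so a dictionary built from stemmed terms can only suggest stems. The sketch below illustrates this with plain Java and no Lucene: the class StemmedDictionarySketch, its methods, and the second dictionary term "ulm" are hypothetical. It picks the nearest term by Levenshtein distance only, whereas the real SpellChecker works on its own n-gram index, so treat it as an illustration of the effect rather than of the implementation.

```java
import java.util.Arrays;
import java.util.List;

public class StemmedDictionarySketch {

    /** Classic dynamic-programming Levenshtein edit distance, using two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the dictionary term closest to the misspelled word (first one wins on ties). */
    static String suggest(String misspelled, List<String> dictionary) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String term : dictionary) {
            int distance = editDistance(misspelled, term);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = term;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Terms as a stemming analyzer would have written them to the index.
        List<String> stemmedTerms = Arrays.asList("wasseralfing", "ulm");
        // Terms as a non-stemming analyzer would have written them.
        List<String> unstemmedTerms = Arrays.asList("wasseralfingen", "ulm");

        System.out.println(suggest("wasseraalfingen", stemmedTerms));   // wasseralfing
        System.out.println(suggest("wasseraalfingen", unstemmedTerms)); // wasseralfingen
    }
}
```

With the stemmed terms, the best available match for 'wasseraalfingen' is the stem 'wasseralfing'; only the unstemmed term list can return 'wasseralfingen'.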
Regards
Daniel
--
http://www.danielnaber.de