Posted to dev@lucene.apache.org by "Michał Dybizbański (JIRA)" <ji...@apache.org> on 2011/06/27 22:19:48 UTC
[jira] [Issue Comment Edited] (LUCENE-2341) explore morfologik integration
[ https://issues.apache.org/jira/browse/LUCENE-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055729#comment-13055729 ]
Michał Dybizbański edited comment on LUCENE-2341 at 6/27/11 8:19 PM:
---------------------------------------------------------------------
Dawid, as you suggested, I've changed the interfaces of MorfologikAnalyzer and MorfologikFilter to account for the changes in Morfologik 1.5.2, namely the support for multiple dictionaries.
The constructors of both classes now accept a PolishStemmer.DICTIONARY (instead of a languageCode String, as in the previous patch). MorfologikFilter instantiates its own PolishStemmer, so each invocation of MorfologikAnalyzer.createComponents (which instantiates a MorfologikFilter) gets an individual PolishStemmer instance.
This way, sharing a MorfologikAnalyzer between threads is safe (even though MorfologikFilter itself isn't thread-safe), provided each thread obtains its own TokenStreamComponents through ReusableAnalyzerBase.createComponents. Is this always the case? Looking at other filters, they don't look thread-safe either.
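To make the thread-safety argument above concrete, here is a minimal sketch of the pattern in plain Java. It does not use Lucene or Morfologik: ToyStemmer, ToyFilter and ToyAnalyzer are hypothetical stand-ins for PolishStemmer, MorfologikFilter and MorfologikAnalyzer, and the ThreadLocal only models the assumption stated above, that each thread obtains its own components.

```java
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class PerThreadComponentsSketch {

    /** Hypothetical stand-in for a stemmer with mutable state; NOT thread-safe. */
    static final class ToyStemmer {
        private final StringBuilder scratch = new StringBuilder();

        String stem(String word) {
            // Reuses one internal buffer, so concurrent callers would corrupt each other.
            scratch.setLength(0);
            scratch.append(word.toLowerCase(Locale.ROOT));
            int last = scratch.length() - 1;
            if (last > 0 && scratch.charAt(last) == 's') {
                scratch.setLength(last); // toy rule: strip a trailing 's'
            }
            return scratch.toString();
        }
    }

    /** Hypothetical stand-in for the filter: each instance owns its own stemmer. */
    static final class ToyFilter {
        final ToyStemmer stemmer = new ToyStemmer();

        String process(String token) {
            return stemmer.stem(token);
        }
    }

    /** Hypothetical stand-in for the analyzer: shared, but components are per thread. */
    static final class ToyAnalyzer {
        private final ThreadLocal<ToyFilter> perThread =
                ThreadLocal.withInitial(this::createComponents);

        ToyFilter createComponents() {
            return new ToyFilter(); // a fresh filter, hence a fresh stemmer
        }

        ToyFilter components() {
            return perThread.get();
        }

        String analyze(String token) {
            return components().process(token);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ToyAnalyzer shared = new ToyAnalyzer();
        Set<ToyStemmer> seen = ConcurrentHashMap.newKeySet();

        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                if (!"cat".equals(shared.analyze("Cats"))) {
                    throw new AssertionError("stemmer state was corrupted");
                }
            }
            seen.add(shared.components().stemmer);
        };

        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();

        // Each of the two threads used its own stemmer instance.
        System.out.println("distinct stemmer instances: " + seen.size());
    }
}
```

The sketch only holds under the same condition as the comment above: if two threads ever shared one ToyFilter, the shared scratch buffer would no longer be safe.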
was (Author: michcio):
David, as you suggested, I've changed the interfaces of MorfologikAnalyzer and MorfologikFilter to account for the changes in Morfologik 1.5.2, namely the support for multiple dictionaries.
The constructors of both classes now accept a PolishStemmer.DICTIONARY (instead of a languageCode String, as in the previous patch). MorfologikFilter instantiates its own PolishStemmer, so each invocation of MorfologikAnalyzer.createComponents (which instantiates a MorfologikFilter) gets an individual PolishStemmer instance.
This way, sharing a MorfologikAnalyzer between threads is safe (even though MorfologikFilter itself isn't thread-safe), provided each thread obtains its own TokenStreamComponents through ReusableAnalyzerBase.createComponents. Is this always the case? Looking at other filters, they don't look thread-safe either.
> explore morfologik integration
> ------------------------------
>
> Key: LUCENE-2341
> URL: https://issues.apache.org/jira/browse/LUCENE-2341
> Project: Lucene - Java
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: Robert Muir
> Assignee: Dawid Weiss
> Attachments: LUCENE-2341.diff, LUCENE-2341.diff, LUCENE-2341.diff, morfologik-fsa-1.5.2.jar, morfologik-polish-1.5.2.jar, morfologik-stemming-1.5.0.jar, morfologik-stemming-1.5.2.jar
>
>
> Dawid Weiss mentioned on LUCENE-2298 that there is another Polish stemmer available:
> http://sourceforge.net/projects/morfologik/
> This works differently than LUCENE-2298, and ideally would be another option for users.