Posted to dev@lucene.apache.org by "Michał Dybizbański (JIRA)" <ji...@apache.org> on 2011/06/27 22:19:48 UTC

[jira] [Issue Comment Edited] (LUCENE-2341) explore morfologik integration

    [ https://issues.apache.org/jira/browse/LUCENE-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055729#comment-13055729 ] 

Michał Dybizbański edited comment on LUCENE-2341 at 6/27/11 8:19 PM:
---------------------------------------------------------------------

Dawid, as you suggested, I've changed the interfaces of MorfologikAnalyzer and MorfologikFilter to account for the changes in Morfologik 1.5.2, namely the support for multiple dictionaries.
The constructors of both classes now accept a PolishStemmer.DICTIONARY (instead of a languageCode String, as in the previous patch). MorfologikFilter instantiates its own PolishStemmer, so each invocation of MorfologikAnalyzer.createComponents (which instantiates a MorfologikFilter) gets its own PolishStemmer instance.
This way, sharing a MorfologikAnalyzer between threads is safe (even though MorfologikFilter itself isn't thread-safe), provided each thread obtains its own TokenStreamComponents through ReusableAnalyzerBase.createComponents. (Is this always the case? Looking at other filters, they don't look thread-safe either.)
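
To make the arrangement concrete, here is a minimal sketch of the ownership pattern described above. It deliberately uses hypothetical stand-in classes (SketchStemmer, SketchFilter, SketchAnalyzer) and not the actual Lucene or Morfologik API, and it assumes that the analyzer caches one set of components per thread, which is exactly the open question raised above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-ins only: none of these classes are the real Lucene or Morfologik API.
final class PerThreadComponentsSketch {

    /** Stand-in for a stemmer with mutable internal state, hence not thread-safe. */
    static final class SketchStemmer {
        private final String dictionary;                           // dictionary selected at construction
        private final StringBuilder scratch = new StringBuilder(); // reused buffer: unsafe to share

        SketchStemmer(String dictionary) {
            this.dictionary = dictionary;
        }

        String dictionary() {
            return dictionary;
        }

        String stem(String word) {
            scratch.setLength(0);
            scratch.append(word.toLowerCase(Locale.ROOT)); // toy normalization, not real stemming
            return scratch.toString();
        }
    }

    /** Stand-in for the filter: every filter instance creates and owns its own stemmer. */
    static final class SketchFilter {
        private final SketchStemmer stemmer;

        SketchFilter(String dictionary) {
            this.stemmer = new SketchStemmer(dictionary);
        }

        SketchStemmer stemmer() {
            return stemmer;
        }

        List<String> apply(List<String> tokens) {
            List<String> out = new ArrayList<>();
            for (String token : tokens) {
                out.add(stemmer.stem(token));
            }
            return out;
        }
    }

    /** Stand-in for the analyzer: shared between threads, with components cached per thread (assumed). */
    static final class SketchAnalyzer {
        private final String dictionary;
        private final ThreadLocal<SketchFilter> perThread;

        SketchAnalyzer(String dictionary) {
            this.dictionary = dictionary;
            this.perThread = ThreadLocal.withInitial(this::createComponents);
        }

        // Each call builds a fresh filter, and therefore a fresh stemmer.
        SketchFilter createComponents() {
            return new SketchFilter(dictionary);
        }

        // Returns the calling thread's cached filter, creating it on first use.
        SketchFilter components() {
            return perThread.get();
        }
    }

    /** Fetches the components as seen from a freshly started thread. */
    static SketchFilter componentsOnNewThread(SketchAnalyzer analyzer) {
        SketchFilter[] holder = new SketchFilter[1];
        Thread worker = new Thread(() -> holder[0] = analyzer.components());
        worker.start();
        try {
            worker.join(); // join() makes the worker's write to holder visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holder[0];
    }

    public static void main(String[] args) {
        SketchAnalyzer shared = new SketchAnalyzer("hypothetical-dictionary");
        SketchFilter mine = shared.components();
        SketchFilter other = componentsOnNewThread(shared);
        System.out.println("same thread reuses components: " + (mine == shared.components()));
        System.out.println("other thread has its own stemmer: " + (mine.stemmer() != other.stemmer()));
    }
}
```

In this sketch the analyzer holds only immutable configuration (the dictionary), so sharing it is harmless; all mutable state lives in the stemmer, which is never reachable from more than one thread. If the components were instead handed from one thread to another, the guarantee would no longer hold.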



  
> explore morfologik integration
> ------------------------------
>
>                 Key: LUCENE-2341
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2341
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/analysis
>            Reporter: Robert Muir
>            Assignee: Dawid Weiss
>         Attachments: LUCENE-2341.diff, LUCENE-2341.diff, LUCENE-2341.diff, morfologik-fsa-1.5.2.jar, morfologik-polish-1.5.2.jar, morfologik-stemming-1.5.0.jar, morfologik-stemming-1.5.2.jar
>
>
> Dawid Weiss mentioned on LUCENE-2298 that there is another Polish stemmer available:
> http://sourceforge.net/projects/morfologik/
> This works differently than LUCENE-2298, and ideally would be another option for users.
