Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2013/12/03 10:16:36 UTC

[jira] [Resolved] (STANBOL-1229) Convert all OpenNLP Enhancement Engines to Configuration Factories

     [ https://issues.apache.org/jira/browse/STANBOL-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rupert Westenthaler resolved STANBOL-1229.
------------------------------------------

    Resolution: Fixed

Fixed with http://svn.apache.org/r1547321 for both trunk and the 0.12 branch.

> Convert all OpenNLP Enhancement Engines to Configuration Factories
> ------------------------------------------------------------------
>
>                 Key: STANBOL-1229
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1229
>             Project: Stanbol
>          Issue Type: Improvement
>          Components: Enhancement Engines
>    Affects Versions: 0.12.0
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>            Priority: Minor
>             Fix For: 0.12.0
>
>
> Currently the OpenNLP Sentence Detection and Tokenizer Enhancement Engines do not support OSGi Configuration Factories, so only a single instance of each can exist.
> This creates problems when one wants to configure multiple Enhancement Chains with different NLP frameworks.
> Here is an example:
> Chain1:
>  * OpenNLP for English, German and Spanish
> Chain2:
>  * Stanford NLP for English
>  * OpenNLP for German
>  * Freeling NLP for Spanish
> As OpenNLP supports all three languages mentioned, a user would want the following engine configurations:
> 1. OpenNLP engines for sentence detection, tokenization, POS tagging and chunking that cover all three languages
> 2. OpenNLP engines for sentence detection, tokenization, POS tagging and chunking that only process German language texts
> 3. RESTful NLP Analysis Engine calling StanfordNLP for English language texts
> 4. RESTful NLP Analysis Engine calling Freeling for Spanish language texts
> Chain1 would use the OpenNLP engines configured to process all languages, while Chain2 would use the engine configurations listed under points 2 to 4.
> However, as the OpenNLP Tokenizer and Sentence Detection engines do not support OSGi Configuration Factories, this is currently not possible: only a single instance of each of those two engines can be configured.
> Because of that, English and Spanish texts sent to Chain2 would be processed by two Sentence Detectors and two Tokenizers, which results in duplicate Sentence and Token annotations.
> Adding support for OSGi Configuration Factories to all OpenNLP EnhancementEngines will solve this issue. Existing configurations will not be affected, as all engines already use "ConfigurationPolicy.OPTIONAL" - meaning that a default instance with the default configuration is created automatically.
> This issue affects both trunk and the 0.12 release branch.
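
The duplicate-annotation problem described in the issue can be illustrated with a small, simplified model. This is only a sketch in plain Java, not Stanbol code: the class ChainModel, its nested Engine type, the engine names and the language codes are all hypothetical and exist only for this illustration. It counts how many tokenizing engines in Chain2 would process a text of a given language, first when only one OpenNLP instance (covering all three languages) can exist, and then when a second, German-only instance can be created.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ChainModel {

    /** Hypothetical stand-in for one configured engine instance (not a Stanbol type). */
    static final class Engine {
        final String name;
        final Set<String> languages;

        Engine(String name, String... languages) {
            this.name = name;
            this.languages = new HashSet<>(Arrays.asList(languages));
        }

        boolean processes(String language) {
            return languages.contains(language);
        }
    }

    /** Counts the engines in a chain that would tokenize a text in the given language. */
    static long tokenizersFor(List<Engine> chain, String language) {
        return chain.stream().filter(e -> e.processes(language)).count();
    }

    public static void main(String[] args) {
        // Assumed engine instances; names and language codes are illustrative only.
        Engine openNlpAll = new Engine("opennlp-all", "en", "de", "es");
        Engine openNlpDe = new Engine("opennlp-de", "de");
        Engine stanfordEn = new Engine("stanford-en", "en");
        Engine freelingEs = new Engine("freeling-es", "es");

        // Single instance only: Chain2 has to reuse the all-languages OpenNLP engine.
        List<Engine> chain2Single = Arrays.asList(stanfordEn, openNlpAll, freelingEs);
        // Multiple instances allowed: Chain2 can use a German-only OpenNLP engine.
        List<Engine> chain2Factory = Arrays.asList(stanfordEn, openNlpDe, freelingEs);

        for (String lang : Arrays.asList("en", "de", "es")) {
            System.out.println(lang
                    + ": single instance -> " + tokenizersFor(chain2Single, lang)
                    + ", separate instance -> " + tokenizersFor(chain2Factory, lang));
        }
    }
}
```

In this model a count above one corresponds to the duplicate Sentence and Token annotations mentioned above; with a separate German-only instance every language in Chain2 is handled by exactly one engine.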