Posted to issues@opennlp.apache.org by "Jörn Kottmann (JIRA)" <ji...@apache.org> on 2011/02/01 13:20:29 UTC

[jira] Commented: (OPENNLP-17) Add support for custom feature generator configuration embedded in the model package

    [ https://issues.apache.org/jira/browse/OPENNLP-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989176#comment-12989176 ] 

Jörn Kottmann commented on OPENNLP-17:
--------------------------------------

Previously we came up with a few ways to solve this issue, but could never agree on which way to go.

1. Dependency injection. The model contains a file that describes how the feature generators are constructed, and a standard dependency injection framework instantiates them and wires them together. Back then it was proposed to use Spring. The advantage is that the format is well known to our users. The disadvantage is the addition of an external dependency.

2. Embed a custom XML descriptor that describes how to put the feature generators together. The advantage is that we do not need to depend on an external dependency injection framework. The disadvantage is that we need to define, document and maintain a custom XML format.

3. Place a JavaScript file in the model that is capable of constructing the feature generators. The disadvantages are that security might be a problem and that it depends on Java 6. The advantage is that users can implement simple feature generators for research purposes in JavaScript.
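
As a rough illustration of option 2, the sketch below reads a small XML descriptor and builds a feature generator from it using only the JDK's XML parser. Everything named here is hypothetical: the `FeatureGenerator` interface is a stand-in and not the OpenNLP interface, and the element names `generators`, `token` and `prefix` are invented for the example. It only shows the shape of the idea, not a proposed format.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class DescriptorSketch {

    /** Hypothetical stand-in for a feature generator; NOT the OpenNLP interface. */
    interface FeatureGenerator {
        void createFeatures(List<String> features, String[] tokens, int index);
    }

    /** Emits the lower-cased token as a feature. */
    static class TokenGenerator implements FeatureGenerator {
        public void createFeatures(List<String> features, String[] tokens, int index) {
            features.add("w=" + tokens[index].toLowerCase(Locale.ROOT));
        }
    }

    /** Emits the first `length` characters of the token as a feature. */
    static class PrefixGenerator implements FeatureGenerator {
        private final int length;

        PrefixGenerator(int length) {
            this.length = length;
        }

        public void createFeatures(List<String> features, String[] tokens, int index) {
            String token = tokens[index];
            features.add("pre=" + token.substring(0, Math.min(length, token.length())));
        }
    }

    /** Runs all child generators in descriptor order. */
    static class AggregatedGenerator implements FeatureGenerator {
        private final List<FeatureGenerator> children;

        AggregatedGenerator(List<FeatureGenerator> children) {
            this.children = children;
        }

        public void createFeatures(List<String> features, String[] tokens, int index) {
            for (FeatureGenerator child : children) {
                child.createFeatures(features, tokens, index);
            }
        }
    }

    /** Maps one descriptor element (invented names) to a generator instance. */
    static FeatureGenerator build(Element element) {
        switch (element.getTagName()) {
            case "token":
                return new TokenGenerator();
            case "prefix":
                // Throws NumberFormatException if the attribute is missing or not a number.
                return new PrefixGenerator(Integer.parseInt(element.getAttribute("length")));
            case "generators": {
                List<FeatureGenerator> children = new ArrayList<>();
                NodeList nodes = element.getChildNodes();
                for (int i = 0; i < nodes.getLength(); i++) {
                    Node node = nodes.item(i);
                    if (node instanceof Element) {
                        children.add(build((Element) node));
                    }
                }
                return new AggregatedGenerator(children);
            }
            default:
                throw new IllegalArgumentException("Unknown element: " + element.getTagName());
        }
    }

    /** Parses the descriptor text and builds the generator tree from its root element. */
    static FeatureGenerator load(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        return build(root);
    }

    /** Convenience helper: features for one token position under the given descriptor. */
    static List<String> featuresFor(String xml, String[] tokens, int index) throws Exception {
        List<String> features = new ArrayList<>();
        load(xml).createFeatures(features, tokens, index);
        return features;
    }

    public static void main(String[] args) throws Exception {
        String descriptor = "<generators><token/><prefix length=\"2\"/></generators>";
        System.out.println(featuresFor(descriptor, new String[] {"Berlin", "is"}, 0));
    }
}
```

Because the descriptor is plain text, the same string could be stored in the model at training time and read back at tagging time, which is the point of the issue. The cost noted in option 2 is visible in `build`: every new generator type needs a new element name that has to be documented and maintained.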

> Add support for custom feature generator configuration embedded in the model package
> ------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-17
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-17
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Chunker, Name Finder, POS Tagger
>    Affects Versions: tools-1.5.0-sourceforge
>            Reporter: Jörn Kottmann
>
> Add support for custom feature generator configuration embedded in the model package.
> The configuration of the feature generators for the name finder component can be quite complex, and the configuration must
> always be done twice: once for training and once for tagging. Doing it twice at two different points in time makes
> the feature generation very error-prone. Small mistakes lead to a drop in detection performance, which might
> be difficult to notice.
> To solve this issue, add the configuration to the model; then it only needs to be specified during training and
> can be loaded from the model during tagging.
> Another advantage is that custom feature generation is difficult to use otherwise, because the integration
> code itself must deal with setting up the feature generators. In some cases the user does not even have control
> over the code, or does not want to change it, e.g. in the UIMA wrappers.
> The same logic should be used for the POS Tagger and Chunker.
> The issue was migrated from SourceForge:
> https://sourceforge.net/tracker/?func=detail&aid=1941380&group_id=3368&atid=353368
