Posted to issues@opennlp.apache.org by "Jörn Kottmann (JIRA)" <ji...@apache.org> on 2011/05/02 14:40:03 UTC

[jira] [Commented] (OPENNLP-17) Add support for custom feature generator configuration embedded in the model package

    [ https://issues.apache.org/jira/browse/OPENNLP-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027636#comment-13027636 ] 

Jörn Kottmann commented on OPENNLP-17:
--------------------------------------

I quickly investigated how security for solution 3 could be implemented. It seems that AccessController.doPrivileged could be used to execute the script code without any permissions, but only when the application programmer using OpenNLP has set a SecurityManager for the VM. Running the script code in a "sandbox" would prevent most attacks. Some attacks, for example denial-of-service attacks, would still be possible, but these are less dangerous.
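
A minimal sketch of that idea, assuming the script code can be wrapped in a PrivilegedAction. The class name SandboxSketch and the method runWithoutPermissions are made up for illustration and are not part of OpenNLP. The restriction is only enforced when a SecurityManager is installed; without one the action simply runs. Note also that the Security Manager APIs are deprecated for removal since Java 17.

```java
import java.security.AccessControlContext;
import java.security.AccessController;
import java.security.Permissions;
import java.security.PrivilegedAction;
import java.security.ProtectionDomain;

// Hypothetical helper, not an OpenNLP class.
@SuppressWarnings("removal")
final class SandboxSketch {

    // A context whose single protection domain carries an empty permission set.
    private static final AccessControlContext NO_PERMISSIONS =
        new AccessControlContext(new ProtectionDomain[] {
            new ProtectionDomain(null, new Permissions())
        });

    private SandboxSketch() {
    }

    // Runs the given action with the empty context. Permission checks made
    // inside the action are only enforced if a SecurityManager is installed.
    static <T> T runWithoutPermissions(PrivilegedAction<T> scriptCode) {
        return AccessController.doPrivileged(scriptCode, NO_PERMISSIONS);
    }

    public static void main(String[] args) {
        // Stand-in for embedded feature generator script code.
        String result = runWithoutPermissions(() -> "features computed");
        System.out.println(result);
    }
}
```

With a SecurityManager in place, an attempt to delete a file inside the action should fail with a security exception, because the empty context grants no FilePermission.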

In the worst case, something like this could easily happen:
1. User downloads some test model from the internet
2. Writes short sample code to test the model on some input data (does not install a SecurityManager)
3. Runs the test code
4. The script code does something malicious, e.g. deletes files

To work around the security problem described above, OpenNLP could install a SecurityManager if none is set.
Another issue is that the SecurityManager might slow down feature generation a bit, so the user should have an option to disable it, e.g. when they know they only run trusted models.
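
As a rough sketch of this workaround, assuming a boolean opt-out supplied by the user: the class SandboxGuard, the method installIfAbsent and the trustedModelsOnly flag are hypothetical names, not existing OpenNLP API. Recent Java versions may refuse to install a SecurityManager at runtime, so the sketch treats that case as "not installed".

```java
// Hypothetical helper, not an OpenNLP class.
@SuppressWarnings("removal")
final class SandboxGuard {

    private SandboxGuard() {
    }

    // Installs a default SecurityManager if none is set, unless the user
    // opted out. Returns true only if this call installed one.
    static boolean installIfAbsent(boolean trustedModelsOnly) {
        if (trustedModelsOnly) {
            return false; // user disabled the sandbox for trusted models
        }
        if (System.getSecurityManager() != null) {
            return false; // the application already set its own manager
        }
        try {
            System.setSecurityManager(new SecurityManager());
            return true;
        } catch (UnsupportedOperationException e) {
            // Newer JVMs may not allow a SecurityManager to be installed.
            return false;
        }
    }
}
```

Model loading code could call SandboxGuard.installIfAbsent(false) once before running any embedded script code, and pass true when the user has declared all models trusted.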

> Add support for custom feature generator configuration embedded in the model package
> ------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-17
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-17
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Chunker, Name Finder, POS Tagger
>    Affects Versions: tools-1.5.0-sourceforge
>            Reporter: Jörn Kottmann
>
> Add support for custom feature generator configuration embedded in the model package.
> The configuration of the feature generators for the name finder component can be quite complex, and the configuration must
> always be done twice: once for training and once for tagging. Doing it at two different points in time makes
> the feature generation very error-prone. Small mistakes lead to a drop in detection performance which might
> be difficult to notice.
> To solve this issue, add the configuration to the model; it then only needs to be specified during training and
> can be loaded from the model during tagging.
> Another advantage is that custom feature generation is difficult to use otherwise, because the integration
> code must itself deal with setting up the feature generators. In some cases the user does not even have control
> over the code, or does not want to change it, e.g. in the UIMA wrappers.
> The same logic should be used for the POS Tagger and Chunker.
> The issue was migrated from SourceForge:
> https://sourceforge.net/tracker/?func=detail&aid=1941380&group_id=3368&atid=353368

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira