Posted to legal-discuss@apache.org by "Joern Kottmann (JIRA)" <ji...@apache.org> on 2017/05/19 11:12:05 UTC

[jira] [Created] (LEGAL-309) Apache OpenNLP wants to release models trained on Universal Dependency under AL 2.0

Joern Kottmann created LEGAL-309:
------------------------------------

             Summary: Apache OpenNLP wants to release models trained on Universal Dependency under AL 2.0
                 Key: LEGAL-309
                 URL: https://issues.apache.org/jira/browse/LEGAL-309
             Project: Legal Discuss
          Issue Type: Question
            Reporter: Joern Kottmann


The OpenNLP project develops statistical natural language processing software, which needs to be trained in order to produce a model that can perform one of our supported tasks, such as part-of-speech tagging or lemmatization.

We would like to know whether we can train models on data included in Universal Dependencies (UD), whose data files are themselves under various licenses, and then release the trained models under AL 2.0.

At [1] you can see a list of the data files and their licenses.

Here is a list of the licenses:
CC BY 4.0
CC BY-SA 4.0
CC BY-NC-SA 2.5, 3.0, 4.0 and without version
CC BY-NC-SA US 3.0
GPL
LGPLLR

The models we would like to train on that data are:
- Part-of-speech models (contain bigrams and a set of individual words from the training text)
- Lemmatizer models (contain a set of individual words from the training text)
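
To make concrete what kind of training-text fragments are meant, here is a minimal sketch in Python. It is not OpenNLP code; the function name, the sample sentences, and the assumption that "bigrams" means pairs of adjacent words are all hypothetical and only for illustration.

```python
from collections import Counter


def extract_words_and_bigrams(sentences):
    """Collect the set of individual words and the counts of adjacent
    word pairs (bigrams) from a list of tokenized sentences.

    Hypothetical helper for illustration; not part of OpenNLP.
    """
    vocabulary = set()
    bigram_counts = Counter()
    for tokens in sentences:
        # Every distinct word of the training text ends up in the set.
        vocabulary.update(tokens)
        # Pair each token with its successor to form word bigrams.
        bigram_counts.update(zip(tokens, tokens[1:]))
    return vocabulary, bigram_counts


# Made-up sample data standing in for a training corpus.
sentences = [["the", "dog", "barks"], ["the", "dog", "sleeps"]]
vocabulary, bigram_counts = extract_words_and_bigrams(sentences)
print(sorted(vocabulary))
print(bigram_counts.most_common(1))
```

Nothing longer than a two-word sequence is retained in this sketch, and the original sentences cannot be read back from the word set alone.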

As far as we understand, individual words or very short phrases extracted from a corpus are not protected by the corpus's copyright. As far as we know, the above licenses do not forbid deriving statistics from the licensed content.

[1] http://universaldependencies.org/

 




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: legal-discuss-unsubscribe@apache.org
For additional commands, e-mail: legal-discuss-help@apache.org