Posted to users@opennlp.apache.org by "J. Fiala" <ma...@fwd.at> on 2018/09/26 14:06:17 UTC

Model NameFinder de

Hi there,

I saw that there is no Name Finder model for German.

Would you be interested in having one based on the TIGER corpus, or is
someone else already working on that?

I could not find an issue for adding Name Finder models for other
languages. Should I create a new one?

Thanks & Best regards,
Johannes



Re: Model NameFinder de

Posted by "J. Fiala" <ma...@fwd.at>.
Hi there,

I added a model for NameFinder (de) based on the TIGER treebank 2.2 and
attached it to the issue.

For details see https://issues.apache.org/jira/browse/OPENNLP-1223.

I first extracted the 6,271 sentences that mention names and trained on
that filtered data. Or is it better to use the complete training data,
including the sentences without names?
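
As a rough illustration of the filtering step described above, the
sketch below keeps only sentences that contain at least one name
annotation. It assumes the data has already been converted to the
one-sentence-per-line OpenNLP Name Finder training format, where names
are wrapped in <START:type> ... <END> markers. The class and method
names are hypothetical and are not taken from the attached model or the
JIRA issue.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: filters sentences in OpenNLP name-finder training
// format (assumed input), keeping only those that mention a name.
final class NameSentenceFilter {

    // A sentence "mentions a name" if it contains an opening <START...> marker.
    static boolean mentionsName(String sentence) {
        return sentence.contains("<START");
    }

    // Returns the subset of sentences with at least one annotated name.
    static List<String> onlyWithNames(List<String> sentences) {
        return sentences.stream()
                .filter(NameSentenceFilter::mentionsName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Toy sentences for illustration only, not taken from the TIGER corpus.
        List<String> all = List.of(
                "<START:person> Angela Merkel <END> sprach in Berlin .",
                "Es regnete den ganzen Tag .");
        onlyWithNames(all).forEach(System.out::println);
    }
}
```

Training on the complete data would simply skip this filter and pass
every sentence to the trainer.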

Best regards,

Johannes


On 28.09.2018 at 09:28, Joern Kottmann wrote:
> Hello,
>
> we can only distribute artifacts at Apache which can be licensed under
> the AL 2.0.
>
> I am not sure what the situation with the TIGER corpus is, but it
> might have a clause in its license which would restrict this.
>
> Anyway, +1 to release a model trained on the TIGER corpus, and to add
> support to train on it.
>
> Jörn


Re: Model NameFinder de

Posted by "J. Fiala" <ma...@fwd.at>.
Hi there,

Thank you for the (late) response!

I'm not sure, but isn't it up to the actual user of the corpus to
obtain a license if they want to use the corpus itself, rather than
models derived from it?

http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/license/index.html

======

License
1. Research and evaluation purposes

For research and evaluation purposes, the TIGERCorpus can be downloaded 
for free. However, we ask you to acknowledge the TIGERCorpus license 
agreement for non-commercial use. The "Accept license terms" button at 
the bottom of the license will then take you to the download page.

2. Commercial purposes

If you are interested in a commercial license of the TIGERCorpus, please 
contact the secretary of Prof. Hans Uszkoreit's chair at Saarland 
University at sek-hu AT coli DOT uni-saarland DOT de.

======

If you are in doubt, I can send an email and ask what Saarland
University thinks about it.

Best regards,
Johannes

On 11.12.2018 at 11:59, Richard Eckart de Castilho wrote:
> On 10. Dec 2018, at 15:30, Joern Kottmann <ko...@gmail.com> wrote:
>> Sorry for the late reply here. We can only release artifacts under AL 2.0.
>> Yes, we would need to check this on a case-by-case basis. Which
>> license is the TIGER corpus distributed under?
> http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/license/htmlicense.html
>
> Cheers,
>
> -- Richard

Re: Model NameFinder de

Posted by Richard Eckart de Castilho <re...@apache.org>.
On 10. Dec 2018, at 15:30, Joern Kottmann <ko...@gmail.com> wrote:
> 
> Sorry for the late reply here. We can only release artifacts under AL 2.0.
> Yes, we would need to check this on a case-by-case basis. Which
> license is the TIGER corpus distributed under?

http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/license/htmlicense.html

Cheers,

-- Richard

Re: Model NameFinder de

Posted by Joern Kottmann <ko...@gmail.com>.
Hello,

Sorry for the late reply here. We can only release artifacts under AL 2.0.
Yes, we would need to check this on a case-by-case basis. Which
license is the TIGER corpus distributed under?

Jörn

On Fri, Sep 28, 2018 at 10:08 AM J. Fiala <ma...@fwd.at> wrote:
>
> Hi there,
>
> Thanks for your response.
>
> IMHO it should be no problem to release models based on TIGER, since
> the provided models for German already seem to be based on TIGER
> (http://opennlp.sourceforge.net/models-1.5/).
> Or do I have to obtain an agreement from the Universität Stuttgart to
> be able to release a trained model for the Name Finder?
>
> Best regards,
> Johannes

Re: Model NameFinder de

Posted by "J. Fiala" <ma...@fwd.at>.
Hi there,

Thanks for your response.

IMHO it should be no problem to release models based on TIGER, since
the provided models for German already seem to be based on TIGER
(http://opennlp.sourceforge.net/models-1.5/).
Or do I have to obtain an agreement from the Universität Stuttgart to
be able to release a trained model for the Name Finder?

Best regards,
Johannes


On 28.09.2018 at 09:28, Joern Kottmann wrote:
> Hello,
>
> we can only distribute artifacts at Apache which can be licensed under
> the AL 2.0.
>
> I am not sure what the situation with the TIGER corpus is, but it
> might have a clause in its license which would restrict this.
>
> Anyway, +1 to release a model trained on the TIGER corpus, and to add
> support to train on it.
>
> Jörn


Re: Model NameFinder de

Posted by "J. Fiala" <ma...@fwd.at>.
Dear Daniel,

I updated https://issues.apache.org/jira/browse/OPENNLP-1223 and added
the updated model, now trained on all of the TIGER data (50,472 sentences).
Evaluation is done only on the sentences containing names (6,271 sentences).
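
For readers unfamiliar with how name finder output is usually scored,
the following is a minimal sketch of exact-match precision, recall and
F1 over name spans. It is not the evaluation code used for the model in
the issue. The class name and the "start-end:type" string encoding of a
span are assumptions made for illustration, and the spans in main are
toy values, not results from this thread.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical scorer: compares gold and predicted name spans by exact match.
// Each span is encoded as a string such as "0-2:person" (an assumed encoding).
final class SpanScores {

    // Number of predicted spans that also appear in the gold set.
    private static int truePositives(Set<String> gold, Set<String> predicted) {
        Set<String> hits = new HashSet<>(predicted);
        hits.retainAll(gold);
        return hits.size();
    }

    // Fraction of predicted spans that are correct (0.0 if nothing was predicted).
    static double precision(Set<String> gold, Set<String> predicted) {
        return predicted.isEmpty() ? 0.0
                : (double) truePositives(gold, predicted) / predicted.size();
    }

    // Fraction of gold spans that were found (0.0 if there are no gold spans).
    static double recall(Set<String> gold, Set<String> predicted) {
        return gold.isEmpty() ? 0.0
                : (double) truePositives(gold, predicted) / gold.size();
    }

    // Harmonic mean of precision and recall (0.0 if both are zero).
    static double f1(Set<String> gold, Set<String> predicted) {
        double p = precision(gold, predicted);
        double r = recall(gold, predicted);
        return (p + r) == 0.0 ? 0.0 : 2 * p * r / (p + r);
    }

    public static void main(String[] args) {
        // Toy spans for illustration only.
        Set<String> gold = Set.of("0-2:person", "5-6:person", "9-10:person");
        Set<String> predicted = Set.of("0-2:person", "5-6:person", "7-8:person");
        System.out.printf("P=%.3f R=%.3f F1=%.3f%n",
                precision(gold, predicted), recall(gold, predicted), f1(gold, predicted));
    }
}
```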

For restrictions see "Further improvements" in the issue.

Best regards,
Johannes


On 09.10.2018 at 14:48, J. Fiala wrote:
> Dear Daniel,
>
> Sure, so all data = training basis, and only data with persons =
> evaluation basis.
>
> However, it seems it is not possible to supply the evaluation data,
> only the model. But I can supply the evaluation results as a basis
> for QA.
> We can also do an evaluation run on other treebanks such as the
> Hamburg Dependency Treebank (and spot some errors there).
>
> Are there already any utility routines for extracting the data from
> TIGER etc., or should I supply some in Java (besides using the
> Python-based NLTK)?
>
> I didn't see any TIGER-based NLTK / Java handling routines in the docs.
>
> Best regards,
> Johannes


Re: Model NameFinder de

Posted by "J. Fiala" <ma...@fwd.at>.
Dear Daniel,

Sure, so all data = training basis, and only data with persons =
evaluation basis.

However, it seems it is not possible to supply the evaluation data, only
the model. But I can supply the evaluation results as a basis for QA.
We can also do an evaluation run on other treebanks such as the Hamburg
Dependency Treebank (and spot some errors there).

Are there already any utility routines for extracting the data from
TIGER etc., or should I supply some in Java (besides using the
Python-based NLTK)?

I didn't see any TIGER-based NLTK / Java handling routines in the docs.
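
To make the kind of utility routine asked about here more concrete, the
sketch below turns one already-extracted sentence into a line of OpenNLP
Name Finder training data. It does not parse the TIGER files themselves.
It assumes that some earlier step has produced a token list plus one
label per token, with "O" for tokens outside any name. The class name,
method name and label scheme are hypothetical.

```java
import java.util.List;

// Hypothetical converter: tokens plus per-token labels (assumed input)
// -> one line in OpenNLP name-finder training format.
final class NameSampleWriter {

    static String toTrainingLine(List<String> tokens, List<String> labels) {
        if (tokens.size() != labels.size()) {
            throw new IllegalArgumentException("tokens and labels differ in length");
        }
        StringBuilder line = new StringBuilder();
        String open = null; // type of the span currently open, or null
        for (int i = 0; i < tokens.size(); i++) {
            String label = labels.get(i);
            boolean isName = !"O".equals(label);
            // Close the open span when the name ends or its type changes.
            if (open != null && (!isName || !open.equals(label))) {
                line.append("<END> ");
                open = null;
            }
            // Open a new span at the first token of a name.
            if (isName && open == null) {
                line.append("<START:").append(label).append("> ");
                open = label;
            }
            line.append(tokens.get(i)).append(' ');
        }
        if (open != null) {
            line.append("<END> ");
        }
        // Limitation: adjacent names of the same type are merged into one span.
        return line.toString().trim();
    }

    public static void main(String[] args) {
        // Toy sentence for illustration only, not taken from the TIGER corpus.
        System.out.println(toTrainingLine(
                List.of("Angela", "Merkel", "besuchte", "Wien", "."),
                List.of("person", "person", "O", "location", "O")));
    }
}
```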

Best regards,
Johannes


On 09.10.2018 at 14:38, Dan Russ wrote:
> Hi Jörn,
>
> Is it possible to train on all of the TIGER data but test on Universal
> Dependencies?
>
> Daniel


Re: Model NameFinder de

Posted by Dan Russ <da...@gmail.com>.
Hi Jörn,

Is it possible to train on all of the TIGER data but test on Universal
Dependencies?

Daniel

On Fri, Sep 28, 2018, 3:28 AM Joern Kottmann <ko...@gmail.com> wrote:
> Hello,
>
> we can only distribute artifacts at Apache which can be licensed under
> the AL 2.0.
>
> I am not sure what the situation with the TIGER corpus is, but it
> might have a clause in its license which would restrict this.
>
> Anyway, +1 to release a model trained on the TIGER corpus, and to add
> support to train on it.
>
> Jörn

Re: Model NameFinder de

Posted by Joern Kottmann <ko...@gmail.com>.
Hello,

we can only distribute artifacts at Apache which can be licensed under
the AL 2.0.

I am not sure what the situation with the TIGER corpus is, but it
might have a clause in its license which would restrict this.

Anyway, +1 to release a model trained on the TIGER corpus, and to add
support to train on it.

Jörn
On Wed, Sep 26, 2018 at 4:06 PM J. Fiala <ma...@fwd.at> wrote:
>
> Hi there,
>
> I saw that there is no Name Finder model for German.
>
> Would you be interested in having one based on the TIGER corpus, or is
> someone else already working on that?
>
> I could not find an issue for adding Name Finder models for other
> languages. Should I create a new one?
>
> Thanks & Best regards,
> Johannes
>
>