Posted to users@opennlp.apache.org by "J. Fiala" <ma...@fwd.at> on 2018/09/26 14:06:17 UTC
Model NameFinder de
Hi there,
I saw that there is no Name Finder model for German.
Would you be interested in having one based on the TIGER corpus, or is
someone else already working on that?
I could not find an issue for adding Name Finder models for other
languages. Should I create a new one?
Thanks & Best regards,
Johannes
Re: Model NameFinder de
Posted by "J. Fiala" <ma...@fwd.at>.
Hi there,
I added a model for NameFinder (de) based on the TIGER treebank 2.2 and
attached it to the issue.
For details see https://issues.apache.org/jira/browse/OPENNLP-1223.
I first extracted 6,271 sentences mentioning names and trained on
that (filtered) data. Or is it better to use the complete training data
(including the sentences without names)?
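The filtering step described above can be sketched roughly as follows. This is only an illustration in Python, assuming the training data is already in the OpenNLP name finder format (one tokenized sentence per line, names wrapped in <START:type> ... <END>). The function name split_by_names and the sample sentences are made up for the example; they are not taken from the attached model or from TIGER.

```python
import re

# Matches an opening name tag such as <START:person> or a bare <START>.
START_TAG = re.compile(r"<START(:[^>\s]+)?>")


def split_by_names(lines):
    """Split annotated sentences into (with_names, without_names)."""
    with_names, without_names = [], []
    for line in lines:
        sentence = line.strip()
        if not sentence:
            continue  # skip blank lines
        if START_TAG.search(sentence):
            with_names.append(sentence)
        else:
            without_names.append(sentence)
    return with_names, without_names


# Invented sample data, for illustration only.
sample = [
    "<START:person> Angela Merkel <END> besucht <START:location> Wien <END> .",
    "Das Wetter bleibt wechselhaft .",
    "",
    "Am Montag spricht <START:person> Johannes <END> .",
]
named, unnamed = split_by_names(sample)
print(len(named), "sentences with names,", len(unnamed), "without")
```

Keeping both lists around makes it easy to compare a model trained on the filtered subset with one trained on the complete data.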
Best regards,
Johannes
On 28.09.2018 at 09:28, Joern Kottmann wrote:
> Hello,
>
> we can only distribute artifacts at Apache which can be licensed under
> the AL 2.0.
>
> I am not sure what the situation with the tiger corpus is, but it
> might have a clause in its license which would restrict this.
>
> Anyway, +1 to release a model trained on the tiger corpus, and to add
> support to train on it.
>
> Jörn
Re: Model NameFinder de
Posted by "J. Fiala" <ma...@fwd.at>.
Hi there,
Thank you for the (late) response!
I'm not sure, but isn't it up to the actual user of the corpus to
ask for a license if he/she wants to use the corpus itself, rather than
any models derived from it?
http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/license/index.html
======
License
1. Research and evaluation purposes
For research and evaluation purposes, the TIGERCorpus can be downloaded
for free. However, we ask you to acknowledge the TIGERCorpus license
agreement for non-commercial use. The "Accept license terms" button at
the bottom of the license will then take you to the download page.
2. Commercial purposes
If you are interested in a commercial license of the TIGERCorpus, please
contact the secretary of Prof. Hans Uszkoreit's chair at Saarland
University at sek-hu AT coli DOT uni-saarland DOT de.
======
If you are in doubt, I can send an email and ask what Saarland
University thinks about it.
Best regards,
Johannes
On 11.12.2018 at 11:59, Richard Eckart de Castilho wrote:
> On 10. Dec 2018, at 15:30, Joern Kottmann <ko...@gmail.com> wrote:
>> Sorry for the late reply here. We can only release artifacts under AL 2.0.
>> Yes, we would need to check this on a case-by-case basis. What is the
>> license the tiger corpus is distributed under?
> http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/license/htmlicense.html
>
> Cheers,
>
> -- Richard
Re: Model NameFinder de
Posted by Richard Eckart de Castilho <re...@apache.org>.
On 10. Dec 2018, at 15:30, Joern Kottmann <ko...@gmail.com> wrote:
>
> Sorry for the late reply here. We can only release artifacts under AL 2.0.
> Yes, we would need to check this on a case-by-case basis. What is the
> license the tiger corpus is distributed under?
http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/license/htmlicense.html
Cheers,
-- Richard
Re: Model NameFinder de
Posted by Joern Kottmann <ko...@gmail.com>.
Hello,
Sorry for the late reply here. We can only release artifacts under AL 2.0.
Yes, we would need to check this on a case-by-case basis. What is the
license the tiger corpus is distributed under?
Jörn
On Fri, Sep 28, 2018 at 10:08 AM J. Fiala <ma...@fwd.at> wrote:
>
> Hi there,
>
> Thx for your response.
>
> IMHO it should be no problem to release models based on TIGER, as it
> seems the provided models for German are already based on TIGER
> (http://opennlp.sourceforge.net/models-1.5/).
> Or do I have to obtain an agreement from the Universität Stuttgart to
> be able to release a trained model for the name finder?
>
> Best regards,
> Johannes
Re: Model NameFinder de
Posted by "J. Fiala" <ma...@fwd.at>.
Hi there,
Thx for your response.
IMHO it should be no problem to release models based on TIGER, as it
seems the provided models for German are already based on TIGER
(http://opennlp.sourceforge.net/models-1.5/).
Or do I have to obtain an agreement from the Universität Stuttgart to
be able to release a trained model for the name finder?
Best regards,
Johannes
On 28.09.2018 at 09:28, Joern Kottmann wrote:
> Hello,
>
> we can only distribute artifacts at Apache which can be licensed under
> the AL 2.0.
>
> I am not sure what the situation with the tiger corpus is, but it
> might have a clause in its license which would restrict this.
>
> Anyway, +1 to release a model trained on the tiger corpus, and to add
> support to train on it.
>
> Jörn
Re: Model NameFinder de
Posted by "J. Fiala" <ma...@fwd.at>.
Dear Daniel,
I updated https://issues.apache.org/jira/browse/OPENNLP-1223 and added
the updated model, now trained on all of the TIGER data (50,472 sentences).
Evaluation is done only on sentences containing names (6,271 sentences).
For restrictions see "Further improvements" in the issue.
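To make this setup concrete, here is a rough sketch of span-level scoring restricted to sentences that contain at least one gold name. It is not OpenNLP's own evaluator. The helper evaluate_named_only and the (start, end, type) span tuples are assumptions for illustration, and the toy data is invented.

```python
def evaluate_named_only(gold, predicted):
    """Return (precision, recall, f1) over sentences whose gold set is non-empty.

    gold and predicted are parallel lists; each element is a set of
    (start_token, end_token, type) spans for one sentence.
    """
    true_pos = n_pred = n_gold = 0
    for gold_spans, pred_spans in zip(gold, predicted):
        if not gold_spans:
            continue  # sentence has no names: excluded from the evaluation
        true_pos += len(gold_spans & pred_spans)  # exact span and type match
        n_pred += len(pred_spans)
        n_gold += len(gold_spans)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented toy data: three sentences, the second one has no gold names.
gold = [{(0, 2, "person")}, set(), {(3, 4, "person"), (6, 7, "location")}]
pred = [{(0, 2, "person")}, {(1, 2, "person")}, {(3, 4, "person")}]
precision, recall, f1 = evaluate_named_only(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Note that the false positive in the second toy sentence is ignored here, so precision measured this way can look better than it would on the full corpus.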
Best regards,
Johannes
On 09.10.2018 at 14:48, J. Fiala wrote:
> Dear Daniel,
>
> Sure, so all data = training basis, only data with persons =
> evaluation basis.
>
> However, it seems it is not possible to supply the evaluation data,
> only the model. But I can supply the evaluation results as a QA basis.
> We can also do an evaluation run on other treebanks like the Hamburg
> Dependency Treebank (and spot some errors there).
>
> Are there already any utility routines for extracting the data from
> TIGER etc., or should I supply some in Java (besides using Python-based
> NLTK)?
>
> I didn't see any tiger-based nltk / Java handling routines in the docs.
>
> Best regards,
> Johannes
Re: Model NameFinder de
Posted by "J. Fiala" <ma...@fwd.at>.
Dear Daniel,
Sure, so all data = training basis, only data with persons = evaluation
basis.
However, it seems it is not possible to supply the evaluation data, only
the model. But I can supply the evaluation results as a QA basis.
We can also do an evaluation run on other treebanks like the Hamburg
Dependency Treebank (and spot some errors there).
Are there already any utility routines for extracting the data from
TIGER etc., or should I supply some in Java (besides using Python-based
NLTK)?
I didn't see any tiger-based nltk / Java handling routines in the docs.
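As a starting point for such a routine, a conversion could look roughly like the sketch below. It assumes the TIGER sentences have already been read into lists of (word, POS tag) pairs, treats every run of tokens tagged NE (the STTS proper-noun tag) as one name, and writes the OpenNLP name finder training format. The function to_opennlp_line and the generic type label "name" are hypothetical. Reading the TIGER files themselves and assigning real entity types (person, location, ...) are left out.

```python
def to_opennlp_line(tagged_tokens, entity_type="name"):
    """Wrap each run of NE-tagged tokens in <START:type> ... <END>."""
    out = []
    inside = False  # True while we are inside an open name span
    for word, pos in tagged_tokens:
        if pos == "NE" and not inside:
            out.append(f"<START:{entity_type}>")
            inside = True
        elif pos != "NE" and inside:
            out.append("<END>")
            inside = False
        out.append(word)
    if inside:
        out.append("<END>")  # close a name that ends the sentence
    return " ".join(out)


# Invented example sentence with STTS-style tags.
sentence = [("Angela", "NE"), ("Merkel", "NE"), ("besucht", "VVFIN"),
            ("Wien", "NE"), (".", "$.")]
print(to_opennlp_line(sentence))
```

One known limitation of this simplification: two different names that are directly adjacent would be merged into a single span, so a real converter would need the treebank's own structure to separate them.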
Best regards,
Johannes
On 09.10.2018 at 14:38, Dan Russ wrote:
> Hi Jörn,
> Is it possible to train on all of the TIGER data, but test on Universal
> Dependencies?
>
> Daniel
Re: Model NameFinder de
Posted by Dan Russ <da...@gmail.com>.
Hi Jörn,
Is it possible to train on all of the TIGER data, but test on Universal
Dependencies?
Daniel
On Fri, Sep 28, 2018, 3:28 AM Joern Kottmann <ko...@gmail.com> wrote:
>
>> Hello,
>>
>> we can only distribute artifacts at Apache which can be licensed under
>> the AL 2.0.
>>
>> I am not sure what the situation with the tiger corpus is, but it
>> might have a clause in its license which would restrict this.
>>
>> Anyway, +1 to release a model trained on the tiger corpus, and to add
>> support to train on it.
>>
>> Jörn
Re: Model NameFinder de
Posted by Joern Kottmann <ko...@gmail.com>.
Hello,
we can only distribute artifacts at Apache which can be licensed under
the AL 2.0.
I am not sure what the situation with the tiger corpus is, but it
might have a clause in its license which would restrict this.
Anyway, +1 to release a model trained on the tiger corpus, and to add
support to train on it.
Jörn
On Wed, Sep 26, 2018 at 4:06 PM J. Fiala <ma...@fwd.at> wrote:
>
> Hi there,
>
> I saw that there is no Name Finder model for German.
>
> Would you be interested in having one based on the TIGER corpus, or is
> someone else already working on that?
>
> I could not find an issue for adding Name Finder models for other
> languages. Should I create a new one?
>
> Thanks & Best regards,
> Johannes
>
>