Posted to dev@opennlp.apache.org by Jörn Kottmann <ko...@gmail.com> on 2011/05/17 10:25:27 UTC
Fixes that decrease existing model performance
Hi all,
I was wondering if we can do bug fixes which slightly decrease
the performance of existing models?
In this case I am speaking about OPENNLP-172, which fixes the handling
of lower case sequences in the token class feature. Currently a lower case
sequence is detected only when it consists of the letters a to z, but other
languages have additional letters, such as the German umlauts.
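To illustrate the kind of problem described above, here is a minimal sketch in Java. It assumes the old check behaves like an ASCII-only pattern and contrasts it with a Unicode-aware one. The class and method names (LowercaseCheck, isLowerAscii, isLowerUnicode) are hypothetical and are not taken from the OpenNLP code base or from the actual OPENNLP-172 patch.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only, not the OpenNLP implementation.
final class LowercaseCheck {

    // Assumed shape of the old behavior: only the ASCII letters a to z count.
    private static final Pattern ASCII_LOWER = Pattern.compile("^[a-z]+$");

    // Unicode-aware alternative: \p{Ll} matches any lower case letter.
    private static final Pattern UNICODE_LOWER = Pattern.compile("^\\p{Ll}+$");

    static boolean isLowerAscii(String token) {
        return ASCII_LOWER.matcher(token).matches();
    }

    static boolean isLowerUnicode(String token) {
        return UNICODE_LOWER.matcher(token).matches();
    }

    public static void main(String[] args) {
        // "h\u00e4user" is "haeuser" written with a-umlaut,
        // "se\u00f1or" is "senor" written with n-tilde.
        String[] tokens = {"haus", "h\u00e4user", "se\u00f1or", "Haus"};
        for (String token : tokens) {
            System.out.println(token
                    + " ascii=" + isLowerAscii(token)
                    + " unicode=" + isLowerUnicode(token));
        }
    }
}
```

With the ASCII-only pattern, tokens containing an umlaut or an n-tilde are not recognized as lower case, so they would fall into a different token class than plain a to z tokens. A model trained with one behavior sees different features under the other, which would be consistent with the recall drop on the existing model and the gain after retraining.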
This fix will decrease the recall of the existing Spanish person NER
model by 2%.
Should we apply it anyway for the next release?
After retraining the recall goes up by 6%.
Jörn
Re: Fixes that decrease existing model performance
Posted by Jason Baldridge <ja...@gmail.com>.
+1
On Tue, May 17, 2011 at 6:49 AM, Olivier Grisel <ol...@ensta.org> wrote:
> 2011/5/17 Jörn Kottmann <ko...@gmail.com>:
> >
> >>> After retraining the recall goes up by 6%.
> >>
> >> I am +1 for fixing bugs and providing retrained models for the next
> >> release.
> >
> > I guess it will also improve your French models after re-training.
>
> Good to know, thanks.
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
--
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge
Re: Fixes that decrease existing model performance
Posted by Olivier Grisel <ol...@ensta.org>.
2011/5/17 Jörn Kottmann <ko...@gmail.com>:
>
>>> After retraining the recall goes up by 6%.
>>
>> I am +1 for fixing bugs and providing retrained models for the next
>> release.
>
> I guess it will also improve your French models after re-training.
Good to know, thanks.
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Re: Fixes that decrease existing model performance
Posted by Jörn Kottmann <ko...@gmail.com>.
On 5/17/11 12:29 PM, Olivier Grisel wrote:
> 2011/5/17 Jörn Kottmann <ko...@gmail.com>:
>> Hi all,
>>
>> I was wondering if we can do bug fixes which slightly decrease
>> the performance of existing models?
>>
>> In this case I am speaking about OPENNLP-172, which fixes the handling
>> of lower case sequences in the token class feature. Currently a lower case
>> sequence is detected only when it consists of the letters a to z, but other
>> languages have additional letters, such as the German umlauts.
>>
>> This fix will decrease the recall of the existing Spanish person NER model
>> by 2%.
>> Should we apply it anyway for the next release?
>>
>> After retraining the recall goes up by 6%.
> I am +1 for fixing bugs and providing retrained models for the next release.
I guess it will also improve your French models after re-training.
Jörn
Re: Fixes that decrease existing model performance
Posted by Olivier Grisel <ol...@ensta.org>.
2011/5/17 Jörn Kottmann <ko...@gmail.com>:
> Hi all,
>
> I was wondering if we can do bug fixes which slightly decrease
> the performance of existing models?
>
> In this case I am speaking about OPENNLP-172, which fixes the handling
> of lower case sequences in the token class feature. Currently a lower case
> sequence is detected only when it consists of the letters a to z, but other
> languages have additional letters, such as the German umlauts.
>
> This fix will decrease the recall of the existing Spanish person NER model
> by 2%.
> Should we apply it anyway for the next release?
>
> After retraining the recall goes up by 6%.
I am +1 for fixing bugs and providing retrained models for the next release.
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel