Posted to dev@stanbol.apache.org by Sonia Gomez <go...@hotmail.com> on 2012/01/20 10:33:55 UTC
setting extraction sensitivity
Hello
I would like to adjust the sensitivity of entity extraction. My text contains the French words ".....avec les chantiers .....", and Stanbol extracts the entity "Person: Les Paul" from it.
Can I add a stop word or adjust the extraction sensitivity?
Thanks for your help
Re: setting extraction sensitivity
Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Sonia
Since revision 1228163, the NER engine uses the language detected by the langid engine to check whether a NER model for that language is available. If no model is available, it does not process the text.
Have a look at https://issues.apache.org/jira/browse/STANBOL-102 for details.
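The language check described above can be pictured roughly as follows. This is only a sketch in Python, not the engine's actual code: the function names, the set of installed models, and the placeholder tagger are all made up for illustration.

```python
# Hypothetical sketch, NOT Stanbol's actual code: every name here is illustrative.

# Assumption: only an English NER model is installed.
AVAILABLE_NER_MODELS = {"en"}


def run_ner(text, language):
    # Placeholder for the real model call; it simply flags capitalized tokens.
    return [tok for tok in text.split() if tok[:1].isupper()]


def extract_entities(text, detected_language, models=AVAILABLE_NER_MODELS):
    """Return entity candidates, or an empty list when no model fits the language."""
    if detected_language not in models:
        # No NER model for this language: skip the text instead of
        # applying the English model to it.
        return []
    return run_ner(text, detected_language)
```

With a check like this, French text such as "avec les chantiers" is skipped when only an English model is present, so no spurious "Les Paul" style match can be produced from it.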
best
Rupert
On 20.01.2012, at 11:34, Olivier Grisel wrote:
> 2012/1/20 Sonia Gomez <go...@hotmail.com>:
>>
>> Hello
>> I would like to adjust the sensitivity of entity extraction. My text contains the French words ".....avec les chantiers .....", and Stanbol extracts the entity "Person: Les Paul" from it.
>> Can I add a stop word or adjust the extraction sensitivity?
>
> The default NER engine in Stanbol
> (NamedEntityExtractionEnhancementEngine) does not work correctly on
> non-English content. You should probably disable that engine on French
> content.
>
> Building a statistical OpenNLP model for French is not an easy task
> (although it is possible and deserves some investment of time).
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
Re: setting extraction sensitivity
Posted by Olivier Grisel <ol...@ensta.org>.
2012/1/20 Sonia Gomez <go...@hotmail.com>:
>
> Hello
> I would like to adjust the sensitivity of entity extraction. My text contains the French words ".....avec les chantiers .....", and Stanbol extracts the entity "Person: Les Paul" from it.
> Can I add a stop word or adjust the extraction sensitivity?
The default NER engine in Stanbol
(NamedEntityExtractionEnhancementEngine) does not work correctly on
non-English content. You should probably disable that engine on French
content.
Building a statistical OpenNLP model for French is not an easy task
(although it is possible and deserves some investment of time).
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel