Posted to users@opennlp.apache.org by Markus Jelsma <ma...@openindex.io> on 2020/11/11 13:05:46 UTC

TokenNameFinderTrainer using POS tags as features

Hello,

There are two feature generators dealing with POS tags available, POSTaggerNameFeatureGenerator and PosTaggerFeatureGenerator. I know how to enable them in my features XML configuration but what do they do, and what makes them different? Should we use both, only one, or none at all?

Additionally, how do we feed the POS tags into the training data? Normal data looks like:
He learned the <START:language> Polish language <END> at home .

How would this look using POS tags? Like this?
He_PRON learned_VERB the_DET <START:language> Polish_ADJ language_NOUN <END> at_ADP home_NOUN ._PUNCT

Thanks!
Markus

Re: TokenNameFinderTrainer using POS tags as features

Posted by William Colen <wi...@gmail.com>.
Hello, Markus,

Did you solve the issue?
The `POSTaggerNameFeatureGenerator` requires a pre-trained POS tagger
model. It will annotate the tokens with POS tags, both during training and
at runtime.
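
To illustrate the idea, here is a rough conceptual sketch in Python. It is
not OpenNLP code: the toy lexicon stands in for a pre-trained POS tagger
model, and the `pos=` feature name is made up for the example. The point is
only that the tags are predicted from the plain tokens when features are
generated, so they never need to appear in the corpus.

```python
# Conceptual sketch only; this is NOT the OpenNLP implementation.
from typing import Callable, List

# Hypothetical stand-in for a pre-trained POS tagger model.
TOY_LEXICON = {
    "he": "PRON", "learned": "VERB", "the": "DET", "polish": "ADJ",
    "language": "NOUN", "at": "ADP", "home": "NOUN", ".": "PUNCT",
}


def toy_pos_tagger(tokens: List[str]) -> List[str]:
    """Tag each token by lexicon lookup; unknown tokens get 'X'."""
    return [TOY_LEXICON.get(token.lower(), "X") for token in tokens]


def pos_features(tokens: List[str],
                 tagger: Callable[[List[str]], List[str]]) -> List[List[str]]:
    """Return one feature list per token, built from tags the tagger predicts."""
    tags = tagger(tokens)
    # The 'pos=' prefix is an illustrative feature name, not OpenNLP's.
    return [[f"pos={tag}"] for tag in tags]


# Plain tokens, exactly as they appear in an untagged training sentence.
tokens = "He learned the Polish language at home .".split()
features = pos_features(tokens, toy_pos_tagger)
```

The same function would be called on training sentences and on new text,
which mirrors the "both during training and at runtime" behaviour.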

To train the name finder, use the same corpus format you would normally use
to train any name finder model; there is no need to add POS tags.
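
As a small sketch of what that unchanged format carries, the Python snippet
below splits one training line into plain tokens and typed name spans. The
function name `parse_name_sample` is hypothetical and this is not how
OpenNLP itself reads the data; it only handles the typed `<START:type>` form
shown in this thread, and it assumes whitespace-separated tokens.

```python
from typing import List, Tuple


def parse_name_sample(line: str) -> Tuple[List[str], List[Tuple[int, int, str]]]:
    """Split a '<START:type> ... <END>' line into tokens and (start, end, type) spans.

    Span indices refer to token positions; 'end' is exclusive.
    """
    tokens: List[str] = []
    spans: List[Tuple[int, int, str]] = []
    start = None
    name_type = None
    for part in line.split():
        if part.startswith("<START:") and part.endswith(">"):
            # Remember where the name begins and which type it has.
            start = len(tokens)
            name_type = part[len("<START:"):-1]
        elif part == "<END>":
            if start is None:
                raise ValueError("<END> without a matching <START:...>")
            spans.append((start, len(tokens), name_type))
            start = None
        else:
            # Ordinary token: kept as-is, with no POS suffix.
            tokens.append(part)
    if start is not None:
        raise ValueError("<START:...> without a closing <END>")
    return tokens, spans


sample = "He learned the <START:language> Polish language <END> at home ."
tokens, spans = parse_name_sample(sample)
```

The tokens come out as bare words, which is all the trainer needs from the
corpus; the POS information is added later by the feature generator.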

Regards,
William

