Posted to users@opennlp.apache.org by Damiano Porta <da...@gmail.com> on 2017/02/04 10:16:10 UTC

Proper way to extract name/surname from PERSON entity

Hello everybody,

I have trained my NER (maxent) model and, fortunately, I get good accuracy
on PERSON entities.

My problem is that I need to split each person entity into its first name
and surname.

How should I approach this step? I thought about either a classifier that
tells me the class of each word inside the person entity, or a dictionary
lookup. I am not sure which is better.

I have dictionaries of around 5k first names and 150k surnames, but how
should I deal with ambiguous words that appear in both? Should I rank them
by frequency? I mean, should I give each entry a weight that reflects how
likely it is to be a first name or a surname?
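
To make the frequency idea concrete, here is a minimal sketch in Python. It
is not OpenNLP code, and everything in it is an assumption made for
illustration: the toy counts, the names GIVEN_COUNTS, SURNAME_COUNTS,
given_probability and split_person, and the add-one smoothing. Each token
gets a probability of being a first name from its counts in the two
dictionaries, and the entity is split at the point, and in the order, that
makes those probabilities most consistent.

```python
from math import log

# Hypothetical toy counts. Real values would come from the first-name and
# surname dictionaries, with a frequency attached to each entry.
GIVEN_COUNTS = {"damiano": 120, "marco": 900, "rosa": 300}
SURNAME_COUNTS = {"porta": 400, "rossi": 5000, "marco": 30, "rosa": 250}


def given_probability(token, smoothing=1.0):
    """Estimate P(first name | token) from smoothed dictionary counts.

    Smoothing keeps the result strictly between 0 and 1, so unknown
    tokens score 0.5 and the logarithms below are always defined.
    """
    key = token.lower()
    given = GIVEN_COUNTS.get(key, 0) + smoothing
    surname = SURNAME_COUNTS.get(key, 0) + smoothing
    return given / (given + surname)


def split_person(tokens):
    """Split a PERSON entity into first-name tokens and surname tokens.

    Tries every split point in both orders (first name first, surname
    first) and keeps the split with the highest log-likelihood.
    """
    if not tokens:
        return {"given": [], "surname": []}
    probs = [given_probability(t) for t in tokens]
    if len(tokens) == 1:
        # A single token cannot be split, so just classify it.
        if probs[0] >= 0.5:
            return {"given": list(tokens), "surname": []}
        return {"given": [], "surname": list(tokens)}

    best = None
    for k in range(1, len(tokens)):
        for given_first in (True, False):
            score = 0.0
            for i, p in enumerate(probs):
                is_given = (i < k) == given_first
                score += log(p if is_given else 1.0 - p)
            if best is None or score > best[0]:
                head, tail = list(tokens[:k]), list(tokens[k:])
                if given_first:
                    best = (score, head, tail)
                else:
                    best = (score, tail, head)
    return {"given": best[1], "surname": best[2]}


print(split_person(["Damiano", "Porta"]))
print(split_person(["Rossi", "Marco"]))
```

With these toy counts, "Marco" appears in both dictionaries but is far more
frequent as a first name, so the weights resolve the ambiguity. A
token-level classifier would replace given_probability with a model score
while the split search stays the same.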

Thank you for your opinions.

Damiano