Posted to user@uima.apache.org by Adam Lally <al...@alum.rpi.edu> on 2007/03/01 19:15:18 UTC

Re: Help on UIMA Analysis Engine Aggregation

On 2/28/07, LASRI YASSINE <la...@gmail.com> wrote:
> Thank you for your response. My problem is this:
> I have an external file that contains a list of person names, for example:
>
> adam
> smith
> lary
> page
> ... etc
> and I need to extract all person names from other sources (text documents),
> for example:
> "Lary Page is the creator of google and Adam Smith is an economist"
> The annotator should extract <Adam Smith> and <Lary Page> as person names. So
> what can I do?
>

I'm not sure I completely understand your scenario, but is it the case
that you've already written an Annotator that creates annotations over
the individual words in the list?  So, for example, it would annotate
<Adam> and <Smith> as separate PersonName annotations?

If so, then I think the approach from my last mail would work.  In a
second annotator, iterate over all the PersonName annotations.  For
each pair of consecutive annotations a1 and a2, check if
documentText.substring(a1.getEnd(), a2.getBegin()) is all whitespace.
If so, create a new annotation (e.g. FullPersonName) spanning from
a1.getBegin() to a2.getEnd().
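
As a rough sketch of that merging step in plain Java: the Span class
below is only a hypothetical stand-in for a PersonName annotation (it is
not a UIMA type), and FullNameMerger and mergeAdjacent are names made up
for illustration.  It assumes the input spans are already sorted by
begin offset, which is how you would normally walk them in the second
annotator.

```java
import java.util.ArrayList;
import java.util.List;

class FullNameMerger {

    /** Hypothetical stand-in for an annotation: a [begin, end) span over the document text. */
    static final class Span {
        private final int begin;
        private final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        int getBegin() { return begin; }
        int getEnd() { return end; }
    }

    /** True if every character of s is whitespace (an empty string also counts). */
    static boolean isAllWhitespace(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /**
     * For each pair of consecutive spans a1, a2 (assumed sorted by begin),
     * emit a combined span when only whitespace separates them.
     */
    static List<Span> mergeAdjacent(String documentText, List<Span> personNames) {
        List<Span> fullNames = new ArrayList<>();
        for (int i = 0; i + 1 < personNames.size(); i++) {
            Span a1 = personNames.get(i);
            Span a2 = personNames.get(i + 1);
            if (a1.getEnd() > a2.getBegin()) {
                continue; // overlapping spans: there is no gap to inspect
            }
            String gap = documentText.substring(a1.getEnd(), a2.getBegin());
            if (isAllWhitespace(gap)) {
                // Corresponds to creating the FullPersonName annotation.
                fullNames.add(new Span(a1.getBegin(), a2.getEnd()));
            }
        }
        return fullNames;
    }
}
```

Note that three names in a row (say "John Adam Smith") would yield two
overlapping pairs with this logic, so you may want to extend it to merge
longer runs if that matters for your data.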

-Adam