Posted to user@uima.apache.org by Jeremy Villalobos <je...@gmail.com> on 2012/04/14 10:34:37 UTC

UIMA Annotator Library

Hello:

I have studied the User's Guide on creating Annotators, testing them and
deploying them with a CPE.

I wonder if there are UIMA annotator libraries out on the Web for common
annotations such as names, locations, phone numbers, etc.

Thanks

Jeremy

Re: UIMA Annotator Library

Posted by Jörn Kottmann <ko...@gmail.com>.
Yes, you can use Apache OpenNLP.
It's a very well-written library for standard NLP tasks.

See here:
opennlp.apache.org

Hope this helps,
Jörn



RE: UIMA Annotator Library

Posted by Torsten Zesch <ze...@ukp.informatik.tu-darmstadt.de>.
You could also have a look at DKPro Core
http://code.google.com/p/dkpro-core-asl/

-Torsten


Re: UIMA Annotator Library

Posted by Richard Eckart de Castilho <ec...@ukp.informatik.tu-darmstadt.de>.
Hello Jeremy,

> I wonder if there are UIMA annotator libraries out on the Web for common
> annotations such as names, locations, phone numbers, etc.

For tasks like extracting names and locations, a Named Entity Recognition (NER) component can be used. There are several that I know of:

- AlchemyAPIAnnotator in the UIMA sandbox seems to provide NER annotations (uses an online service)
  http://uima.apache.org/sandbox.html#alchemy.annotator

- ClearTK has a NER module
  http://code.google.com/p/cleartk/source/browse/#svn%2Ftrunk%2Fcleartk-named-entity

- DKPro Core GPL provides an integration of the Stanford NER tools
  http://code.google.com/p/dkpro-core-gpl/source/browse/de.tudarmstadt.ukp.dkpro.core-gpl/trunk/de.tudarmstadt.ukp.dkpro.core.stanfordnlp/src/test/java/de/tudarmstadt/ukp/dkpro/core/stanfordnlp/StanfordNamedEntityRecognizerTest.java

- OpenCalais Annotator in the UIMA sandbox seems to provide NER annotations (uses an online service) 
  http://uima.apache.org/sandbox.html#opencalais.annotator

- OpenNLP has a NER engine
  https://svn.apache.org/repos/asf/opennlp/trunk/opennlp-uima/src/main/java/opennlp/uima/namefind/

If you do not need anything that sophisticated, you could use

- the Dictionary Annotator from the UIMA sandbox
  http://uima.apache.org/sandbox.html#dict.annotator

- or maybe the Concept Mapper Annotator from the UIMA sandbox
  http://uima.apache.org/sandbox.html#concept.mapper.annotator
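
To illustrate the general idea behind dictionary-based annotation, here is a minimal sketch in plain Python. It does not use UIMA or the annotators listed above and says nothing about how they work internally; the word list and the function name are made up for illustration. It looks up each dictionary entry in the text and reports begin/end offsets, which is roughly the information an annotation would carry.

```python
import re

# Hypothetical mini-dictionary; a real one would be loaded from a file.
LOCATIONS = {"darmstadt", "new york", "berlin"}

def find_dictionary_matches(text, entries):
    """Return sorted (begin, end, covered_text) tuples for each entry found."""
    matches = []
    for entry in entries:
        # Match the entry as a whole word or phrase, ignoring case.
        pattern = re.compile(r"\b" + re.escape(entry) + r"\b", re.IGNORECASE)
        for m in pattern.finditer(text):
            matches.append((m.start(), m.end(), m.group()))
    return sorted(matches)

if __name__ == "__main__":
    for begin, end, covered in find_dictionary_matches(
            "Flights from Berlin to New York", LOCATIONS):
        print(begin, end, covered)
```

A lookup like this only finds strings that are already in the list, which is the trade-off compared to a trained NER component.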

For extracting phone numbers and similar patterns, you'll probably want something that can match regular expressions. You should check out

- the Regular Expression Annotator in the UIMA sandbox
  http://uima.apache.org/sandbox.html#regex.annotator

- the TextMarker engine, which has recently been contributed to the Apache UIMA Sandbox
  http://www.is.informatik.uni-wuerzburg.de/forschung/anwendungen/textmarker/
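
As a rough sketch of the regular-expression approach, the following plain Python snippet finds phone-number-like strings and returns their offsets. It is not the configuration format of the Regular Expression Annotator or TextMarker; the pattern and the function name are assumptions chosen for illustration, and the pattern only covers a couple of North American style formats.

```python
import re

# Hypothetical pattern: matches forms like 555-123-4567 or (555) 123-4567.
# Real-world phone formats vary far more than this.
PHONE_PATTERN = re.compile(
    r"(?:\(\d{3}\)\s*|\b\d{3}[-.\s])\d{3}[-.\s]\d{4}\b"
)

def find_phone_numbers(text):
    """Return (begin, end, covered_text) tuples for each match."""
    return [(m.start(), m.end(), m.group())
            for m in PHONE_PATTERN.finditer(text)]

if __name__ == "__main__":
    sample = "Call 555-123-4567 or (555) 987-6543 today."
    for begin, end, covered in find_phone_numbers(sample):
        print(begin, end, covered)
```

In a UIMA pipeline, the begin and end offsets would become the span of an annotation; here they are simply returned as tuples.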

Cheers,

-- Richard
