Posted to user@uima.apache.org by Debbie Zhang <de...@gmail.com> on 2014/04/29 11:01:19 UTC

Lemmatization in UIMA

Hi,

I would like to know how to do lemmatization in UIMA. Is there a library or
annotator that can do the job?

In a normal Java program, I can use the WordNet or Stanford NLP lemmatizer.
However, I have trouble reading the WordNet dictionary files in a UIMA
annotator. The model jar file also doesn't work in a UIMA annotator.

Any suggestions? Thank you.

Regards,

Debbie


RE: Lemmatization in UIMA

Posted by Debbie Zhang <de...@gmail.com>.
Thanks Petr! It would be good if I could get WordNet working in UIMA. I
modified your code and it works when I run it in Eclipse. However, it
fails when I run it in the CAS Visual Debugger or the Document Analyzer,
with the following error: net.didion.jwnl.JWNLException: Unable to install
net.didion.jwnl.dictionary.FileBackedDictionary

I put the WordNet data files in the resources folder, as I will deploy the
PEAR file to another system once it works.
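
One possible cause, not confirmed anywhere in this thread, is that the
dictionary path is relative and therefore resolves against whatever working
directory the tool was launched from, which may differ between Eclipse and
the CAS Visual Debugger. The sketch below uses only the Java standard
library; DictionaryLocator and the resources/wordnet/dict layout are
hypothetical names chosen for illustration, not part of JWNL or UIMA. It
resolves the dictionary directory against an explicit base directory and
reports the working directory when the lookup fails.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of JWNL or UIMA.
class DictionaryLocator {

    /**
     * Resolves a dictionary directory against an explicit base directory
     * rather than the JVM working directory, and fails early with a
     * message that shows both locations if the directory is missing.
     */
    static Path resolve(Path baseDir, String relativeDictDir) throws IOException {
        Path dict = baseDir.resolve(relativeDictDir).toAbsolutePath().normalize();
        if (!Files.isDirectory(dict)) {
            throw new IOException("WordNet dictionary directory not found: " + dict
                    + " (working directory is " + Paths.get("").toAbsolutePath() + ")");
        }
        return dict;
    }

    public static void main(String[] args) {
        // Assumed layout: <base>/resources/wordnet/dict; adjust to the real project.
        Path base = Paths.get(args.length > 0 ? args[0] : ".");
        try {
            System.out.println("Dictionary found at " + resolve(base, "resources/wordnet/dict"));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the message printed under the CAS Visual Debugger shows a different
working directory than under Eclipse, the absolute path returned by
resolve() could be written into the dictionary path setting before the
dictionary is initialized. The base directory would have to come from
somewhere stable, for example an annotator configuration parameter.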

You can download my Java project file if you want to have a look:
https://db.tt/nQDm24pr

Thanks again for your help!

Regards,

Debbie

> -----Original Message-----
> From: Petr Baudis [mailto:pasky@ucw.cz]
> Sent: Tuesday, 29 April 2014 9:20 PM
> To: user@uima.apache.org
> Subject: Re: Lemmatization in UIMA
> 
>   Hi!
> 
> On Tue, Apr 29, 2014 at 07:01:19PM +1000, Debbie Zhang wrote:
> > I would like to know how to do lemmatization in UIMA. Is there a library or
> > annotator that can do the job?
> 
>   Sure; I recommend taking a look at the DKPro-core project which
> provides UIMA wrappers for many NLP tools, including various
> lemmatizers:
> 
> 	http://code.google.com/p/dkpro-core-asl/
> 
> > In a normal Java program, I can use the WordNet or Stanford NLP lemmatizer.
> > However, I have trouble reading the WordNet dictionary files in a UIMA
> > annotator. The model jar file also doesn't work in a UIMA annotator.
> 
>   What kind of trouble are you having with using Wordnet in UIMA
> annotators? It works fine for me:
> 
> 	https://github.com/brmson/yodaqa/blob/master/src/main/java/cz/brmlab/yodaqa/analysis/tycor/LATByWordnet.java#L61
> 	https://github.com/brmson/yodaqa/blob/master/src/main/java/cz/brmlab/yodaqa/provider/JWordnet.java
> 	https://github.com/brmson/yodaqa/blob/master/src/main/resources/cz/brmlab/yodaqa/provider/wordnet.xml
> 
> (Note that in retrospect, I'd go for the JWI library instead of
> JWordnet.)
> 
> 				Petr "Pasky" Baudis


Re: Lemmatization in UIMA

Posted by Petr Baudis <pa...@ucw.cz>.
  Hi!

On Tue, Apr 29, 2014 at 07:01:19PM +1000, Debbie Zhang wrote:
> I would like to know how to do lemmatization in UIMA. Is there a library or
> annotator that can do the job?

  Sure; I recommend taking a look at the DKPro-core project which
provides UIMA wrappers for many NLP tools, including various
lemmatizers:

	http://code.google.com/p/dkpro-core-asl/

> In a normal Java program, I can use the WordNet or Stanford NLP lemmatizer.
> However, I have trouble reading the WordNet dictionary files in a UIMA
> annotator. The model jar file also doesn't work in a UIMA annotator.

  What kind of trouble are you having with using Wordnet in UIMA
annotators? It works fine for me:

	https://github.com/brmson/yodaqa/blob/master/src/main/java/cz/brmlab/yodaqa/analysis/tycor/LATByWordnet.java#L61
	https://github.com/brmson/yodaqa/blob/master/src/main/java/cz/brmlab/yodaqa/provider/JWordnet.java
	https://github.com/brmson/yodaqa/blob/master/src/main/resources/cz/brmlab/yodaqa/provider/wordnet.xml

(Note that in retrospect, I'd go for the JWI library instead of
JWordnet.)

				Petr "Pasky" Baudis