Posted to dev@uima.apache.org by Srinivas Yerram <sr...@motivitylabs.com> on 2014/11/22 18:33:46 UTC
UIMA framework annotators multiple languages support clarifications
Dear Sir / Madam,
My core use cases involve parsing email data that arrives in different templates and in different languages. I need to extract useful information from it through UIMA annotators or other plug-in components. Scalability and clustering are high priorities in my use case.
I would like clarification on the following points about the Apache UIMA framework:
Do UIMA framework annotators or other plug-in components support multiple languages (such as English, French, Arabic, and Chinese) for parsing the email contents?
Can the Stanford NLP libraries be integrated as a plug-in for Apache UIMA framework components?
I would appreciate a quick response on this. Thanks.
Regards,
Srinivas Yerram
Re: UIMA framework annotators multiple languages support clarifications
Posted by Richard Eckart de Castilho <re...@apache.org>.
On 23.11.2014, at 14:27, Srinivas Yerram <sr...@motivitylabs.com> wrote:
> Thank you so much, Richard, for your response.
>
> I have a couple of questions about Apache UIMA features:
>
> 1. Does Apache UIMA support multi-language text analysis (non-English languages such as Chinese, French, Arabic, etc.)?
Apache UIMA is language agnostic. Whether and how multi-language support is realized depends on the actual components.
E.g. the DKPro Core components look at what language a document is in and then try to automatically load a suitable model for processing the document.
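A minimal sketch of that lookup idea, in plain Java with no UIMA or DKPro Core dependency. It is not DKPro Core's actual code: the class `LanguageModelRegistry`, its methods, and the model file names are all hypothetical and only illustrate "pick a model by document language". In a real UIMA pipeline the language code would come from the CAS (`getDocumentLanguage()`), and a component would use it to choose a model much as the registry does here.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: NOT DKPro Core's real model-resolution code.
final class LanguageModelRegistry {
    // Maps a primary language code (e.g. "en") to a made-up model name.
    private final Map<String, String> models = new HashMap<>();

    void register(String language, String modelName) {
        models.put(normalize(language), modelName);
    }

    // Returns the model registered for the document language, if any.
    Optional<String> modelFor(String documentLanguage) {
        if (documentLanguage == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(models.get(normalize(documentLanguage)));
    }

    // Reduces a tag to its primary subtag, so "fr-CA" and "FR" both become "fr".
    private static String normalize(String language) {
        String lower = language.trim().toLowerCase(Locale.ROOT);
        int sep = lower.indexOf('-');
        return sep < 0 ? lower : lower.substring(0, sep);
    }

    public static void main(String[] args) {
        LanguageModelRegistry registry = new LanguageModelRegistry();
        registry.register("en", "tagger-en.bin"); // assumed model names
        registry.register("fr", "tagger-fr.bin");

        // In UIMA, this value would be read from the CAS of each document.
        for (String lang : new String[] {"en", "fr-CA", "zh"}) {
            String model = registry.modelFor(lang).orElse("(no model registered)");
            System.out.println(lang + " -> " + model);
        }
    }
}
```

Returning an `Optional` keeps the "no model for this language" case explicit, so a component can decide whether to skip the document or fail, rather than silently applying a model for the wrong language.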
> 2. Is there any official documentation on integrating Stanford NLP with the Apache UIMA framework?
If non-Apache documentation also counts as "official", then maybe this one:
https://code.google.com/p/dkpro-core-asl/wiki/StanfordCoreComponents
> 3. Is there any official documentation on using OpenNLP with the Apache UIMA framework?
http://opennlp.apache.org/documentation/1.5.3/manual/opennlp.html#org.apche.opennlp.uima
Cheers,
-- Richard
RE: UIMA framework annotators multiple languages support clarifications
Posted by Srinivas Yerram <sr...@motivitylabs.com>.
Thank you so much, Richard, for your response.
I have a couple of questions about Apache UIMA features:
1. Does Apache UIMA support multi-language text analysis (non-English languages such as Chinese, French, Arabic, etc.)?
2. Is there any official documentation on integrating Stanford NLP with the Apache UIMA framework?
3. Is there any official documentation on using OpenNLP with the Apache UIMA framework?
I would appreciate any response to the questions above.
Thanks & Regards,
Srinivas Yerram
-----Original Message-----
From: Richard Eckart de Castilho [mailto:rec@apache.org]
Sent: 23 November 2014 04:28
To: dev@uima.apache.org
Subject: Re: UIMA framework annotators multiple languages support clarifications
Hi,
UIMA is a framework that enables unstructured content analysis; in general, it does not provide the analysis itself.
UIMA component collections provide analysis components. Some include wrappers for third-party NLP tools such as the Stanford NLP tools.
These are some collections I know (not exhaustive):
ClearTK - http://cleartk.googlecode.com
cTAKES - http://ctakes.apache.org/
DKPro Core - http://code.google.com/p/dkpro-core-asl/
OpenNLP UIMA components - http://opennlp.apache.org
UIMA Addons & Sandbox - http://uima.apache.org/sandbox.html
I should note that I'm involved with the DKPro Core collection.
Cheers,
-- Richard