Posted to user@uima.apache.org by Ekaterina Buyko <ek...@uni-jena.de> on 2009/06/22 17:03:16 UTC

Regular expressions over UIMA annotations

Hi,

I am interested in using JAPE grammars, or something with similar
functionality, in UIMA. Does anybody already have experience with that?

Thank you

Ekaterina

-- 
Ekaterina Buyko
Jena University Language and Information Engineering (JULIE) Lab
Phone: +49-3641-944303
Fax:   +49-3641-944321
email: ekaterina.buyko@uni-jena.de
URL:   http://www.julielab.de
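
For readers new to the idea behind the question above: JAPE-style rules are,
roughly, regular expressions whose symbols are annotations rather than
characters. The sketch below illustrates only that idea in plain Python. It is
not JAPE and does not use the UIMA API. The Annotation class, the SYMBOLS
table, and match_annotation_pattern are hypothetical names made up for this
illustration, and it assumes non-overlapping annotations whose types each map
to a single character.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stand-in for a typed text span (not a UIMA class)."""
    type: str
    begin: int
    end: int


# Assumed one-character code per annotation type, so that a sequence of
# annotations can be written as a string and searched with the `re` module.
SYMBOLS = {"Token": "t", "Number": "n", "Unit": "u"}


def match_annotation_pattern(annotations, pattern):
    """Return (begin, end) text offsets for each match of `pattern`,
    where `pattern` is a regex over the one-character type codes."""
    ordered = sorted(annotations, key=lambda a: (a.begin, a.end))
    # Unknown types are encoded as "?", which no code in SYMBOLS uses.
    encoded = "".join(SYMBOLS.get(a.type, "?") for a in ordered)
    spans = []
    for m in re.finditer(pattern, encoded):
        if m.end() == m.start():
            continue  # ignore empty matches, they cover no annotation
        first, last = ordered[m.start()], ordered[m.end() - 1]
        spans.append((first.begin, last.end))
    return spans


if __name__ == "__main__":
    # Annotations over the text "weighs 5 kg or 12 lb".
    annotations = [
        Annotation("Token", 0, 6), Annotation("Number", 7, 8),
        Annotation("Unit", 9, 11), Annotation("Token", 12, 14),
        Annotation("Number", 15, 17), Annotation("Unit", 18, 20),
    ]
    # "nu" means: a Number annotation directly followed by a Unit annotation.
    print(match_annotation_pattern(annotations, "nu"))  # [(7, 11), (15, 20)]
```

A real rule engine additionally has to deal with overlapping annotations,
feature constraints, and actions on the matched spans, none of which this
sketch attempts.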


RE: Regular expressions over UIMA annotations

Posted by jo...@thomsonreuters.com.
Hi Ekaterina,

Good to meet you at NAACL.
You can invoke JAPE grammars inside UIMA/Java. Here are some pointers on
GATE-UIMA interoperability:
* http://gate.ac.uk/sale/talks/gate-course-oct06/uima-integration.ppt
* http://videolectures.net/gate06_roberts_giuil/

Best regards,
Jochen



--
Dr. Jochen Leidner
Research Scientist

Thomson Reuters 
Corporate Technology
Research & Development
610 Opperman Drive
St. Paul, MN 55123
USA

http://www.ThomsonReuters.com




Re: Regular expressions over UIMA annotations

Posted by Peter Klügl <pk...@ki.informatik.uni-wuerzburg.de>.
Hi,

We are currently developing a system for rule-based information
extraction (available at https://sourceforge.net/projects/textmarker/).
The project is still at an early stage. However, you can find a short
description at http://tmwiki.informatik.uni-wuerzburg.de/ if you are
interested.

Peter

-- 

Peter Klügl
pkluegl@uni-wuerzburg.de


Re: Regular expressions over UIMA annotations

Posted by Martin Gerlach <Ma...@neofonie.de>.
Hi Ekaterina and everyone who replied,

We migrated the bridge written by the GATE team to Apache UIMA 2.2.2 and
find that it performs pretty well, although a lot of objects are indeed
being created. It is also possible to map rich UIMA type systems
to GATE annotations if you run some extra JAPE phases with Java code to
convert the FSArray structures etc. to GATE annotations. However, we
found that mapping rich GATE annotations back to UIMA is somewhat
difficult, as the mapping is not very powerful. You might need to do some
extra conversion in an AE that follows the GATE AE.

We're planning to donate our code back to GATE or even to Apache at some
point but have to clarify legal issues first, which may take some time.

However, everything you need is covered by the presentation Jochen
pointed out earlier:

http://gate.ac.uk/sale/talks/gate-course-oct06/uima-integration.ppt

(Unfortunately, gate.ac.uk currently seems to be down.)

To migrate the GATE bridge from IBM UIMA to Apache UIMA, start by changing
the package names. That is almost all you need to do.
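
The package renaming described above could be scripted along the following
lines. This is only a sketch, not the procedure actually used for the bridge:
it assumes the old sources reference IBM UIMA under the com.ibm.uima package
prefix and that the Apache equivalents live under org.apache.uima, and the
function names migrate_text and migrate_tree are made up for this example.
Classes that were moved or renamed between the two versions would still need
manual attention.

```python
import re
from pathlib import Path

# Assumed package prefixes (see the note above).
OLD_PREFIX = "com.ibm.uima"
NEW_PREFIX = "org.apache.uima"

# Match the old prefix only as a whole dotted name, so that an unrelated
# package such as "com.ibm.uimax" is left untouched.
_PATTERN = re.compile(r"\bcom\.ibm\.uima\b")


def migrate_text(text):
    """Return `text` with every reference to the old prefix rewritten."""
    return _PATTERN.sub(NEW_PREFIX, text)


def migrate_tree(root, suffixes=(".java", ".xml")):
    """Rewrite matching files under `root` in place; return changed paths."""
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        original = path.read_text(encoding="utf-8")
        updated = migrate_text(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

Because migrate_tree overwrites files in place, it is best run on a copy of
the sources or on a tree that is under version control.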

Let me know if you decide to go that way and run into problems - my
colleague who did the main work on this and I can then try to assist you.

Regards,
Martin

Roberto Franchini wrote:
> We developed a bridge to use JAPE inside a UIMA pipeline.
> It works, but I'm not very happy with it:
> - low performance: mapping from UIMA to JAPE and then back to UIMA
> generates a lot of objects and a lot of GC cycles. If you are going to
> analyze a lot of documents, you can:
> -- buy new hardware
> -- study the voodoo of JVM performance tuning :)
> - it is difficult to map a rich UIMA type system to JAPE annotations
> 
> If you want, I can share my very ugly code :)
> 
> The GATE team wrote a bridge, but as far as I know it supports the old
> IBM UIMA.
> TextMarker seems very interesting to me; I will take a closer look at it in
> August.
> Cheers,
> R.
> 

-- 
--------------------------------
Martin Gerlach
Software Development

neofonie
Technologieentwicklung und
Informationsmanagement GmbH
Robert-Koch-Platz 4
10115 Berlin
fon: +49.30 24627 413
fax: +49.30 24627 120
Martin.Gerlach@neofonie.de
http://www.neofonie.de

Commercial register
Berlin-Charlottenburg: HRB 67460

Management
Helmut Hoffer von Ankershoffen
(Management Spokesman)
Nurhan Yildirim
-------------------------------


Re: Regular expressions over UIMA annotations

Posted by Nicolas Hernandez <ni...@gmail.com>.
Hi everyone,

Does anyone have experience with IBM's Semantic Search? I am curious
about the expressiveness of its query language, whether it can be turned
into an AE, its performance, and typical use cases.

http://www.alphaworks.ibm.com/tech/uima
June 13, 2008 UIMA SDK has been moved to Apache as Incubator
OpenSource project. Two Apache UIMA components (SemanticSearch 2.1 for
Apache UIMA and IBM UIMA wrapper) are still available here.

Even though it was built for the Apache version, it no longer seems to
be under active development.

Best





-- 
Nicolas.Hernandez@univ-nantes.fr
--
http://www.sciences.univ-nantes.fr/info/perso/permanents/hernandez
# Laboratoire LINA-TALN CNRS UMR 6241
tel. +33 (0)2 51 12 58 55
# Institut Universitaire de Technologie de Nantes - Département Informatique
tel. +33 (0)2 40 30 60 67

Re: Regular expressions over UIMA annotations

Posted by "D.J. McCloskey" <dj...@ie.ibm.com>.
Hi Roberto and Ekaterina,

Have you taken a look at IBM LanguageWare on Alphaworks
(http://www.alphaworks.ibm.com/tech/lrw)?
We've just released a new version, 7.1.1.3, so it's a good time to check it
out.
It is a very powerful UIMA-based platform for creating information
extraction applications. I'd love to hear your opinion.

You can find more information via the links in my footer, or contact us
through the Alphaworks forum or the email address in the FAQ.

Regards,
-DJ
-------------------
D.J McCloskey
IBM LanguageWare Architect
Email: dj_mccloskey@ie.ibm.com   |  Phone: +353-1-8153647 (work)
+353-86-8157637 (mobile)

http://www.ibm.com/software/ebusiness/jstart/infocenters/languageware.html
http://www-306.ibm.com/software/globalization/topics/languageware/index.jsp
http://www.alphaworks.ibm.com/tech/lrw
http://en.wikipedia.org/wiki/Languageware

IBM Ireland Product Distribution Limited registered in Ireland with number
92815.  Registered office: Oldbrook House, 24-32 Pembroke Road,
Ballsbridge, Dublin 4


From:    Roberto Franchini <ro...@gmail.com>
To:      uima-user@incubator.apache.org
Date:    22/06/2009 23:35
Subject: Re: Regular expressions over UIMA annotations





Re: Regular expressions over UIMA annotations

Posted by Roberto Franchini <ro...@gmail.com>.
We developed a bridge to use JAPE inside a UIMA pipeline.
It works, but I'm not very happy with it:
- low performance: mapping from UIMA to JAPE and then back to UIMA
generates a lot of objects and a lot of GC cycles. If you are going to
analyze a lot of documents, you can:
-- buy new hardware
-- study the voodoo of JVM performance tuning :)
- it is difficult to map a rich UIMA type system to JAPE annotations

If you want, I can share my very ugly code :)

The GATE team wrote a bridge, but as far as I know it supports the old
IBM UIMA.
TextMarker seems very interesting to me; I will take a closer look at it in
August.
Cheers,
R.

-- 
Roberto Franchini
http://www.celi.it
http://www.blogmeter.it
http://www.memesphere.it
Tel +39-011-6600814
jabber:ro.franchini@gmail.com
skype:ro.franchini