Posted to user@uima.apache.org by Rico Landefeld <Ri...@uni-jena.de> on 2009/01/21 17:53:08 UTC

Lucas - Lucene CAS Indexer release

Dear UIMA Developers and Users,

The JULIE Lab is happy to announce the release of Lucas 0.5 - a UIMA CAS 
consumer component which writes CAS data into a Lucene index.

At its core is a flexible XML-based "mapping configuration file" in
which the user determines which UIMA annotations are put into which
Lucene field, and how each field is set up (e.g. indexed and/or
stored). In addition, some basic functionality for (ontological)
hypernym indexing is provided.
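
To make the idea of routing annotations into fields concrete, here is a
minimal sketch in Python. It does not reproduce the actual Lucas XML
mapping format (see the sample mapping file for that); the annotation
type names, the dictionary layout and the route() helper are all
hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical mapping: annotation type -> target field and field setup.
# This is NOT the Lucas XML schema, only an illustration of the concept.
MAPPING = {
    "Token": {"field": "documenttext", "index": True, "store": False},
    "Title": {"field": "title", "index": True, "store": True},
}


def route(annotations, mapping):
    """Group annotation texts by the field their type is mapped to.

    `annotations` is an iterable of (annotation_type, covered_text) pairs.
    Annotation types without a mapping entry are skipped.
    """
    fields = defaultdict(list)
    for ann_type, text in annotations:
        rule = mapping.get(ann_type)
        if rule is None:
            continue  # unmapped annotation types are not indexed
        fields[rule["field"]].append(text)
    return dict(fields)
```

In this sketch, route() only decides which field receives which text;
the "index" and "store" flags would be applied when the fields are
actually created.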

Lucas can also perform offset-based token stream alignment and merge
UIMA annotations (via the token position increment) into the same
Lucene field (e.g. "documenttext" or "title").
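
The merging idea can be sketched as follows, independent of the Lucas
and Lucene APIs. The sketch assumes that annotations starting at the
same character offset should share one token position, which is
expressed by a position increment of 0 (the usual Lucene convention for
stacked tokens). The Token type and merge_streams() are hypothetical
names, not part of Lucas.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    begin: int  # character offset where the annotation starts
    end: int    # character offset where the annotation ends
    text: str   # term to be written into the field


def merge_streams(*streams: List[Token]) -> List[Tuple[str, int]]:
    """Merge several annotation streams into one list of
    (term, position_increment) pairs, aligned by start offset."""
    # sorted() is stable, so tokens with equal offsets keep stream order.
    merged = sorted(
        (tok for stream in streams for tok in stream),
        key=lambda tok: (tok.begin, tok.end),
    )
    result = []
    prev_begin = None
    for tok in merged:
        # Same start offset as the previous token: stack it on the same
        # position (increment 0); otherwise advance by one position.
        increment = 0 if tok.begin == prev_begin else 1
        result.append((tok.text, increment))
        prev_begin = tok.begin
    return result
```

For example, merging the word tokens "IL-2" (offsets 0-4) and "binds"
(offsets 5-10) with an entity annotation "PROTEIN" covering offsets 0-4
places "PROTEIN" at the same position as "IL-2", so a query on either
term matches at that position.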

Lucas, along with the documentation, sources and a sample mapping
file, is available at:
https://www.coling.uni-jena.de/sites/lucas

Since this project tries to bridge two Apache projects (UIMA and
Lucene), we would like to submit it to the UIMA Sandbox in order to
solicit further development by the UIMA community.
What steps do we have to take to start this process? As far as we
know, a sandbox candidate has to undergo a voting process on the
uima-dev list.

Please test the component and report any bugs or suggestions for 
improvement back to us.

Best regards,
Rico Landefeld
Joachim Wermter