Posted to commits@uima.apache.org by tw...@apache.org on 2009/07/31 11:44:25 UTC
svn commit: r799561 - in /incubator/uima/site/trunk/uima-website:
docs/sandbox.html xdocs/sandbox.xml
Author: twgoetz
Date: Fri Jul 31 09:44:25 2009
New Revision: 799561
URL: http://svn.apache.org/viewvc?rev=799561&view=rev
Log:
UIMA-1469: add Lucas info to website
Modified:
incubator/uima/site/trunk/uima-website/docs/sandbox.html
incubator/uima/site/trunk/uima-website/xdocs/sandbox.xml
Modified: incubator/uima/site/trunk/uima-website/docs/sandbox.html
URL: http://svn.apache.org/viewvc/incubator/uima/site/trunk/uima-website/docs/sandbox.html?rev=799561&r1=799560&r2=799561&view=diff
==============================================================================
--- incubator/uima/site/trunk/uima-website/docs/sandbox.html (original)
+++ incubator/uima/site/trunk/uima-website/docs/sandbox.html Fri Jul 31 09:44:25 2009
@@ -196,7 +196,7 @@
</td></tr>
<tr><td>
<blockquote class="sectionBody">
- <h4>Annotators</h4>
+ <h4>Annotators and Consumers</h4>
<ul>
<li><a href="#whitespace.tokenizer">Whitespace Tokenizer Annotator</a></li>
<li><a href="#snowball.annotator">Snowball Annotator</a></li>
@@ -206,7 +206,8 @@
<li><a href="#bsf.annotator">BSF Annotator</a></li>
<li><a href="#opencalais.annotator">OpenCalais Annotator</a></li>
<li><a href="#concept.mapper.annotator">Concept Mapper Annotator</a></li>
- <li><a href="#tika.annotator">Tika Annotator</a></li>
+ <li><a href="#tika.annotator">Tika Annotator</a></li>
+ <li><a href="#lucas.consumer">Lucene CAS indexer (Lucas)</a></li>
</ul>
<h4>Servers</h4>
<ul>
@@ -500,13 +501,15 @@
</td></tr>
<tr><td>
<blockquote class="subsectionBody">
- <p>Apache Tika is a toolkit for detecting and extracting metadata and
-structured text content from various documents using existing parser
-libraries. The TikaAnnotator uses
-<a href="http://lucene.apache.org/tika/" target="_blank">Tika</a> to generate annotations representing
-the original markup of a document, extract its text and metadata. It
-consists of three resources:
-</p>
+ <p>
+ Apache Tika is a toolkit for detecting and extracting metadata and
+ structured text content from various documents using existing parser
+ libraries. The TikaAnnotator uses
+ <a href="http://lucene.apache.org/tika/" target="_blank">Tika</a>
+ to generate annotations representing
+ the original markup of a document, extract its text and metadata. It
+ consists of three resources:
+ </p>
<dl>
<dt>FileSystemCollectionReader</dt>
<dd>similar to the one in UIMA examples but uses
@@ -524,6 +527,34 @@
</blockquote>
</td></tr>
</table>
+ <table class="subsectionTable" id='lucas.consumer'>
+ <tr><td>
+
+
+
+ <a name="Lucene CAS indexer (Lucas)">
+ <h2>Lucene CAS indexer (Lucas)
+ </h2>
+ </a>
+ </td></tr>
+ <tr><td>
+ <blockquote class="subsectionBody">
+ <p>
+ The Lucene CAS indexer (Lucas) is a UIMA CAS consumer that stores CAS
+ data in a <a href="http://lucene.apache.org">Lucene</a> index. The consumer
+ transforms annotation objects of a CAS into Lucene token streams
+ which are stored in a Lucene document. Token streams can further be processed
+ by token filters. Lucas comes with a set of its own token filters and
+ integrations for some Lucene token filters. Furthermore, you can
+ deploy your own token filters. The mapping between UIMA annotations and Lucene
+ tokens and token filtering is configured by a xml mapping file. The
+ Java source of the consumer can be accessed in the
+ <a href="http://svn.apache.org/repos/asf/incubator/uima/sandbox/trunk/Lucas">
+ SVN repository</a>.
+ </p>
+ </blockquote>
+ </td></tr>
+ </table>
<table class="subsectionTable" id='simple-server'>
<tr><td>
Modified: incubator/uima/site/trunk/uima-website/xdocs/sandbox.xml
URL: http://svn.apache.org/viewvc/incubator/uima/site/trunk/uima-website/xdocs/sandbox.xml?rev=799561&r1=799560&r2=799561&view=diff
==============================================================================
--- incubator/uima/site/trunk/uima-website/xdocs/sandbox.xml (original)
+++ incubator/uima/site/trunk/uima-website/xdocs/sandbox.xml Fri Jul 31 09:44:25 2009
@@ -63,7 +63,7 @@
<section name="UIMA sandbox components">
- <h4>Annotators</h4>
+ <h4>Annotators and Consumers</h4>
<ul>
<li><a href="#whitespace.tokenizer">Whitespace Tokenizer Annotator</a></li>
<li><a href="#snowball.annotator">Snowball Annotator</a></li>
@@ -73,7 +73,8 @@
<li><a href="#bsf.annotator">BSF Annotator</a></li>
<li><a href="#opencalais.annotator">OpenCalais Annotator</a></li>
<li><a href="#concept.mapper.annotator">Concept Mapper Annotator</a></li>
- <li><a href="#tika.annotator">Tika Annotator</a></li>
+ <li><a href="#tika.annotator">Tika Annotator</a></li>
+ <li><a href="#lucas.consumer">Lucene CAS indexer (Lucas)</a></li>
</ul>
<h4>Servers</h4>
<ul>
@@ -236,16 +237,19 @@
<a class="external" href="http://svn.apache.org/repos/asf/incubator/uima/sandbox/trunk/BSFAnnotator">
http://svn.apache.org/repos/asf/incubator/uima/sandbox/trunk/BSFAnnotator</a>.
</p>
- </subsection>
+ </subsection>
- <subsection name='Tika Annotator' id="tika.annotator">
- <p>Apache Tika is a toolkit for detecting and extracting metadata and
-structured text content from various documents using existing parser
-libraries. The TikaAnnotator uses
-<a href="http://lucene.apache.org/tika/" target="_blank">Tika</a> to generate annotations representing
-the original markup of a document, extract its text and metadata. It
-consists of three resources:
-</p>
+ <subsection name="Tika Annotator" id="tika.annotator">
+ <p>
+ Apache Tika is a toolkit for detecting and extracting metadata and
+ structured text content from various documents using existing parser
+ libraries. The TikaAnnotator uses
+ <a href="http://lucene.apache.org/tika/" target="_blank">Tika</a>
+ to generate annotations representing
+ the original markup of a document, extract its text and metadata. It
+ consists of three resources:
+ </p>
+
<dl>
<dt>FileSystemCollectionReader</dt>
@@ -263,6 +267,22 @@
</dl>
</subsection>
+ <subsection name="Lucene CAS indexer (Lucas)" id="lucas.consumer">
+ <p>
+ The Lucene CAS indexer (Lucas) is a UIMA CAS consumer that stores CAS
+ data in a <a href="http://lucene.apache.org">Lucene</a> index. The consumer
+ transforms annotation objects of a CAS into Lucene token streams
+ which are stored in a Lucene document. Token streams can further be processed
+ by token filters. Lucas comes with a set of its own token filters and
+ integrations for some Lucene token filters. Furthermore, you can
+ deploy your own token filters. The mapping between UIMA annotations and Lucene
+ tokens and token filtering is configured by a xml mapping file. The
+ Java source of the consumer can be accessed in the
+ <a href="http://svn.apache.org/repos/asf/incubator/uima/sandbox/trunk/Lucas">
+ SVN repository</a>.
+ </p>
+ </subsection>
+
<subsection name='Simple Server (UIMA REST Service)'
id="simple-server">
<p>