Posted to commits@stanbol.apache.org by bu...@apache.org on 2012/11/23 14:11:32 UTC
svn commit: r839311 - in /websites/staging/stanbol/trunk/content: ./ docs/trunk/components/enhancer/nlp/

Author: buildbot
Date: Fri Nov 23 13:11:31 2012
New Revision: 839311

Log:
Staging update by buildbot for stanbol

Added:
    websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/
    websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/analyzedtext.html
    websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/index.html
    websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/inmemoryanalyzedtextimpl.html
    websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/nlpannotations
Modified:
    websites/staging/stanbol/trunk/content/   (props changed)

Propchange: websites/staging/stanbol/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Nov 23 13:11:31 2012
@@ -1 +1 @@
-1412862
+1412870

Added: websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/analyzedtext.html
==============================================================================
--- websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/analyzedtext.html (added)
+++ websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/analyzedtext.html Fri Nov 23 13:11:31 2012
@@ -0,0 +1,249 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
+  <link href="/css/stanbol.css" rel="stylesheet" type="text/css">
+  <title>Apache Stanbol - AnalysedText</title>
+  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+  <link rel="icon" type="image/png" href="/images/stanbol-logo/stanbol-favicon.png"/>
+  <script type="text/javascript">
+    // Google Analytics Tracking Code
+    var _gaq = _gaq || [];
+    _gaq.push(['_setAccount', 'UA-32086816-1']);
+    _gaq.push(['_trackPageview']);
+
+    (function() {
+      var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+      ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+      var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+    })();
+  </script>  
+</head>
+
+<body>
+  <div id="logo"> <!-- do not scroll the logo -->
+  <a href="/index.html"><img alt="Apache Stanbol" width="220" height="101" border="0" src="/images/stanbol-logo/stanbol-2010-12-14.png"/></a></div>
+  <div id="navigation"> <!-- but auto scroll the menu -->
+      <h1 id="stanbol">Stanbol</h1>
+<ul>
+<li><a href="/index.html">Home</a></li>
+<li><a href="/docs/trunk/tutorial.html">Getting Started</a></li>
+<li><a href="/docs/trunk/">Documentation</a><ul>
+<li><a href="/docs/trunk/scenarios.html">Usage Scenarios</a></li>
+<li><a href="/docs/trunk/components">Components</a></li>
+</ul>
+</li>
+<li><a href="/development/">Development</a></li>
+<li><a href="/production/">Production</a></li>
+</ul>
+<h1 id="project">Project</h1>
+<ul>
+<li><a href="/docs/trunk/mailinglists.html">Mailing Lists</a></li>
+<li><a href="https://issues.apache.org/jira/browse/STANBOL">Issue Tracker</a></li>
+<li><a href="/team.html">Project Team</a></li>
+<li><a href="http://www.apache.org/licenses/LICENSE-2.0">License</a></li>
+<li><a href="/privacy-policy.html">Privacy Policy</a></li>
+</ul>
+<h1 id="downloads">Downloads</h1>
+<ul>
+<li><a href="/downloads/">Overview</a><ul>
+<li><a href="/downloads/releases.html">Releases</a></li>
+<li><a href="/downloads/launchers.html">Launchers</a></li>
+</ul>
+</li>
+</ul>
+<h1 id="archive">Archive</h1>
+<ul>
+<li><a href="/docs/0.9.0-incubating/">0.9.0-incubating</a></li>
+</ul>
+<h1 id="the-asf">The ASF</h1>
+<ul>
+<li><a href="http://www.apache.org">Apache Software Foundation</a></li>
+<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+<li><a href="http://www.apache.org/foundation/sponsorship.html">Become a Sponsor</a></li>
+<li><a href="http://www.apache.org/security/">Security</a></li>
+</ul>
+  </div>
+  <div id="content">
+    <h1 class="title">AnalysedText</h1>
+    <p>The AnalysedText is a Java domain model designed to describe NLP processing results. It consists of two major parts:</p>
+<ol>
+<li>Structure of the Text such as text-sections, sentences, chunks and tokens</li>
+<li>Annotations for the detected parts of the text.</li>
+</ol>
+<h2 id="analysettext-as-contentpart">AnalysedText as ContentPart</h2>
+<p>Within the Stanbol Enhancer the AnalysedText is used as a <a href="../contentitem#content-parts">ContentPart</a> registered with the URI <code>urn:stanbol.enhancer:nlp.analysedText</code>.</p>
+<p>Because of that it can be retrieved by using the following code:</p>
+<div class="codehilite"><pre><span class="n">AnalysedText</span> <span class="n">at</span><span class="o">;</span>
+<span class="n">ci</span><span class="o">.</span><span class="na">getLock</span><span class="o">().</span><span class="na">readLock</span><span class="o">().</span><span class="na">lock</span><span class="o">();</span>
+<span class="k">try</span> <span class="o">{</span>
+    <span class="n">at</span> <span class="o">=</span> <span class="n">ci</span><span class="o">.</span><span class="na">getPart</span><span class="o">(</span><span class="n">AnalysedText</span><span class="o">.</span><span class="na">ANALYSED_TEXT_URI</span><span class="o">,</span> <span class="n">AnalysedText</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
+<span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="n">NoSuchPartException</span> <span class="n">e</span><span class="o">)</span> <span class="o">{</span>
+    <span class="c1">//not present</span>
+    <span class="n">at</span> <span class="o">=</span> <span class="kc">null</span><span class="o">;</span>
+<span class="o">}</span> <span class="k">finally</span> <span class="o">{</span>
+    <span class="n">ci</span><span class="o">.</span><span class="na">getLock</span><span class="o">().</span><span class="na">readLock</span><span class="o">().</span><span class="na">unlock</span><span class="o">();</span>
+<span class="o">}</span>
+</pre></div>
+
+
+<p>Components that need to create an AnalysedText instance can do so by using the <em>AnalysedTextFactory</em></p>
+<div class="codehilite"><pre><span class="nd">@Reference</span>
+<span class="n">AnalysedTextFactory</span> <span class="n">atf</span><span class="o">;</span>
+
+<span class="n">ContentItem</span> <span class="n">ci</span><span class="o">;</span> <span class="c1">//the contentItem</span>
+<span class="n">AnalysedText</span> <span class="n">at</span><span class="o">;</span>
+<span class="n">Entry</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span><span class="n">Blob</span><span class="o">&gt;</span> <span class="n">plainTextBlob</span> <span class="o">=</span> <span class="n">ContentItemHelper</span><span class="o">.</span><span class="na">getBlob</span><span class="o">(</span>
+    <span class="n">ci</span><span class="o">,</span> <span class="n">Collections</span><span class="o">.</span><span class="na">singleton</span><span class="o">(</span><span class="s">&quot;text/plain&quot;</span><span class="o">));</span>
+<span class="k">if</span><span class="o">(</span><span class="n">plainTextBlob</span> <span class="o">!=</span> <span class="kc">null</span><span class="o">){</span>
+    <span class="c1">//creates and adds the AnalysedText ContentPart to the ContentItem</span>
+    <span class="n">ci</span><span class="o">.</span><span class="na">getLock</span><span class="o">().</span><span class="na">writeLock</span><span class="o">().</span><span class="na">lock</span><span class="o">();</span>
+    <span class="k">try</span> <span class="o">{</span>
+        <span class="n">at</span> <span class="o">=</span> <span class="n">atf</span><span class="o">.</span><span class="na">createAnalysedText</span><span class="o">(</span><span class="n">ci</span><span class="o">,</span><span class="n">plainTextBlob</span><span class="o">.</span><span class="na">getValue</span><span class="o">());</span>
+    <span class="o">}</span> <span class="k">finally</span> <span class="o">{</span>
+        <span class="n">ci</span><span class="o">.</span><span class="na">getLock</span><span class="o">().</span><span class="na">writeLock</span><span class="o">().</span><span class="na">unlock</span><span class="o">();</span>
+    <span class="o">}</span>
+<span class="o">}</span> <span class="k">else</span> <span class="o">{</span> <span class="c1">//no NLP processing possible</span>
+    <span class="n">at</span> <span class="o">=</span> <span class="kc">null</span><span class="o">;</span>
+<span class="o">}</span>
+</pre></div>
+
+
+<p>Outside of OSGi, users can also call AnalysedTextFactory#getDefaultInstance() to obtain the AnalysedTextFactory instance of the in-memory implementation.</p>
+<h2 id="structure-of-the-text">Structure of the Text</h2>
+<p>The basic building block of the AnalysedText is the Span. A Span defines a type, the character range [start,end) and the spanText. The type is given by an enumeration (<em>SpanTypeEnum</em>) with the members Text, TextSection, Sentence, Chunk and Token. [start,end) defines the character positions of the Span within the Text, where the start position is inclusive and the end position is exclusive.</p>
+<p>Analogous to the type of the Span there are also Java interfaces representing those types and providing additional convenience methods. An additional <em>Section</em> interface was introduced as common parent for all types that may have enclosed Spans. The AnalysedText is the interface representing SpanTypeEnum#Text. The main intention of those Java interfaces is to provide convenience methods that ease the use of the API.</p>
+<h3 id="uniqueness-of-spans">Uniqueness of Spans</h3>
+<p>A Span is considered equal to another Span if [start,end) and type are the same. The natural order of Spans is defined by</p>
+<ul>
+<li>smaller start index first</li>
+<li>bigger end index first</li>
+<li>higher ordinal number of the SpanTypeEnum first</li>
+</ul>
+<p>This order is used by all Iterators returned by the AnalysedText API.</p>
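+<p>The ordering rules above can be sketched as a plain Java comparator. This is only an illustration and not the Stanbol implementation: <em>SimpleSpan</em>, <em>SpanType</em> and <em>SpanOrder</em> are hypothetical stand-ins, and the order of the enum members is assumed to follow the listing in the text, so the outcome of the ordinal tie-break depends on that assumption.</p>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for SpanTypeEnum; the member order is an assumption.
enum SpanType { Text, TextSection, Sentence, Chunk, Token }

// Hypothetical minimal span: a type plus the [start,end) character range.
final class SimpleSpan {
    final SpanType type;
    final int start; // inclusive
    final int end;   // exclusive

    SimpleSpan(SpanType type, int start, int end) {
        this.type = type;
        this.start = start;
        this.end = end;
    }

    @Override
    public String toString() {
        return "[" + start + "," + end + "):" + type;
    }
}

final class SpanOrder {
    // smaller start first, then bigger end first, then higher ordinal first
    static final Comparator<SimpleSpan> NATURAL = new Comparator<SimpleSpan>() {
        @Override
        public int compare(SimpleSpan a, SimpleSpan b) {
            if (a.start != b.start) {
                return Integer.compare(a.start, b.start);
            }
            if (a.end != b.end) {
                return Integer.compare(b.end, a.end);
            }
            return Integer.compare(b.type.ordinal(), a.type.ordinal());
        }
    };

    // Returns a sorted copy and leaves the input list untouched.
    static List<SimpleSpan> sorted(List<SimpleSpan> spans) {
        List<SimpleSpan> copy = new ArrayList<SimpleSpan>(spans);
        Collections.sort(copy, NATURAL);
        return copy;
    }

    public static void main(String[] args) {
        List<SimpleSpan> spans = new ArrayList<SimpleSpan>();
        spans.add(new SimpleSpan(SpanType.Token, 5, 10));
        spans.add(new SimpleSpan(SpanType.Text, 0, 10));
        spans.add(new SimpleSpan(SpanType.Token, 0, 4));
        System.out.println(sorted(spans));
    }
}
```

+<p>Because a bigger end index sorts first, a Span that encloses others and starts at the same position is iterated before the Spans it encloses.</p>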
+<h3 id="concurrent-modifications-and-iterators">Concurrent Modifications and Iterators</h3>
+<p>Iterators returned by the AnalysedText API MUST NOT throw <em>ConcurrentModificationException</em>s but rather reflect changes to the underlying model. While this is not consistent with the default behavior of Iterators in Java, it is central to the effective usage of the AnalysedText API, e.g. when iterating over Sentences while adding Tokens.</p>
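+<p>To illustrate the difference between the two iterator behaviors, the following sketch uses only JDK collections. It is an analogy and makes no claim about how Stanbol stores its Spans: a <em>ConcurrentSkipListSet</em> has weakly consistent iterators that never throw a ConcurrentModificationException, while a <em>TreeSet</em> iterator fails fast. The class <em>LiveIteratorDemo</em> and the integer offsets are hypothetical.</p>

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentSkipListSet;

final class LiveIteratorDemo {

    // Fills the set with two "sentence" offsets.
    static NavigableSet<Integer> seed(NavigableSet<Integer> set) {
        set.add(0);
        set.add(100);
        return set;
    }

    // While iterating, adds offset + 10 for every offset divisible by 100.
    // This modifies the set that is currently being iterated.
    static NavigableSet<Integer> addWhileIterating(NavigableSet<Integer> offsets) {
        Iterator<Integer> it = offsets.iterator();
        while (it.hasNext()) {
            int offset = it.next();
            if (offset % 100 == 0) {
                offsets.add(offset + 10);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // weakly consistent iterator: the loop completes
        System.out.println(addWhileIterating(seed(new ConcurrentSkipListSet<Integer>())));
        try {
            // fail-fast iterator: the next call to it.next() after the add throws
            addWhileIterating(seed(new TreeSet<Integer>()));
        } catch (ConcurrentModificationException e) {
            System.out.println("TreeSet iterator failed fast");
        }
    }
}
```

+<p>Whether a weakly consistent iterator visits an element added during the iteration depends on where that element is placed relative to the current position, so callers should not rely on it in this JDK analogy.</p>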
+<h3 id="code-samples">Code Samples:</h3>
+<p>The following Code Snippet shows some typical usages of the API:</p>
+<div class="codehilite"><pre><span class="n">AnalysedText</span> <span class="n">at</span><span class="o">;</span> <span class="c1">//typically retrieved from the contentPart</span>
+<span class="n">Iterator</span><span class="o">&lt;</span><span class="n">Sentence</span><span class="o">&gt;</span> <span class="n">sentences</span> <span class="o">=</span> <span class="n">at</span><span class="o">.</span><span class="na">getSentences</span><span class="o">();</span>
+<span class="k">while</span><span class="o">(</span><span class="n">sentences</span><span class="o">.</span><span class="na">hasNext</span><span class="o">()){</span>
+    <span class="n">Sentence</span> <span class="n">sentence</span> <span class="o">=</span> <span class="n">sentences</span><span class="o">.</span><span class="na">next</span><span class="o">();</span>
+    <span class="n">String</span> <span class="n">sentText</span> <span class="o">=</span> <span class="n">sentence</span><span class="o">.</span><span class="na">getSpan</span><span class="o">();</span>
+    <span class="n">Iterator</span><span class="o">&lt;</span><span class="n">Token</span><span class="o">&gt;</span> <span class="n">tokens</span> <span class="o">=</span> <span class="n">sentence</span><span class="o">.</span><span class="na">getTokens</span><span class="o">();</span>
+    <span class="k">while</span><span class="o">(</span><span class="n">tokens</span><span class="o">.</span><span class="na">hasNext</span><span class="o">()){</span>
+        <span class="n">Token</span> <span class="n">token</span> <span class="o">=</span> <span class="n">tokens</span><span class="o">.</span><span class="na">next</span><span class="o">();</span>
+        <span class="n">String</span> <span class="n">tokenText</span> <span class="o">=</span> <span class="n">token</span><span class="o">.</span><span class="na">getSpan</span><span class="o">();</span>
+        <span class="n">Value</span><span class="o">&lt;</span><span class="n">PosTag</span><span class="o">&gt;</span> <span class="n">pos</span> <span class="o">=</span> <span class="n">token</span><span class="o">.</span><span class="na">getAnnotation</span><span class="o">(</span>
+            <span class="n">NlpAnnotations</span><span class="o">.</span><span class="na">POS_ANNOTATION</span><span class="o">);</span>
+        <span class="n">String</span> <span class="n">tag</span> <span class="o">=</span> <span class="n">pos</span><span class="o">.</span><span class="na">value</span><span class="o">().</span><span class="na">getTag</span><span class="o">();</span>
+        <span class="kt">double</span> <span class="n">confidence</span> <span class="o">=</span> <span class="n">pos</span><span class="o">.</span><span class="na">probability</span><span class="o">();</span>
+    <span class="o">}</span>
+<span class="o">}</span>
+</pre></div>
+
+
+<p>Code that adds new Spans looks as follows:</p>
+<div class="codehilite"><pre><span class="c1">//Tokenize a Text</span>
+<span class="n">Iterator</span><span class="o">&lt;</span><span class="n">Sentence</span><span class="o">&gt;</span> <span class="n">sentences</span> <span class="o">=</span> <span class="n">at</span><span class="o">.</span><span class="na">getSentences</span><span class="o">();</span>
+<span class="n">Iterator</span><span class="o">&lt;?</span> <span class="kd">extends</span> <span class="n">Section</span><span class="o">&gt;</span> <span class="n">sections</span><span class="o">;</span>
+<span class="k">if</span><span class="o">(</span><span class="n">sentences</span><span class="o">.</span><span class="na">hasNext</span><span class="o">()){</span> <span class="c1">//sentence Annotations present</span>
+    <span class="n">sections</span> <span class="o">=</span> <span class="n">sentences</span><span class="o">;</span>
+<span class="o">}</span> <span class="k">else</span> <span class="o">{</span> <span class="c1">//if no sentences tokenize the text at once</span>
+    <span class="n">sections</span> <span class="o">=</span> <span class="n">Collections</span><span class="o">.</span><span class="na">singleton</span><span class="o">(</span><span class="n">at</span><span class="o">).</span><span class="na">iterator</span><span class="o">();</span>
+<span class="o">}</span>
+<span class="c1">//Tokenize the sections</span>
+<span class="k">while</span><span class="o">(</span><span class="n">sections</span><span class="o">.</span><span class="na">hasNext</span><span class="o">()){</span>
+    <span class="n">Section</span> <span class="n">section</span> <span class="o">=</span> <span class="n">sections</span><span class="o">.</span><span class="na">next</span><span class="o">();</span>
+    <span class="c1">//assuming the Tokenizer returns tokens as 2dim int array</span>
+    <span class="kt">int</span><span class="o">[][]</span> <span class="n">tokenSpans</span> <span class="o">=</span> <span class="n">tokenizer</span><span class="o">.</span><span class="na">tokenize</span><span class="o">(</span><span class="n">section</span><span class="o">.</span><span class="na">getSpan</span><span class="o">());</span>
+    <span class="k">for</span><span class="o">(</span><span class="kt">int</span> <span class="n">ti</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span> <span class="n">ti</span> <span class="o">&lt;</span> <span class="n">tokenSpans</span><span class="o">.</span><span class="na">length</span><span class="o">;</span> <span class="n">ti</span><span class="o">++){</span>
+        <span class="n">Token</span> <span class="n">token</span> <span class="o">=</span> <span class="n">section</span><span class="o">.</span><span class="na">addToken</span><span class="o">(</span>
+            <span class="n">tokenSpans</span><span class="o">[</span><span class="n">ti</span><span class="o">][</span><span class="mi">0</span><span class="o">],</span><span class="n">tokenSpans</span><span class="o">[</span><span class="n">ti</span><span class="o">][</span><span class="mi">1</span><span class="o">]);</span>
+    <span class="o">}</span>
+<span class="o">}</span>
+</pre></div>
+
+
+<p>For all <code>#add**(start,end)</code> methods in the API the passed start and end indexes are relative to the parent (the one the <code>#add**(..)</code> method is called on). The [start,end) indexes returned by Spans are absolute values. If an <code>#add**(..)</code> method is called for a Span '[start,end):type' that already exists, then the already existing instance is returned instead of a new one.</p>
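+<p>The index handling described above can be sketched with a miniature model. This is not the Stanbol API: <em>MiniSection</em> and <em>addChild</em> are hypothetical names, and the sketch only knows a single span type, so the lookup key consists of the absolute [start,end) range alone.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical miniature model of a section that can enclose other spans.
final class MiniSection {
    final String text; // the complete text shared by all spans
    final int start;   // absolute, inclusive
    final int end;     // absolute, exclusive
    private final Map<String, MiniSection> children = new HashMap<String, MiniSection>();

    MiniSection(String text, int start, int end) {
        this.text = text;
        this.start = start;
        this.end = end;
    }

    // relStart and relEnd are relative to this section; the created child
    // stores absolute indexes. An already existing child is returned as-is.
    MiniSection addChild(int relStart, int relEnd) {
        int absStart = start + relStart;
        int absEnd = start + relEnd;
        if (relStart < 0 || relEnd < relStart || absEnd > end) {
            throw new IllegalArgumentException(
                "[" + relStart + "," + relEnd + ") is not within the parent");
        }
        String key = absStart + ":" + absEnd;
        MiniSection existing = children.get(key);
        if (existing != null) {
            return existing;
        }
        MiniSection child = new MiniSection(text, absStart, absEnd);
        children.put(key, child);
        return child;
    }

    // The covered text; start is inclusive and end is exclusive.
    String getSpan() {
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String text = "Hello world. Second sentence here.";
        MiniSection whole = new MiniSection(text, 0, text.length());
        MiniSection sentence = whole.addChild(13, text.length());
        MiniSection token = sentence.addChild(0, 6);
        System.out.println(token.start + ".." + token.end + " " + token.getSpan());
    }
}
```

+<p>In this sketch the token is added with the relative indexes (0,6), while the absolute indexes it reports are offset by the start of its parent sentence.</p>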
+<h2 id="annotation-support">Annotation Support</h2>
+<p>Annotation support is provided by two interfaces, <em>Annotated</em> and <em>Annotation</em>, and the <em>Value</em> class. <em>Annotated</em> provides an API for adding information to the annotated object. Those annotations are represented by key value mappings where Object is used as key and the <em>Value</em> class for values. The <em>Value</em> class provides the generically typed value as well as a double probability in the range [0..1], or -1 if not known. Finally the <em>Annotation</em> class is used to ensure type safety.</p>
+<p>The following example shows the intended usage of the API</p>
+<ol>
+<li>
+<p>One needs to define the <em>Annotations</em> one would like to use. Annotations are typically defined as public static members of interfaces or classes. The following example uses the definition of the Part of Speech annotation.</p>
+<div class="codehilite"><pre>public interface NlpAnnotations {
+    //a Part of Speech Annotation using a String key
+    //and the PosTag class as value
+    Annotation&lt;String,PosTag&gt; POS_ANNOTATION = new Annotation&lt;String,PosTag&gt;(
+        &quot;stanbol.enhancer.nlp.pos&quot;, PosTag.class);
+    ...
+}
+</pre></div>
+</li>
+<li>
+<p>Defined <em>Annotation</em>s are used to add information to an <em>Annotated</em> instance (like a Span). For adding annotations the use of <em>Annotation</em>s is required to ensure type safety. The following code snippet shows how to add a PosTag with the probability 0.95.</p>
+<div class="codehilite"><pre>PosTag tag = new PosTag(&quot;N&quot;); //a simple POS tag
+Token token; //the Token we want to add the tag to
+token.addAnnotation(POS_ANNOTATION, Value.value(tag, 0.95));
+</pre></div>
+</li>
+<li>
+<p>For consuming annotations there are two options: first by using the <em>Annotation</em> object and second by directly using the key. While the second option is less convenient (as it does not provide type safety), it allows consuming annotations without the need to have the used <em>Annotation</em> in the classpath. The following example shows both options.</p>
+<div class="codehilite"><pre>Iterator&lt;Token&gt; tokens = sentence.getTokens();
+while(tokens.hasNext()){
+    Token token = tokens.next();
+    //(1) use the POS_ANNOTATION to get the PosTag
+    Value&lt;PosTag&gt; tag = token.getAnnotation(POS_ANNOTATION);
+    if(tag != null){
+        log.info(&quot;{} has PosTag {}&quot;,token,tag.value());
+    } else {
+        log.info(&quot;{} has no PosTag&quot;,token);
+    }
+    //(2) use the key to retrieve values
+    String key = &quot;urn:test-dummy&quot;;
+    Value&lt;?&gt; value = token.getValue(key);
+    //the programmer needs to know the type!
+    if(value != null &amp;&amp; value.probability() &gt; 0.5){
+        log.info(&quot;{}={}&quot;,key,value.value());
+    }
+}
+</pre></div>
+</li>
+</ol>
+<p>The <em>Annotated</em> interface supports multi-valued annotations. For that it defines methods for adding/setting and getting multiple values. Values are sorted first by probability (unknown probability last) and secondly by insert order (first in, first out). So calling the single-value getAnnotation() method on a multi-valued field will return the first item (the one with the highest probability, or the first added in case of multiple items with the same or no probability).</p>
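+<p>The sort order of multi-valued annotations can be sketched as follows. This is a minimal illustration and not the Stanbol implementation: <em>ScoredValue</em> and <em>MultiValue</em> are hypothetical classes. It relies on two facts: an unknown probability of -1 is smaller than any value in [0..1], so sorting by descending probability puts it last, and <em>Collections.sort</em> is stable, so entries with equal probability keep their insert order.</p>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical value holder: a value plus a probability in [0..1] or -1 if unknown.
final class ScoredValue<T> {
    static final double UNKNOWN_PROBABILITY = -1d;
    final T value;
    final double probability;

    ScoredValue(T value, double probability) {
        boolean inRange = probability >= 0d && probability <= 1d;
        if (!inRange && probability != UNKNOWN_PROBABILITY) {
            throw new IllegalArgumentException("probability must be in [0..1] or -1");
        }
        this.value = value;
        this.probability = probability;
    }
}

// Hypothetical multi-valued field that keeps its values sorted.
final class MultiValue<T> {
    private final List<ScoredValue<T>> values = new ArrayList<ScoredValue<T>>();

    void add(T value, double probability) {
        values.add(new ScoredValue<T>(value, probability));
        // descending probability; the stable sort keeps insert order for ties
        Collections.sort(values, new Comparator<ScoredValue<T>>() {
            @Override
            public int compare(ScoredValue<T> a, ScoredValue<T> b) {
                return Double.compare(b.probability, a.probability);
            }
        });
    }

    // Single-value access: the first item or null if there is none.
    T first() {
        return values.isEmpty() ? null : values.get(0).value;
    }

    // All values in sorted order.
    List<T> all() {
        List<T> result = new ArrayList<T>();
        for (ScoredValue<T> v : values) {
            result.add(v.value);
        }
        return result;
    }

    public static void main(String[] args) {
        MultiValue<String> tags = new MultiValue<String>();
        tags.add("NN", 0.5);
        tags.add("VB", ScoredValue.UNKNOWN_PROBABILITY);
        tags.add("JJ", 0.9);
        System.out.println(tags.first() + " " + tags.all());
    }
}
```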
+  </div>
+  
+  <div id="footer">
+    <div class="copyright">
+      <p>
+        Copyright &copy; 2010 The Apache Software Foundation, Licensed under 
+        the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+        <br />
+        Apache, Stanbol and the Apache feather and Stanbol logos are trademarks of The Apache Software Foundation.
+      </p>
+    </div>
+  </div>
+  
+</body>
+</html>
+

Added: websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/index.html
==============================================================================
--- websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/index.html (added)
+++ websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/index.html Fri Nov 23 13:11:31 2012
@@ -0,0 +1,112 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
+  <link href="/css/stanbol.css" rel="stylesheet" type="text/css">
+  <title>Apache Stanbol - NLP processing module</title>
+  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+  <link rel="icon" type="image/png" href="/images/stanbol-logo/stanbol-favicon.png"/>
+  <script type="text/javascript">
+    // Google Analytics Tracking Code
+    var _gaq = _gaq || [];
+    _gaq.push(['_setAccount', 'UA-32086816-1']);
+    _gaq.push(['_trackPageview']);
+
+    (function() {
+      var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+      ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+      var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+    })();
+  </script>  
+</head>
+
+<body>
+  <div id="logo"> <!-- do not scroll the logo -->
+  <a href="/index.html"><img alt="Apache Stanbol" width="220" height="101" border="0" src="/images/stanbol-logo/stanbol-2010-12-14.png"/></a></div>
+  <div id="navigation"> <!-- but auto scroll the menu -->
+      <h1 id="stanbol">Stanbol</h1>
+<ul>
+<li><a href="/index.html">Home</a></li>
+<li><a href="/docs/trunk/tutorial.html">Getting Started</a></li>
+<li><a href="/docs/trunk/">Documentation</a><ul>
+<li><a href="/docs/trunk/scenarios.html">Usage Scenarios</a></li>
+<li><a href="/docs/trunk/components">Components</a></li>
+</ul>
+</li>
+<li><a href="/development/">Development</a></li>
+<li><a href="/production/">Production</a></li>
+</ul>
+<h1 id="project">Project</h1>
+<ul>
+<li><a href="/docs/trunk/mailinglists.html">Mailing Lists</a></li>
+<li><a href="https://issues.apache.org/jira/browse/STANBOL">Issue Tracker</a></li>
+<li><a href="/team.html">Project Team</a></li>
+<li><a href="http://www.apache.org/licenses/LICENSE-2.0">License</a></li>
+<li><a href="/privacy-policy.html">Privacy Policy</a></li>
+</ul>
+<h1 id="downloads">Downloads</h1>
+<ul>
+<li><a href="/downloads/">Overview</a><ul>
+<li><a href="/downloads/releases.html">Releases</a></li>
+<li><a href="/downloads/launchers.html">Launchers</a></li>
+</ul>
+</li>
+</ul>
+<h1 id="archive">Archive</h1>
+<ul>
+<li><a href="/docs/0.9.0-incubating/">0.9.0-incubating</a></li>
+</ul>
+<h1 id="the-asf">The ASF</h1>
+<ul>
+<li><a href="http://www.apache.org">Apache Software Foundation</a></li>
+<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+<li><a href="http://www.apache.org/foundation/sponsorship.html">Become a Sponsor</a></li>
+<li><a href="http://www.apache.org/security/">Security</a></li>
+</ul>
+  </div>
+  <div id="content">
+    <h1 class="title">NLP processing module</h1>
+    <p>The NLP processing module for the Stanbol Enhancer was introduced by <a href="https://issues.apache.org/jira/browse/STANBOL-733">STANBOL-733</a> and is only available to the Stanbol Enhancer starting from version <code>0.10.0</code>.</p>
+<p>Its intention is to efficiently handle word-level NLP processing annotations, as this kind of annotation would create too many RDF triples to handle them in the <a href="../contentitem#metadata-of-the-contentitem">metadata of the ContentItem</a>.</p>
+<p>The module consists of the following parts:</p>
+<ul>
+<li><strong><a href="analyzedtext">AnalyzedText</a></strong>: A data structure that represents a text as <em>Span</em>s like <em>Token</em>s, <em>Chunk</em>s, <em>Sentence</em>s, <em>TextSection</em>s and the <em>AnalyzedText</em> itself, which selects the text as a whole. In addition all spans can be annotated with additional information by using the <em>Annotated</em> interface.</li>
+<li><strong><a href="nlpannotations">NLP Annotations</a></strong>: The Stanbol NLP processing module defines ontology-aligned annotation models for typical NLP processing results such as Part of Speech tagging, Phrase detection, Named Entity Recognition and full Morphological Analysis. These annotation models can then be stored with the different <em>Span</em>s defined in the <em>AnalyzedText</em>.</li>
+</ul>
+<p>In addition the NLP processing module provides a default <a href="inmemoryanalyzedtextimpl">in-memory</a> implementation of all defined interfaces that is sufficient for all current Stanbol use cases.</p>
+  </div>
+  
+  <div id="footer">
+    <div class="copyright">
+      <p>
+        Copyright &copy; 2010 The Apache Software Foundation, Licensed under 
+        the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+        <br />
+        Apache, Stanbol and the Apache feather and Stanbol logos are trademarks of The Apache Software Foundation.
+      </p>
+    </div>
+  </div>
+  
+</body>
+</html>
+

Added: websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/inmemoryanalyzedtextimpl.html
==============================================================================
--- websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/inmemoryanalyzedtextimpl.html (added)
+++ websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/inmemoryanalyzedtextimpl.html Fri Nov 23 13:11:31 2012
@@ -0,0 +1,114 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
+  <link href="/css/stanbol.css" rel="stylesheet" type="text/css">
+  <title>Apache Stanbol - In-Memory AnalyzedText and Annotation implementation</title>
+  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+  <link rel="icon" type="image/png" href="/images/stanbol-logo/stanbol-favicon.png"/>
+  <script type="text/javascript">
+    // Google Analytics Tracking Code
+    var _gaq = _gaq || [];
+    _gaq.push(['_setAccount', 'UA-32086816-1']);
+    _gaq.push(['_trackPageview']);
+
+    (function() {
+      var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+      ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+      var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+    })();
+  </script>  
+</head>
+
+<body>
+  <div id="logo"> <!-- do not scroll the logo -->
+  <a href="/index.html"><img alt="Apache Stanbol" width="220" height="101" border="0" src="/images/stanbol-logo/stanbol-2010-12-14.png"/></a></div>
+  <div id="navigation"> <!-- but auto scroll the menu -->
+      <h1 id="stanbol">Stanbol</h1>
+<ul>
+<li><a href="/index.html">Home</a></li>
+<li><a href="/docs/trunk/tutorial.html">Getting Started</a></li>
+<li><a href="/docs/trunk/">Documentation</a><ul>
+<li><a href="/docs/trunk/scenarios.html">Usage Scenarios</a></li>
+<li><a href="/docs/trunk/components">Components</a></li>
+</ul>
+</li>
+<li><a href="/development/">Development</a></li>
+<li><a href="/production/">Production</a></li>
+</ul>
+<h1 id="project">Project</h1>
+<ul>
+<li><a href="/docs/trunk/mailinglists.html">Mailing Lists</a></li>
+<li><a href="https://issues.apache.org/jira/browse/STANBOL">Issue Tracker</a></li>
+<li><a href="/team.html">Project Team</a></li>
+<li><a href="http://www.apache.org/licenses/LICENSE-2.0">License</a></li>
+<li><a href="/privacy-policy.html">Privacy Policy</a></li>
+</ul>
+<h1 id="downloads">Downloads</h1>
+<ul>
+<li><a href="/downloads/">Overview</a><ul>
+<li><a href="/downloads/releases.html">Releases</a></li>
+<li><a href="/downloads/launchers.html">Launchers</a></li>
+</ul>
+</li>
+</ul>
+<h1 id="archive">Archive</h1>
+<ul>
+<li><a href="/docs/0.9.0-incubating/">0.9.0-incubating</a></li>
+</ul>
+<h1 id="the-asf">The ASF</h1>
+<ul>
+<li><a href="http://www.apache.org">Apache Software Foundation</a></li>
+<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+<li><a href="http://www.apache.org/foundation/sponsorship.html">Become a Sponsor</a></li>
+<li><a href="http://www.apache.org/security/">Security</a></li>
+</ul>
+  </div>
+  <div id="content">
+    <div class="breadcrumbs">
+      <ul> <li><a href="/">Home</a></li> <li class="item"><a href="/docs/">Docs</a></li> <li class="item"><a href="/docs/trunk/">Trunk</a></li> <li class="item"><a href="/docs/trunk/components/">Components</a></li> <li class="item"><a href="/docs/trunk/components/enhancer/">Enhancer</a></li> <li class="item"><a href="/docs/trunk/components/enhancer/nlp/">Nlp</a></li> </ul>
+    </div>
+    <h1 class="title">In-Memory AnalyzedText and Annotation implementation</h1>
+    <p>This page describes the in-memory implementation of the <a href="analyzedtext">Analyzed Text</a> interface, which the Stanbol NLP processing module uses by default. The implementation is contained directly in the org.apache.stanbol.enhancer.nlp module.</p>
+<h2 id="analyzedtextfactory">AnalyzedTextFactory</h2>
+<p>The AnalyzedTextFactory of the in-memory implementation registers itself as an OSGi service with a "service.ranking" of Integer.MIN_VALUE. This means that any other registered AnalyzedTextFactory will override this one (unless it also uses Integer.MIN_VALUE itself).</p>
+<p>The implementation uses the ContentItemHelper#getText(Blob blob) method to retrieve the text from the passed blob. The text is then used to create an AnalyzedText instance.</p>
+<h2 id="analyzedtext-implementation">AnalyzedText Implementation</h2>
+<p>The in-memory implementation is based on a NavigableMap that uses the same span as both key and value. TreeMap is currently used as the implementation. The compareTo(..) method of the Span implementation ensures the correct ordering of Spans as specified by the <a href="analyzedtext">Analyzed Text</a> interface. All add*(..) methods first check whether a span with the same type and [start,end) range is already contained. If so, the existing span is returned; otherwise a new instance is created.</p>
+<p>The Iterator implementation is not based on the Iterators provided by the NavigableMap, as those would throw ConcurrentModificationExceptions, which is prohibited by the specification. Instead, an implementation based on the #higherKey() method is used. Filtered Iterators are implemented using the Apache Commons Collections FilterIterator utility with a Predicate based on the SpanTypeEnum.</p>
+<h2 id="annotation-implementation">Annotation Implementation</h2>
+<p>The implementation of the <em>Annotated</em> interface is similar to that of the SolrInputDocument. Internally it uses a Map&lt;Object,Object&gt; to store data. When a single value is added, it is stored directly in the map. Multiple values are stored in arrays. Arrays are sorted by a comparator that ensures that the value with the highest probability is at index '0'.</p>
+<p>Type safety is not checked, so creating multiple Annotations with different value types that share the same key will cause ClassCastExceptions at runtime.</p>
+  </div>
+  
+  <div id="footer">
+    <div class="copyright">
+      <p>
+        Copyright &copy; 2010 The Apache Software Foundation, Licensed under 
+        the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+        <br />
+        Apache, Stanbol and the Apache feather and Stanbol logos are trademarks of The Apache Software Foundation.
+      </p>
+    </div>
+  </div>
+  
+</body>
+</html>
+

Added: websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/nlpannotations
==============================================================================
--- websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/nlpannotations (added)
+++ websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/nlp/nlpannotations Fri Nov 23 13:11:31 2012
@@ -0,0 +1,222 @@
+title: NLP Annotations
+
+While the [Analyzed Text](analyzedtext) interface allows defining Sentences, Chunks and Tokens within the text and attaching annotations to them, this part of the Stanbol NLP processing module defines the Java domain model used for those annotations. This includes annotation models for Part of Speech (POS) tags, Chunks, recognized Named Entities (NER) as well as morphological analysis.
+
+### Part of Speech (POS) annotations
+
+Part of Speech (POS) tagging is a token-level annotation. It assigns categories such as noun, verb, adjective or punctuation to tokens. These annotations are typically provided by a POS tagger that consumes Tokens and provides tag(s) with confidence(s) as output. Tags are usually string values that are members of a TagSet, a fixed list of tags used to annotate tokens. Tag sets are typically language specific and often even specific to the training corpus. This makes it hard to consume POS tags created by different POS taggers for different languages, as the consumer would need to know the meaning of every POS tag in every language.
+
+The POS annotation model defined by the Stanbol NLP module tries to solve this issue by providing means to align POS tag sets with formal categories defined by the [OLiA Ontology](http://nlp2rdf.lod2.eu/olia/). The following sub-section will provide details and usage examples.
+
+#### OLiA MorphosyntacticCategories
+
+The '[OLiA](http://nlp2rdf.lod2.eu/olia/) Reference Model for Morphology and Morphosyntax, with experimental extension to Syntax' defines a set of ~150 formally defined and multi-lingual POS tags. Those types are defined as a non-cyclic multi-hierarchy with 'olia:MorphosyntacticCategory' as the common root.
+
+To give an example, the POS 'olia:Gerund' is defined as an 'olia:NonFiniteVerb', which itself is an 'olia:Verb'. An example of the multi-hierarchy is 'olia:NominalQuantifier', which is both an 'olia:Noun' and an 'olia:Quantifier'.
+
+To integrate the formal definitions of the OLiA ontology with the Stanbol NLP annotations, there are two Java enumerations:
+
+* __LexicalCategory__: This enumeration covers the 12 top level categories as defined by OLiA: Noun, Verb, Adjective, Adposition, Adverb, Conjuction, Interjection, PronounOrDeterminer, Punctuation, Quantifier, Residual and Unique.
+* __Pos__: This enumeration covers all OLiA MorphosyntacticCategories from the second level downwards. Using the _Pos_ enum one can, for example, distinguish between ProperNouns and CommonNouns or between FiniteVerbs and NonFiniteVerbs. The _Pos_ enumeration fully supports the multi-hierarchy as defined by OLiA. The Pos#categories() method returns the first-level parents of a _Pos_ member. The Pos#hierarchy() method returns all parents of a _Pos_ member from the second level downwards.
+
+#### PosTag and TagSet
+
+The PosTag represents a POS tag as used by a POS tagger. PosTags support the following features:
+
+* __tag__ [1..1]::String - This is the string tag as used by the POS tagger.
+* __category__ [0..*]::LexicalCategory - The assigned LexicalCategory enumeration members.
+* __pos__ [0..*]::Pos - The assigned Pos enumeration members.
+
+An example of a PosTag representing an 'olia:ProperNoun' looks as follows:
+
+    :::java
+    PosTag tag = new PosTag("NP", Pos.ProperNoun);
+
+The first parameter is the String POS tag used by the POS tagger and the second parameter represents the mapping to the OLiA MorphosyntacticCategories for this tag. The next example shows a more sophisticated mapping for the "PWAV" tag (adverbial interrogative or relative pronoun) as used by the STTS tag set for the German language:
+
+    :::java
+    new PosTag("PWAV", LexicalCategory.Adverb, Pos.RelativePronoun, Pos.InterrogativePronoun);
+
+_TagSet_ is the other important class, as it manages the set of PosTag instances. _TagSet_ has two main functions. First, it allows an integrator of a POS tagger with Stanbol to define the mappings from the string POS tags used by the POS tagger to the LexicalCategory and Pos enumeration members that are preferably used by the Stanbol NLP chain. Second, it ensures that only a single PosTag instance is used to annotate all Tokens with the same type.
+
+_TagSet_s are typically specified as static members of utility classes. The following code snippet shows an example:
+
+    :::java
+    //Tagset is generically typed. We need a TagSet for PosTag's
+    public static final TagSet<PosTag> STTS = new TagSet<PosTag>(
+        "STTS", "de"); //define a name and the languages it supports
+
+    static {
+        //you can set properties to a TagSet. While supported this
+        //feature is currently not used by Stanbol
+        STTS.getProperties().put("olia.annotationModel",
+            new UriRef("http://purl.org/olia/stts.owl"));
+        STTS.getProperties().put("olia.linkingModel",
+            new UriRef("http://purl.org/olia/stts-link.rdf"));
+        STTS.addTag(new PosTag("ADJA", Pos.AttributiveAdjective));
+        STTS.addTag(new PosTag("ADJD", Pos.PredicativeAdjective));
+        STTS.addTag(new PosTag("ADV", LexicalCategory.Adverb));
+        //[...]
+    }
+
+The string tag (first parameter) of the _PosTag_ is used as the unique key by the _TagSet_. Adding a second _PosTag_ with the same tag will override the first one. _PosTag_s that are added to a _TagSet_ have the _Tag#getAnnotationModel()_ property set to that _TagSet_.
+
+The final example is a code snippet showing the core part of a POS tagging engine that uses both the [AnalyzedText](analyzedtext) and the _PosTag_ and _TagSet_ APIs.
+
+    :::java
+    TagSet<PosTag> tagSet; //the used TagSet
+    //holds PosTags for tags returned by the POS tagger that
+    //are missing in the TagSet
+    Map<String,PosTag> adhocTags = new HashMap<String,PosTag>();
+    List<Span> tokens = new ArrayList<Span>(64);
+
+    Iterator<Section> sentences; //Iterator over the sentences
+
+    while(sentences.hasNext()){
+        Section sentence = sentences.next();
+        //get the tokens of the current sentence
+        tokens.clear();
+        AnalysedTextUtils.appandToList(
+            sentence.getEnclosed(SpanTypeEnum.Token),
+            tokens);
+        //typically one also needs the Strings
+        //of the tokens as input for the POS tagger
+        String[] tokenText = new String[tokens.size()];
+        for(int i=0;i<tokens.size();i++){
+            tokenText[i] = tokens.get(i).getSpan();
+        }
+
+        //now POS tag the sentence
+        String[] posTags = posTagger.tag(tokenText);
+
+        //finally apply the PosTags and save the annotation
+        for(int i=0;i<tokens.size();i++){
+            PosTag tag = tagSet.get(posTags[i]);
+            if(tag == null) { //unmapped tag
+                tag = adhocTags.get(posTags[i]);
+            }
+            if(tag == null) { //unknown tag
+                tag = new PosTag(posTags[i]);
+                adhocTags.put(posTags[i],tag);
+            }
+            //add the annotation to the i-th Token
+            tokens.get(i).addAnnotation(
+                NlpAnnotations.POS_ANNOTATION,
+                Value.value(tag));
+        }
+    }
+
+### Phrase annotations
+
+Phrase annotations can be used to define the type of a _Chunk_. The _PhraseTag_ class is used for phrase annotations. It defines a string tag and the phrase category. The _LexicalCategory_ enumeration is used as the value for the category. As _PhraseTag_ is a subclass of _Tag_, it can also be used in combination with the _TagSet_ class as described in the [PosTag and TagSet] section.
+
+The following code snippet shows how to create a PhraseTag for noun phrases:
+
+    :::java
+    PhraseTag tag = new PhraseTag("NP", LexicalCategory.Noun);
+
+  
+
+### Named Entity (NER) annotations
+
+Named Entity annotations are created by NER modules. Before the introduction of the Stanbol NLP chain they were represented in Stanbol by using '[fise:TextAnnotation](../enhancementstructure#fisetextannotation)'s, and any Enhancement Engine that does NER should still support this. With the Stanbol NLP processing module it is now also possible to represent detected Named Entities as a _Chunk_ with a _NerTag_ added as Annotation.
+
+A Named Entity represented as 'fise:TextAnnotation' includes the following information:
+
+    urn:namedEntity:1
+        rdf:type fise:TextAnnotation, fise:Enhancement
+        fise:selected-text {named-entity-text}
+        fise:start {start-char-pos}
+        fise:end {end-char-pos}
+        dc:type {named-entity-type}
+
+where:
+
+* {named-entity-text} is the text recognized as Named Entity. This is the same as returned by _Chunk#getSpan()_
+* {start-char-pos} is the start character position of the Named Entity relative to the start of the text. This is the same as _Chunk#getStart()_
+* {end-char-pos} is the end position and the same as _Chunk#getEnd()_
+* {named-entity-type} is the type of the recognized Named Entity as URI. The _NerTag_ allows defining both the string tag as used by the NER component and the URI this type is mapped to. In Stanbol it is preferred to use 'dbpedia:Person', 'dbpedia:Organisation' and 'dbpedia:Place' for the corresponding entity types.
+
+The _NerTag_ class extends _Tag_ and can therefore also be used with the _TagSet_ class. This means that users of the API can use _TagSet_ to manage the string tag to URI mappings for the supported Named Entity types.
+
+The following code snippet shows how to add NER annotations to the AnalysedText:
+
+    :::java
+    AnalysedText at; //The AnalysedText
+    TagSet<NerTag> nerTags; //registered NER tags
+    Iterator<Section> sections; //sections to iterate over
+
+    List<String> tokenTexts = new ArrayList<String>(64);
+
+    while(sections.hasNext()){
+        Section section = sections.next();
+        //NER taggers typically need a String[] as input
+        tokenTexts.clear();
+        Iterator<Token> tokens = section.getTokens();
+        while(tokens.hasNext()){
+            tokenTexts.add(tokens.next().getSpan());
+        }
+        //Span (result type of the NER tagger) -> #start #end #type #probability
+        Span[] nerSpans = nerTagger.tag(
+            tokenTexts.toArray(new String[tokenTexts.size()]));
+        for(int i=0; i < nerSpans.length; i++){
+            Chunk namedEntity = at.addChunk(
+                nerSpans[i].start,nerSpans[i].end);
+            NerTag tag = nerTags.get(nerSpans[i].type);
+            if(tag == null){ //unmapped NER
+                tag = new NerTag(nerSpans[i].type);
+            }
+            namedEntity.addAnnotation(
+                NlpAnnotations.NER_ANNOTATION,
+                Value.value(tag, nerSpans[i].probability));
+        }
+    }
+   
+Note that the above code snippet only shows how to add the Named Entity to the AnalyzedText ContentPart. An actual NER engine implementation also needs to add this information to the metadata of the [ContentItem](../contentitem):
+
+    :::java
+    ContentItem ci; //The processed ContentItem
+    Language language; //The Language of the processed Text
+    MGraph metadata = ci.getMetadata();
+    LiteralFactory literalFactory = LiteralFactory.getInstance();
+    Section section; //the current Section
+    Chunk namedEntity; //the currently processed Named Entity
+
+    Value<NerTag> nerAnnotation = namedEntity.getAnnotation(
+        NlpAnnotations.NER_ANNOTATION);
+
+    UriRef textAnnotation = EnhancementEngineHelper.createTextEnhancement(ci, this);
+    metadata.add(new TripleImpl(textAnnotation, ENHANCER_SELECTED_TEXT,
+        new PlainLiteralImpl(namedEntity.getSpan(), language)));
+    metadata.add(new TripleImpl(textAnnotation, ENHANCER_SELECTION_CONTEXT,
+        new PlainLiteralImpl(section.getSpan(), language)));
+    if(nerAnnotation.value().getType() != null){
+        metadata.add(new TripleImpl(textAnnotation, DC_TYPE,
+            nerAnnotation.value().getType()));
+    } //else do not add a dc:type for unmapped NamedEntities
+    metadata.add(new TripleImpl(textAnnotation, ENHANCER_CONFIDENCE,
+        literalFactory.createTypedLiteral(nerAnnotation.probability())));
+    metadata.add(new TripleImpl(textAnnotation, ENHANCER_START,
+        literalFactory.createTypedLiteral(namedEntity.getStart())));
+    metadata.add(new TripleImpl(textAnnotation, ENHANCER_END,
+        literalFactory.createTypedLiteral(namedEntity.getEnd())));
+
+### Morphological Analyses
+
+
+__NOTE:__ _This part of the Stanbol NLP annotations is still a work in progress, so this part of the API might undergo heavy changes even in minor releases._
+
+
+The results of a morphological analysis are represented by the _MorphoFeatures_ class and can be added to the analyzed word (_Token_) by using the _NlpAnnotations.MORPHO_ANNOTATION_. The _MorphoFeatures_ class provides the following features:
+
+* __Lemma__: A String value representing the lemmatization of the annotated Token.
+* __Case__: The _Case_ enumeration contains around 70 members defined based on concepts of the [OLiA Ontology](http://nlp2rdf.lod2.eu/olia/). The _CaseTag_ allows defining cases and optionally mapping them to the cases defined by the enumeration.
+* __Definitness__: The _Definitness_ enumeration has the members Definite and Indefinite, which are also defined by concepts in the [OLiA Ontology](http://nlp2rdf.lod2.eu/olia/).
+* __Gender__: The _Gender_ enumeration contains the six genders defined by the [OLiA Ontology](http://nlp2rdf.lod2.eu/olia/). The _GenderTag_ allows defining genders and optionally mapping them to the genders defined by the enumeration.
+* __Number__: The _NumberFeature_ enumeration defines the eight number features defined by [OLiA](http://nlp2rdf.lod2.eu/olia/). The _NumberTag_ can be used to define number features and map them to the members of the enumeration.
+* __Person__: The _Person_ enumeration has the definitions for 'first', 'second' and 'third' with mappings to the corresponding concepts of the [OLiA Ontology](http://nlp2rdf.lod2.eu/olia/).
+* __Tense__: The _Tense_ enumeration represents the tense hierarchy as defined by the [OLiA Ontology](http://nlp2rdf.lod2.eu/olia/). The _Tense#getParent()_ method gives access to the direct parent of a _Tense_, while the _Tense#getTenses()_ method can be used to obtain the transitive closure (including the _Tense_ object itself). _TenseTag_ is used for Tense annotations. It allows both passing a string tag representing the tense and defining a mapping to the tenses defined by the _Tense_ enumeration.
+* __Mood__: The _VerbMood_ enumeration currently defines members from different parts of the [OLiA Ontology](http://nlp2rdf.lod2.eu/olia/). OLiA does define the 'olia:MoodFeature' class, but its members were not a good match for the verb moods used by the CELI/linguagrid.org service. For now the decision was to define the _VerbMood_ enumeration more closely to the usage of CELI, but this clearly needs to be validated as soon as implementations for other NLP frameworks are added. There is also a _VerbMoodTag_ that allows defining verb moods by a string tag and a mapping to the _VerbMood_ enumeration.
+
+ 
+The _MorphoFeatures_ class supports multi-valued annotations for all of the above features. Getters for a single value always return the value that was added first.
\ No newline at end of file