Posted to commits@stanbol.apache.org by bu...@apache.org on 2015/04/16 10:28:12 UTC

svn commit: r947858 - in /websites/staging/stanbol/trunk/content: ./ docs/trunk/components/enhancer/engines/fstengine-config-linking-mode-specific-components.png docs/trunk/components/enhancer/engines/lucenefstlinking.html

Author: buildbot
Date: Thu Apr 16 08:28:12 2015
New Revision: 947858

Log:
Staging update by buildbot for stanbol

Added:
    websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/fstengine-config-linking-mode-specific-components.png   (with props)
Modified:
    websites/staging/stanbol/trunk/content/   (props changed)
    websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/lucenefstlinking.html

Propchange: websites/staging/stanbol/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Thu Apr 16 08:28:12 2015
@@ -1 +1 @@
-1640733
+1674017

Added: websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/fstengine-config-linking-mode-specific-components.png
==============================================================================
Binary file - no diff available.

Propchange: websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/fstengine-config-linking-mode-specific-components.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Modified: websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/lucenefstlinking.html
==============================================================================
--- websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/lucenefstlinking.html (original)
+++ websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/lucenefstlinking.html Thu Apr 16 08:28:12 2015
@@ -166,15 +166,33 @@ Configurations can be created by using t
 
 
 <h4 id="linking-mode">Linking Mode</h4>
-<p>The FST linking engine does support two different linking modes. Those are configures using the <strong>Linking Mode</strong> <em>(enhancer.engines.linking.lucenefst.mode)</em> property.</p>
-<p><img alt="Linking Mode Configuration" src="fstengine-config-linkingmode.png" /></p>
-<p>The two modes are</p>
+<p>The FST linking engine supports three different linking modes. These are configured using the <strong>Linking Mode</strong> <em>(enhancer.engines.linking.lucenefst.mode)</em> property. The linking mode property is no longer part of the configuration form, as there are now three separate components with a specialized configuration for each linking mode.</p>
+<p>The three modes are</p>
 <ol>
-<li><code>PLAIN</code>: This mode links the plain text with the vocabulary. Every single word of the text will get looked up with the vocabulary. This mode does not use NLP results other than language detection. This mode also ot make use of the <a href="#text-processing-configuration">Text Processing Configuration</a>. The PLAIN mode works fine with smaller and specific vocabularies that do not only contain entities but also things like product ids, activities, adjectives ...</li>
-<li><code>LINKABLE_TOKEN</code>: This mode links only linkable tokens of the parsed text. The provided <a href="#text-processing-configuration">Text Processing Configuration</a> is used to determine linkable tokens in the text (based on NLP results). This is the default mode for this engine. It is well suited for vocabularies containing named entities (such as persons, cities, products, organizations, roles, ...)
-<!-- 3. <code>NER</code>: This mode will only consider detected Named Entities for linking. This mode is similar to using the <a href="namedentitytaggingengine">Named Entity Linking Engine</a>. This is a best mode if the enhancement chain contains an NER component that can detect the types of entities contained in the linked vocabulary. --></li>
+<li><code>PLAIN</code>: This mode links the plain text with the vocabulary. Every single word of the text is looked up in the vocabulary. This mode does not use NLP results other than language detection. Because of that, this mode ignores any <a href="#text-processing-configuration">Text Processing Configuration</a>. The PLAIN mode works well with smaller, specific vocabularies that contain not only entities but also things like product ids, activities, adjectives ...</li>
+<li><code>LINKABLE_TOKEN</code>: This mode links only linkable tokens of the parsed text. The provided <a href="#text-processing-configuration">Text Processing Configuration</a> is used to determine linkable tokens in the text (based on NLP results). This is the default mode for this engine. It is well suited for vocabularies containing named entities (such as persons, cities, products, organizations, roles, ...)</li>
+<li><code>NER</code>: This mode only considers detected Named Entities for linking. This mode is similar to using the <a href="namedentitytaggingengine">Named Entity Linking Engine</a>. This is the best mode if the enhancement chain contains an NER component that can detect the types of entities contained in the linked vocabulary. It is important for this mode that Named Entity types can be mapped to types of Entities in the linked vocabulary. This makes it possible to validate matching entities based on their type. Those mappings are configured by the <strong>Named Entity Type Mappings</strong> <em>(enhancer.engines.linking.lucenefst.neTypeMapping)</em> property.</li>
 </ol>
+<p>The <em>Named Entity Type Mappings</em> use the following syntax:</p>
+<div class="codehilite"><pre><span class="p">{</span><span class="n">named</span><span class="o">-</span><span class="n">entity</span><span class="o">-</span><span class="n">type</span><span class="p">}</span> <span class="o">&gt;</span> <span class="p">{</span><span class="n">voc</span><span class="o">-</span><span class="n">type</span><span class="o">-</span>1<span class="p">}[;</span> <span class="p">{</span><span class="n">voc</span><span class="o">-</span><span class="n">type</span><span class="o">-</span>2<span class="p">};</span> <span class="p">...]</span>
+</pre></div>
+
+
+<p>meaning that Named Entities with the <code>{named-entity-type}</code> will only accept entities in the vocabulary with one of the <code>{voc-type-1}, {voc-type-2}, ...</code> types. Entities of other types that would match the mention of the Named Entity are filtered out.</p>
+<p>A typical configuration could look like the following.</p>
+<div class="codehilite"><pre><span class="n">dbp</span><span class="o">-</span><span class="n">ont</span><span class="p">:</span><span class="n">Person</span> <span class="o">&gt;</span> <span class="n">dbp</span><span class="o">-</span><span class="n">ont</span><span class="p">:</span><span class="n">Person</span><span class="p">;</span> <span class="n">schema</span><span class="p">:</span><span class="n">Person</span><span class="p">;</span> <span class="n">foaf</span><span class="p">:</span><span class="n">Person</span>
+<span class="n">dbp</span><span class="o">-</span><span class="n">ont</span><span class="p">:</span><span class="n">Organisation</span> <span class="o">&gt;</span> <span class="n">dbp</span><span class="o">-</span><span class="n">ont</span><span class="p">:</span><span class="n">Organisation</span><span class="p">;</span> <span class="n">dbp</span><span class="o">-</span><span class="n">ont</span><span class="p">:</span><span class="n">Newspaper</span><span class="p">;</span> <span class="n">schema</span><span class="p">:</span><span class="n">Organization</span>
+<span class="n">dbp</span><span class="o">-</span><span class="n">ont</span><span class="p">:</span><span class="n">Place</span> <span class="o">&gt;</span> <span class="n">dbp</span><span class="o">-</span><span class="n">ont</span><span class="p">:</span><span class="n">Place</span><span class="p">;</span> <span class="n">schema</span><span class="p">:</span><span class="n">Place</span><span class="p">;</span> <span class="n">geonames</span><span class="p">:</span><span class="n">Feature</span>
+</pre></div>
+
+
+<p><em>NOTE:</em> Full URIs can also be used.</p>
 <p>By default the FST linking engine uses the <code>LINKABLE_TOKEN</code>. In this mode this engine behaves similar as the <a href="entityhublinking">Entityhub Linking Engine</a>.</p>
+<p>As mentioned before, three OSGi components are provided for configuring FST linking engines with the different modes:</p>
+<p><img alt="Linking Mode specific Components" src="fstengine-config-linking-mode-specific-components.png" /></p>
+<p>The <strong>Apache Stanbol Enhancer Engine: FST Linking: Linkable Token</strong> <em>(org.apache.stanbol.enhancer.engines.lucenefstlinking.FstLinkingEngineComponent)</em> is the default FstLinkingEngine component. It supports all configuration parameters. When not using the user interface, it is strongly recommended to use this component for the configuration of the FST linking engine.</p>
+<p>The <strong>Apache Stanbol Enhancer Engine: FST Linking: Plain</strong> <em>(org.apache.stanbol.enhancer.engines.lucenefstlinking.PlainFstLinkingComponnet)</em> can be used to configure a <code>PLAIN</code> mode linking engine. The form excludes all <a href="#text-processing-configuration">Text Processing Configuration</a> properties, as those are not used in the <code>PLAIN</code> mode anyway.</p>
+<p>The <strong>Apache Stanbol Enhancer Engine: FST Linking: Named Entities</strong> <em>(org.apache.stanbol.enhancer.engines.lucenefstlinking.NamedEntityFstLinkingComponnet)</em> is intended for configuring an FST linking engine in the <code>NER</code> mode. It includes the <strong>Named Entity Type Mappings</strong> <em>(enhancer.engines.linking.lucenefst.neTypeMapping)</em> property in the form. This is used to configure type mappings from the Named Entity types to types in the linked vocabulary.</p>
 <h4 id="additional-entity-information">Additional Entity Information</h4>
 <p><img alt="Additional Fields config" src="fstengine-config-addfields.png" title="Fields the types and rankings of entities are read from" /></p>
 <p>In addition to the URI and the labels of Entities the EntityLinking process also uses entity type and ranking information.</p>