Posted to commits@stanbol.apache.org by bu...@apache.org on 2012/11/23 13:44:47 UTC

svn commit: r839303 - in /websites/staging/stanbol/trunk/content: ./ docs/trunk/components/enhancer/engines/entityhublinking.html docs/trunk/components/enhancer/engines/entityhublinking.png docs/trunk/components/enhancer/engines/entitylinking.html

Author: buildbot
Date: Fri Nov 23 12:44:46 2012
New Revision: 839303

Log:
Staging update by buildbot for stanbol

Added:
    websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entityhublinking.html
    websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entityhublinking.png   (with props)
Modified:
    websites/staging/stanbol/trunk/content/   (props changed)
    websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entitylinking.html

Propchange: websites/staging/stanbol/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Nov 23 12:44:46 2012
@@ -1 +1 @@
-1412829
+1412857

Added: websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entityhublinking.html
==============================================================================
--- websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entityhublinking.html (added)
+++ websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entityhublinking.html Fri Nov 23 12:44:46 2012
@@ -0,0 +1,123 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+
+  <link href="/css/stanbol.css" rel="stylesheet" type="text/css">
+  <title>Apache Stanbol - The Entityhub Linking Engine: Linking NLP processed Text with Vocabularies managed by the Stanbol Entityhub</title>
+  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+  <link rel="icon" type="image/png" href="/images/stanbol-logo/stanbol-favicon.png"/>
+  <script type="text/javascript">
+    // Google Analytics Tracking Code
+    var _gaq = _gaq || [];
+    _gaq.push(['_setAccount', 'UA-32086816-1']);
+    _gaq.push(['_trackPageview']);
+
+    (function() {
+      var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+      ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+      var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+    })();
+  </script>  
+</head>
+
+<body>
+  <div id="logo"> <!-- do not scroll the logo -->
+  <a href="/index.html"><img alt="Apache Stanbol" width="220" height="101" border="0" src="/images/stanbol-logo/stanbol-2010-12-14.png"/></a></div>
+  <div id="navigation"> <!-- but auto scroll the menu -->
+      <h1 id="stanbol">Stanbol</h1>
+<ul>
+<li><a href="/index.html">Home</a></li>
+<li><a href="/docs/trunk/tutorial.html">Getting Started</a></li>
+<li><a href="/docs/trunk/">Documentation</a><ul>
+<li><a href="/docs/trunk/scenarios.html">Usage Scenarios</a></li>
+<li><a href="/docs/trunk/components">Components</a></li>
+</ul>
+</li>
+<li><a href="/development/">Development</a></li>
+<li><a href="/production/">Production</a></li>
+</ul>
+<h1 id="project">Project</h1>
+<ul>
+<li><a href="/docs/trunk/mailinglists.html">Mailing Lists</a></li>
+<li><a href="https://issues.apache.org/jira/browse/STANBOL">Issue Tracker</a></li>
+<li><a href="/team.html">Project Team</a></li>
+<li><a href="http://www.apache.org/licenses/LICENSE-2.0">License</a></li>
+<li><a href="/privacy-policy.html">Privacy Policy</a></li>
+</ul>
+<h1 id="downloads">Downloads</h1>
+<ul>
+<li><a href="/downloads/">Overview</a><ul>
+<li><a href="/downloads/releases.html">Releases</a></li>
+<li><a href="/downloads/launchers.html">Launchers</a></li>
+</ul>
+</li>
+</ul>
+<h1 id="archive">Archive</h1>
+<ul>
+<li><a href="/docs/0.9.0-incubating/">0.9.0-incubating</a></li>
+</ul>
+<h1 id="the-asf">The ASF</h1>
+<ul>
+<li><a href="http://www.apache.org">Apache Software Foundation</a></li>
+<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+<li><a href="http://www.apache.org/foundation/sponsorship.html">Become a Sponsor</a></li>
+<li><a href="http://www.apache.org/security/">Security</a></li>
+</ul>
+  </div>
+  <div id="content">
+    <div class="breadcrumbs">
+      <ul> <li><a href="/">Home</a></li> <li class="item"><a href="/docs/">Docs</a></li> <li class="item"><a href="/docs/trunk/">Trunk</a></li> <li class="item"><a href="/docs/trunk/components/">Components</a></li> <li class="item"><a href="/docs/trunk/components/enhancer/">Enhancer</a></li> <li class="item"><a href="/docs/trunk/components/enhancer/engines/">Engines</a></li> </ul>
+    </div>
+    <h1 class="title">The Entityhub Linking Engine: Linking NLP processed Text with Vocabularies managed by the Stanbol Entityhub</h1>
+    <p>The EntityhubLinkingEngine is the successor of the <a href="keywordlinkingengine">KeywordLinkingEngine</a>. It is based on the <a href="entitylinkingengine">EntityLinkingEngine</a>, configured with an <a href="entitylinkingengine#entitysearcher">EntitySearcher</a> that can link Entities managed by the Entityhub itself, by ReferencedSites, or by ManagedSites. The EntityhubLinkingEngine does not implement the <a href="index">EnhancementEngine</a> interface itself; it only configures an instance of the <a href="entitylinkingengine">EntityLinkingEngine</a>.</p>
+<p>For detailed documentation of the linking process, please see the documentation of the <a href="entitylinkingengine">EntityLinkingEngine</a>. This document focuses only on the configuration and usage of this Engine.</p>
+<h2 id="configuration">Configuration</h2>
+<p>The configuration of the EntityhubLinkingEngine supports the following options. First, it lets you configure the two properties common to all enhancement engines:</p>
+<ul>
+<li><strong>Name</strong> <em>(stanbol.enhancer.engine.name)</em>: The name of the Enhancement Engine. This name is used to refer to an <a href="index.html">EnhancementEngine</a> in <a href="enhancementchain.html">EnhancementChain</a>s.</li>
+<li><strong>ServiceRanking</strong> <em>(service.ranking)</em>: If multiple enhancement engines use the same name, only the one with the highest ranking is used.</li>
+</ul>
+<p>Next, it lets you configure the Entityhub Site to use:</p>
+<ul>
+<li><strong>Referenced Site</strong> <em>(enhancer.engines.linking.entityhub.siteId)</em>: The name of the ReferencedSite of the Stanbol Entityhub that holds the controlled vocabulary to be used for extracting Entities. "entityhub" or "local" can be used to extract Entities managed directly by the Entityhub.</li>
+</ul>
+<p>Finally, it supports all configuration options of the <a href="entitylinkingengine">EntityLinkingEngine</a>:</p>
+<ul>
+<li><a href="entitylinking#text-processing-configuration">Text Processing Configuration</a>: This defines which languages are enabled and how NLP processing results are used by the Engine.</li>
+<li><a href="entitylinking#entity-linker-configuration">Entity Linking Configuration</a>: This defines how entities are searched in the vocabulary and how search results are matched with the text. It also lets you configure the 'dc:type's of created 'fise:TextAnnotation's and whether entity information is included in the enhancement results.</li>
+</ul>
+<p>The following screenshot shows the configuration dialog of the EntityhubLinkingEngine as it appears in the Apache Felix Webconsole. Note that this dialog provides only a limited set of configuration options. The other supported options can only be set directly in OSGi "*.config" files.</p>
+<p><img alt="Configuration dialog for the EntityhubLinkingEngine" src="entityhublinkigconfig.png" /></p>
+  </div>
+  
+  <div id="footer">
+    <div class="copyright">
+      <p>
+        Copyright &copy; 2010 The Apache Software Foundation, Licensed under 
+        the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.
+        <br />
+        Apache, Stanbol and the Apache feather and Stanbol logos are trademarks of The Apache Software Foundation.
+      </p>
+    </div>
+  </div>
+  
+</body>
+</html>
+

Added: websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entityhublinking.png
==============================================================================
Binary file - no diff available.

Propchange: websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entityhublinking.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Modified: websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entitylinking.html
==============================================================================
--- websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entitylinking.html (original)
+++ websites/staging/stanbol/trunk/content/docs/trunk/components/enhancer/engines/entitylinking.html Fri Nov 23 12:44:46 2012
@@ -91,10 +91,10 @@
 <ul>
 <li><strong>Text Processing Configuration</strong>: This configures how the EntityLinkingEngine consumes NLP processing results. Such configurations can be language specific.</li>
 <li><strong>Entity Linking Configuration</strong>: This configures various properties that are used for the linking process with the vocabulary</li>
-<li><strong>EntitySearcher</strong>: This interface is used to search and dereference Entities. It needs to be implemented to use a datasource for linking with the EntityLinkingEngine. Stanbol provides implementations for the Stanbol Entityhub (see <a href="entityhublinkingengine">EntityhubLinkingEngine</a>)</li>
+<li><strong>EntitySearcher</strong>: This interface is used to search and dereference Entities. It needs to be implemented to use a datasource for linking with the EntityLinkingEngine. Stanbol provides implementations for the Stanbol Entityhub (see <a href="entityhublinking">EntityhubLinkingEngine</a>)</li>
 <li><strong>LabelTokenizer</strong>: While processed text is already tokenized the Entity labels are note. For the matching of Labels with the text the EntityLinkingEngine needs therefore to tokenizer those labels. Apache Stanbol provides an default implementation of this interface based on the <a href="http://opennlp.apache.org">OpenNLP</a> tokenizer API.</li>
 </ul>
-<p>The EntityLinkingEngine can not directly be used as the four things listed above need to be parsed in its constructor. It is instead intended to be configured/extended by other components. The <a href="entityhublinkingengine">EntityhubLinkingEngine</a> is one of them configuring the EntityLinkingEngine with EntitySearcher for the Stanbol Entityhub.</p>
+<p>The EntityLinkingEngine cannot be used directly, as the four things listed above need to be passed to its constructor. It is instead intended to be configured/extended by other components. The <a href="entityhublinking">EntityhubLinkingEngine</a> is one of them, configuring the EntityLinkingEngine with an EntitySearcher for the Stanbol Entityhub.</p>
 <p>This documentation first describes the implemented entity linking process than provides information about the supported configuration parameters of the <em>Text Processing Configuration</em> and the <em>Entity Linking Configuration</em>. The last part described how to extend the EntityLinking engine by implementing/providing custom <em>EntitySearcher</em> and <em>LabelTokenizer</em>.</p>
 <h2 id="linking-process">Linking Process:</h2>
 <p>The Linking Process consists of three major steps: First it consumes results of the NLP processing to determine tokens - words - that need to be linked with the configured vocabulary. Second the linking of entities based on their labels with the current section of the Text and third the writing of the enhancement results.</p>
@@ -154,7 +154,7 @@
 <p>The configuration of the EntityLinkingEngine done by parsing a <em>TextProcessingConfig</em> and an <em>EntityLinkingConfig</em> in it constructor. Both configuration classes provide an API base configuration (via getter and setter) as well as an OSGI Dictionary based configuration (via a static method that configures a new instance by an parsed configuration).</p>
 <p>The following two sections describe the "key, value" based configuration as the API based version is anyway described by the JavaDoc.</p>
 <h3 id="text-processing-configuration">Text Processing Configuration</h3>
-<h4 id="proper-noun-linking-wzxhzdk12enhancerengineslinkingpropernounsstatewzxhzdk13">Proper Noun Linking <small><em>(enhancer.engines.linking.properNounsState)</em></small></h4>
+<h4 id="proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15">Proper Noun Linking <small><em>(enhancer.engines.linking.properNounsState)</em></small></h4>
 <p>This is a high level configuration option allowing users to easily specify if they want to do EntityLinking based on any Nouns ("Noun Linking") or only ProperNouns ("Proper Noun Linking").
 Configuration wise this will pre-set the defaults for the linkable <em>LexcicalCategories</em> and <em>Pos</em> types.</p>
 <p>"Noun linking" is equivalent to the behavior of the <a href="keywordlinkingengine">KeywordLinkingEngine</a> while "Proper Noun Linking" is similar to using NER (Named Entity Recognition) with the <a href="namedentityextractionengine">NamedEntityLinking</a> engine. </p>
@@ -168,7 +168,7 @@ Configuration wise this will pre-set the
 </li>
 </ol>
 <p>If suitable it is strongly recommended to activate "Proper Noun Linking" as it highly increases the performance because in typical text only around 1/10 of the Nouns are marked as Proper Nouns and therefore the amount of vocabulary lookups also decreases by this amount.</p>
-<h4 id="language-processing-configuration-wzxhzdk14enhancerengineslinkingprocessedlanguageswzxhzdk15">Language Processing configuration <small><em>(enhancer.engines.linking.processedLanguages)</em></small></h4>
+<h4 id="language-processing-configuration-wzxhzdk16enhancerengineslinkingprocessedlanguageswzxhzdk17">Language Processing configuration <small><em>(enhancer.engines.linking.processedLanguages)</em></small></h4>
 <p>This parameter is used for two things: (1) to specify what languages are processed and (2) to provide specific configurations on how languages are processed. For the 2nd aspect there is also a default configuration that can be extended with language specific setting.</p>
 <p><strong>1. Processed Languages Configuration:</strong></p>
 <p>For the configuration of the processed languages the following syntax is used:</p>
@@ -303,17 +303,54 @@ Configuration wise this will pre-set the
 <h2 id="extension-points">Extension Points</h2>
 <p>This section describes Interfaces that are used as Extension Points by the KeywordLinkingEngine</p>
 <h3 id="entitysearcher">EntitySearcher</h3>
-<p>The EntitySearch Interface is used by the EntityLinkingEngine to search for Entities in the linked Vocabulary. This interface supports two main functionalities:</p>
-<p><strong>Dereference Entities</strong> <em>get(String id,Set<String> includeFields)::Representation</em></p>
+<p>The EntitySearcher interface is used by the EntityLinkingEngine to search for Entities in the linked Vocabulary. An EntitySearcher instance is passed to the constructor of the EntityLinkingEngine.</p>
+<p>This interface supports two main functionalities, search and dereference, and also provides some additional metadata. The following list gives a short overview of its methods.</p>
+<ul>
+<li><strong>Dereference Entities</strong> <em>get(String id,Set&lt;String&gt; includeFields)::Representation</em></li>
+</ul>
 <p>This method is called with the 'id' of an Entity and needs to return the data of the Entity as <em>Representation</em>. The returned <em>Representation</em> needs to at least include the parsed 'includeFields'. If 'includeFields' is empty or NULL than all information for the Entity should be included in the returned <em>Representation</em>.</p>
-<p><strong>Entity Search</strong> __lookup(String field, Set<String> includeFields, List<String> search, String[] languages,Integer limit)::Collection<Representation></p>
-<p>This method is used for searching entities in the controlled vocabulary. The configured <em>Label Field</em> is parsed in the 'field' parameter. The 'includedFileds' contain all fields required for the linking process. _Representation_s returned as result need to include values for those fields. The 'search' parameter includes the tokens used for the search. Values should be considered optional however Results are considered to rank Entities that match more search entires first.</p>
+<ul>
+<li><strong>Entity Search</strong> <em>lookup(String field, Set&lt;String&gt; includeFields, List&lt;String&gt; search, String[] languages,Integer limit)::Collection&lt;Representation&gt;</em></li>
+</ul>
+<p>This method is used for searching entities in the controlled vocabulary. The configured <em>Label Field</em> is passed in the 'field' parameter. The 'includeFields' parameter contains all fields required for the linking process. <em>Representation</em>s returned as results need to include values for those fields. The 'search' parameter includes the tokens used for the search. The tokens should be treated as optional; however, results are expected to rank Entities that match more search tokens first. The 'languages' array is used to pass the languages that need to be considered for the search. If 'languages' contains NULL or '', labels without a language tag also need to be included in the search (NOTE that this DOES NOT mean to include labels of any language!). Finally, the 'limit' parameter specifies the maximum number of results. If it is NULL, the implementation can choose a meaningful default.</p>
+<ul>
+<li><strong>Offline Mode</strong> <em>supportsOfflineMode()::boolean</em> : Indicates whether the EntitySearcher implementation can operate without connecting to a remote service. This is needed to deactivate the EntityLinkingEngine when Apache Stanbol is started in OfflineMode.</li>
+<li><strong>Search Result Limit</strong> <em>getLimit()::Integer</em> : The maximum number of search results supported by the EntitySearcher implementation. Can return NULL if not applicable or unknown.</li>
+<li><strong>Origin Information</strong> <em>getOriginInformation()::Map&lt;UriRef,Collection&lt;Resource&gt;&gt;</em> : Returns information about the origin that is added to every 'fise:EntityAnnotation' created by the EntityLinkingEngine. This is used, for example, by the Entityhub based implementations to provide the 'id' of the Entityhub Site from which the Entities were retrieved.</li>
+</ul>
+<p>The <a href="entityhublinking">EntityhubLinkingEngine</a> includes EntitySearcher implementations based on the FieldQuery search interface implemented by the Stanbol Entityhub.</p>
 <p>Currently the StanbolEntityhub based implementations are instantiated based on the value of the <em>'org.apache.stanbol.enhancer.engines.keywordextraction.referencedSiteId'</em>. Users that want to use a different implementation of this Interface to be used for linking will need to extend the KeywordLinkingEngine and override the #activateEntitySearcher(ComponentContext context, Dictionary<String,Object> configuration) and #deactivateEntitySearcher(). Those methods are called during activation/deactivation of the KeywordLinkingEngine and are expected to set/unset the #entitySearcher field.</p>
 <h3 id="labeltokenizer">LabelTokenizer</h3>
-<p>The LabelTokenizer interface is used to tokenize labels of Entities from the linked Vocabulary. As the matching process of the KeywordLinkingEngine is based on Tokens (words) multi-word labels (e.g. Univerity of Munich) need to be tokenized before they can be matched against the current context in the Text.</p>
-<p>LabelTokenizer are OSGI services. Their configuration optionally can define the <em>'enhancer.engines.keywordextraction.labeltokenizer.languages'</em> property. Values are considered to be language configurations. Configurations can explicitly include/exclude languages. Also a wildcard is supported (e.g. "en,de" would include English and German; "!it,!fr,<em>" would specify all languages expect Italian and French. If no configuration is provided than "</em>" (all languages) is assumed.</p>
+<p>The LabelTokenizer interface is used to tokenize labels of Entity suggestions as returned by the <a href="#entitysearcher">EntitySearcher</a>. As the matching process of the KeywordLinkingEngine is based on Tokens (words), multi-word labels (e.g. University of Munich) need to be tokenized before they can be matched against the current context in the Text.</p>
+<p>The <em>LabelTokenizer</em> interface defines only the single <em>tokenize(String label, String language)::String[]</em> method, which takes the label and the language as parameters and returns the tokens as a String array. If the tokenizer is not able to tokenize the label (e.g. because it does not support the language) it MUST return NULL. In this case the NamedEntityLinking engine will try to match the label as a single token.</p>
+<h4 id="mainlabeltokenizer">MainLabelTokenizer</h4>
+<p>As users will very likely want to use multiple LabelTokenizers for different languages, the EntityLinkingEngine comes with a MainLabelTokenizer implementation. It registers itself as a LabelTokenizer with the highest possible OSGi 'service.ranking' and tracks all other registered <em>LabelTokenizer</em>s.</p>
+<p>If custom <em>LabelTokenizer</em>s register themselves as OSGi services, the MainLabelTokenizer can forward requests to them. It does so in the order of their '<code>service.ranking</code>'s. In addition, a <em>LabelTokenizer</em> can use the '<code>enhancer.engines.keywordextraction.labeltokenizer.languages</code>' property to formally specify the languages it supports. This property uses the language configuration syntax (e.g. "en,de" would include English and German; "!it,!fr,*" would specify all languages except Italian and French). If no configuration is provided, "*" (all languages) is assumed, which is fine as a default as long as <em>LabelTokenizer</em>s correctly return NULL for languages they do not support.</p>
+<p>The MainLabelTokenizer forwards tokenize requests to all available LabelTokenizer implementations that support a specific language, sorted by their '<code>service.ranking</code>', until the first one does NOT return NULL. If no LabelTokenizer was found or all returned NULL, it will also return NULL.</p>
+<p>The following code snippet shows how to use the <em>MainLabelTokenizer</em> as the <em>LabelTokenizer</em> for the <em>EntityLinkingEngine</em>:</p>
+<div class="codehilite"><pre><span class="nd">@Reference</span>
+<span class="n">LabelTokenizer</span> <span class="n">labelTokenizer</span><span class="o">;</span>
+</pre></div>
+
+
+<p>This will inject the MainLabelTokenizer as it uses <code>Integer.MAX_VALUE</code> as <code>service.ranking</code>.</p>
+<div class="codehilite"><pre><span class="nd">@Activate</span>
+<span class="kd">protected</span> <span class="kt">void</span> <span class="nf">activate</span><span class="o">(</span><span class="n">ComponentContext</span> <span class="n">ctx</span><span class="o">){</span>
+    <span class="c1">//within the activate method it can then be used</span>
+    <span class="c1">//to initialize the NamedEntityLinkingEngine</span>
+    <span class="n">NamedEntityLinkingEngine</span> <span class="n">engine</span> <span class="o">=</span> <span class="k">new</span> <span class="n">NamedEntityLinkingEngine</span><span class="o">(</span>
+        <span class="n">engineName</span><span class="o">,</span>
+        <span class="n">entitySearcher</span><span class="o">,</span> <span class="c1">//the searcher might not be available</span>
+        <span class="n">textProcessingConfig</span><span class="o">,</span> <span class="n">linkerConfig</span><span class="o">,</span> <span class="c1">//config</span>
+        <span class="n">labelTokenizer</span><span class="o">);</span> <span class="c1">//the MainLabelTokenizer</span>
+</pre></div>
+
+
+<p>Configuring the NamedEntityLinkingEngine like this ensures that all registered <em>LabelTokenizer</em>s are considered for tokenizing.</p>
+<h4 id="opennlp-labeltokenizer">OpenNLP LabelTokenizer</h4>
+<p>This is the default implementation of a LabelTokenizer, based on the <a href="http://opennlp.apache.org">OpenNLP</a> tokenizer API. Internally it uses the OpenNLP service to load tokenizer models for languages. If no language specific model is available it uses the OpenNLP SimpleTokenizer implementation. The <em>OpenNlpLabelTokenizer</em> registers itself with a '<code>service.ranking</code>' of '-1000' so it will b</p>
+<p>The <em>LabelTokenizerManager</em> interface extends the _</p>
 <p>The KeywordLinkingEngine will - by default - always use the LabelTokenizer with the highest "service.ranking" for a given language to tokenize labels. By default it comes with an OpenNLP based Tokenizer implementation that registers itself for all languages with a "service.ranking" of "-1000".</p>
-<p>Users that want to use a different Tokenizer need to register an implementation for the given language(s) with an higher "service.ranking". Users that want to provide there own LabelTokenizer and ignore the values provided by OSGI need to extend the KeywordLinkingEngine set the #labelTokenizer field themself AND override the #bindLabelTokenizer(LabelTokenizerManager ltm) and #unbindLabelTokenizer(LabelTokenizerManager ltm) methods in a way that they do NOT change the #labelTokenizer field. </p>
   </div>
   
   <div id="footer">