Posted to commits@stanbol.apache.org by bu...@apache.org on 2012/02/10 17:54:20 UTC

svn commit: r804438 - /websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/index.html

Author: buildbot
Date: Fri Feb 10 16:54:20 2012
New Revision: 804438

Log:
Staging update by buildbot for stanbol

Modified:
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/index.html

Modified: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/index.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/index.html (original)
+++ websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/index.html Fri Feb 10 16:54:20 2012
@@ -57,18 +57,18 @@
   
   <div id="content">
     <h1 class="title">Enhancement Engines</h1>
-    <p>Enhancement engines are the components that are responsible to enhance ContentItmes. They are called by the <a href="../enhancementjobmanager.html">EnhancementJobManager</a>. Enhancement engines do have full access to the parsed <a href="../contentitem.html">ContentItem</a>s. They are expected to modify the state of the content item.</p>
-<p>The RESTful interface of an EnhancementEngines can be accessed by</p>
+    <p>Enhancement engines are the components responsible for enhancing <a href="../contentitem.html">content items</a>. They are called by the <a href="../enhancementjobmanager.html">Enhancement Job Manager</a>. Enhancement engines have full access to the passed content items and are expected to modify their state.</p>
+<p>The RESTful interface of an enhancement engine can be accessed by</p>
 <div class="codehilite"><pre><span class="n">http:</span><span class="sr">//</span><span class="p">{</span><span class="n">host</span><span class="p">}:{</span><span class="n">port</span><span class="p">}</span><span class="sr">/{stanbol-root}/</span><span class="n">enhancer</span><span class="sr">/engine/</span><span class="p">{</span><span class="n">engine</span><span class="o">-</span><span class="n">name</span><span class="p">}</span>
 </pre></div>
 
 
-<p>e.g. an EnhancementEngine with the name "ner" running at a Apache Stanbol instance on local host with the default configuration will be accessible at</p>
+<p>For example, an enhancement engine with the name "ner" running on an Apache Stanbol instance on localhost with the default configuration will be accessible at</p>
 <div class="codehilite"><pre><span class="n">http:</span><span class="sr">//</span><span class="n">localhost:8080</span><span class="sr">/enhancer/</span><span class="n">engine</span><span class="o">/</span><span class="n">ner</span>
 </pre></div>
 
 
-<p>When using the Java API enhancement engines can be liked up as OSGI services. The <a href="enhancementenginemanager.html">EnhanceEngineManager</a> service is designed to ease this by providing a API that allows to access enhancement engine by their name.</p>
+<p>When using the Java API, enhancement engines can be looked up as OSGi services. The <a href="enhancementenginemanager.html">Enhancement Engine Manager</a> service is designed to ease this by providing an API for accessing enhancement engines by their name.</p>
 <h2 id="enhancement_engine_interface">Enhancement Engine Interface</h2>
 <p>The interface for enhancement engines contains the following three methods:</p>
 <div class="codehilite"><pre><span class="sr">/** Getter for the value of the &quot;stanbol.enhancer.engine.name&quot; property */</span>
@@ -89,35 +89,34 @@
 </pre></div>
 
 
-<p>Each enhancement engine has an name assigned. This is typically provided by the engine configuration and MUST be set as value to the property "stanbol.enhancer.engine.name" in the service registration of the enhancement engine. The getter for the name MUST return the same value as the value set to this property. Enhancement engine implementations will usually get the name by calling</p>
+<p>Each enhancement engine has a name. This is typically provided by the engine configuration and MUST be set as value to the property "stanbol.enhancer.engine.name" in the service registration of the enhancement engine. The getter for the name MUST return the same value as the value set to this property. Enhancement engine implementations will usually get the name by calling:</p>
 <p>this.name = (String) componentContext.getProperties().get(EnhancementEngine.PROPERTY_NAME);</p>
-<p>in the activate method.</p>
-<p>The "canEnahnce(ContentItem ci)" method is used by the <a href="../enhancementjobmanager.html">EnhancementJobManager</a> to check if an engine is able to process a <a href="../contentitem.html">ContentItem</a>. Calling this method MUST NOT change the state of the ContentItem and this method MUST also NOT acquire a write lock on the content item.</p>
-<p>The "computeEnhacements(ContentItem ci)" starts the processing of the parsed ContentItem by the engine. It is expected to change the state of the parsed ContentItem. Engines that support asynchronous processing need to take care to correctly apply read/write locks when reading/writing information from/to the content time. Engines that return ENHANCE_SYNCHRONOUS on calls to canEnhance(..) do not need to use locks. They can trust that they have exclusive read/write access to the content item.</p>
-<p>EnhancementEngiens do have full access to the ContentItem. Theoretically they would be even allowed to delete all metadata as well as all content parts from the parsed ContentItem. However typically the do only</p>
+<p>The <code>canEnhance(ContentItem ci)</code> method is used by the <a href="../enhancementjobmanager.html">Enhancement Job Manager</a> to check if an engine is able to process a <a href="../contentitem.html">Content Item</a>. Calling this method MUST NOT change the state of the content item, and the method MUST NOT acquire a write lock on the content item.</p>
+<p>The <code>computeEnhancements(ContentItem ci)</code> method starts the processing of the passed content item by the engine. It is expected to change the state of the passed content item. Engines that support asynchronous processing need to take care to correctly apply read/write locks when reading/writing information from/to the content item. Engines that return <code>ENHANCE_SYNCHRONOUS</code> on calls to <code>canEnhance(..)</code> do not need to use locks. They can trust that they have exclusive read/write access to the content item.</p>
+<p>Enhancement engines have full access to the content item. Theoretically, they would even be allowed to delete all metadata as well as all content parts from the passed content item. However, they typically only</p>
 <ul>
-<li>read existing ContentParts</li>
-<li>add new ContentParts</li>
-<li>add new Enhancements to the metadata</li>
+<li>read existing content parts</li>
+<li>add new content parts</li>
+<li>add new enhancements to the metadata</li>
 <li>some engines might also need to update/delete existing metadata.</li>
 </ul>
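 <p>The locking rules described above can be sketched with the standard <code>java.util.concurrent.locks.ReadWriteLock</code>. This is only an illustration under assumptions: <code>SketchContentItem</code>, <code>AsyncEngineSketch</code>, the <code>"sketch:done"</code> marker and the boolean return value of <code>canEnhance</code> are hypothetical simplifications and not the Apache Stanbol API.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a content item; NOT the Stanbol ContentItem. */
class SketchContentItem {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> metadata = new ArrayList<>();

    ReadWriteLock getLock() { return lock; }
    List<String> getMetadata() { return metadata; }
}

/** Hypothetical engine that processes content items asynchronously. */
class AsyncEngineSketch {

    /** Read-only check: takes only the read lock and never modifies the item. */
    static boolean canEnhance(SketchContentItem ci) {
        ci.getLock().readLock().lock();
        try {
            return !ci.getMetadata().contains("sketch:done");
        } finally {
            ci.getLock().readLock().unlock();
        }
    }

    /** Changes the state of the item, so the write lock is held while writing. */
    static void computeEnhancements(SketchContentItem ci) {
        ci.getLock().writeLock().lock();
        try {
            ci.getMetadata().add("sketch:done");
        } finally {
            ci.getLock().writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        SketchContentItem ci = new SketchContentItem();
        if (canEnhance(ci)) {
            computeEnhancements(ci);
        }
        System.out.println(ci.getMetadata());
    }
}
```

 <p>An engine that only ever runs synchronously could skip both locks, since it can rely on exclusive access to the content item.</p>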
-<p>Both the "canEnhance(..)" and "computeEnhancements(..)" methods MUST be called by the <a href="../enhancementjobmanager.html">EnhancementJobManager</a> after all the executions of all enhancement engines this one depends on are completed. This dependencies are defined by the <a href="../chains/executionplan.html">ExecutionPlan</a> used by the EnhancementJobManager to enhance the ContentItem. Implementors of enhancement engines can therefore trust that all metadata expected to be added by other enhancement engines are already present within the metadata of the parsed ContentItems when "canEnhance(..)" or "computeEnhancements(..)" is called.</p>
-<h3 id="servicesproperties_interface">ServicesProperties Interface</h3>
+<p>Both the <code>canEnhance(..)</code> and <code>computeEnhancements(..)</code> methods MUST be called by the <a href="../enhancementjobmanager.html">Enhancement Job Manager</a> only after the execution of all enhancement engines this one depends on has completed. These dependencies are defined by the <a href="../chains/executionplan.html">Execution Plan</a> used by the enhancement job manager to enhance the content item. Implementors of enhancement engines can therefore trust that all metadata expected to be added by other enhancement engines is already present within the metadata of the passed content item when <code>canEnhance(..)</code> or <code>computeEnhancements(..)</code> is called.</p>
+<h3 id="services_properties_interface">Services Properties Interface</h3>
 <p>This interface is implemented by most of the current enhancement engines. It allows engines to expose additional properties to other components. This interface defines a single method</p>
 <div class="codehilite"><pre><span class="sr">/** Getter for the ServiceProperties */</span>
 <span class="n">Map</span><span class="sr">&lt;String,Object&gt;</span> <span class="n">getServiceProperties</span><span class="p">();</span>
 </pre></div>
 
 
-<p>but also predefines the property ENHANCEMENT_ENGINE_ORDERING = "org.apache.stanbol.enhancer.engine.order" that can be used by enhancement engine implementations to specify their typical ordering within the enhancement process.</p>
+<p>but also predefines the property <code>ENHANCEMENT_ENGINE_ORDERING = "org.apache.stanbol.enhancer.engine.order"</code> that can be used by enhancement engine implementations to specify their typical ordering within the enhancement process.</p>
 <h3 id="engine_ordering_information">Engine Ordering Information</h3>
-<p>By implementing the ServicesProperties interface enhancement engines do have the possibility to expose additional metadata to other components. The ServicesProperties interface defines only a single method</p>
+<p>By implementing the ServicesProperties interface enhancement engines do have the possibility to expose additional metadata to other components. The services properties interface defines only a single method</p>
 <div class="codehilite"><pre><span class="sr">/** Getter for the ServiceProperties */</span>
 <span class="n">Map</span><span class="sr">&lt;String,Object&gt;</span> <span class="n">getServiceProperties</span><span class="p">();</span>
 </pre></div>
 
 
-<p>and is implemented by most of the current enhancement engines. Its currently only use is to provide information about the engine ordering within the enhancement process. This information is exposed by using the key "org.apache.stanbol.enhancer.engine.order" that is defined as value by the constant ENHANCEMENT_ENGINE_ORDERING defined directly by the ServicesProperties interface. Values are expected to be integer within the ranges </p>
+<p>and is implemented by most of the current enhancement engines. Its only current use is to provide information about the engine ordering within the enhancement process. This information is exposed under the key "org.apache.stanbol.enhancer.engine.order", which is the value of the constant <code>ENHANCEMENT_ENGINE_ORDERING</code> defined directly by the services properties interface. Values are expected to be integers within the following ranges:</p>
 <ul>
 <li><strong>ORDERING_PRE_PROCESSING</strong>: All values &gt;= 200 are considered for engines that do some kind of preprocessing of the content. This includes e.g. the conversion of media formats such as extracting the plain text from HTML, keyframes from videos, wave form from mp3 ...; extracting metadata directly encoded within the passed content such as ID3 tags from MP3 or RDFa, microdata provided by HTML content.</li>
 <li><strong>ORDERING_CONTENT_EXTRACTION</strong>: This range includes values &lt; 200 and &gt;= 100 and shall be used by enhancement engines that need to analyze the passed content to extract additional metadata. Examples would be Language detection, Natural Language Processing, Named Entity Recognition, Face Detection in Images, Speech to text …</li>
@@ -126,12 +125,12 @@
 <li><strong>ORDERING_POST_PROCESSING</strong>: This range includes values &lt; 0 and &gt;= -100 and is intended to be used by all enhancement engines that do post processing of enhancement results such as schema translation, filtering of Enhancements ...<br />
 </li>
 </ul>
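 <p>As a minimal sketch of how these ranges could be applied, the following helper maps an ordering value to the ranges quoted in this excerpt and builds a properties map with the ordering key. The class and method names are hypothetical and not part of Apache Stanbol; values in ranges that this excerpt does not show are deliberately left unclassified.</p>

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical helper; NOT part of the Stanbol API. */
class OrderingRangeSketch {

    /** The key quoted in the text above. */
    static final String ORDERING_KEY = "org.apache.stanbol.enhancer.engine.order";

    /** Labels an ordering value using only the ranges quoted in this excerpt. */
    static String classify(int order) {
        if (order >= 200) {
            return "ORDERING_PRE_PROCESSING";
        }
        if (order >= 100) {
            return "ORDERING_CONTENT_EXTRACTION";
        }
        if (order < 0 && order >= -100) {
            return "ORDERING_POST_PROCESSING";
        }
        // Ranges not shown in this excerpt are not guessed here.
        return "UNCLASSIFIED";
    }

    /** Builds a map as it might be returned by getServiceProperties(). */
    static Map<String, Object> serviceProperties(int order) {
        return Collections.<String, Object>singletonMap(ORDERING_KEY, Integer.valueOf(order));
    }

    public static void main(String[] args) {
        System.out.println(classify(150) + " " + serviceProperties(150));
    }
}
```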
-<p>The Engine Ordering information as described here are used by the <a href="../chains/defaultchain.html">DefaultChain</a> and the <a href="../chains/weightedchain.html">WeightedChain</a> to calculate the <a href="../chains/executionplan.html">ExecutionPlan</a>.</p>
-<p>Basically this features allows the implementor of an enhancement engine to define the correct position of his engine within an typical enhancement chain and therefore ensure that users that add this engine to a Stanbol Enhancer installation to immediately use this engine with the <a href="../chains/defaultchain.html">DefaultChain</a>.</p>
-<p>However the Engine Ordering is not the only possibility for users to control the execution order. Enhancement chain implementations such as the <a href="../chains/listchain.html">ListChain</a> and the <a href="../chains/graphchain.html">GraphChain</a> do also allow to directly define the oder of execution. For this chains the ordering information provided by EnhancementEngines are ignored.</p>
+<p>The engine ordering information described here is used by the <a href="../chains/defaultchain.html">Default Chain</a> and the <a href="../chains/weightedchain.html">Weighted Chain</a> to calculate the <a href="../chains/executionplan.html">Execution Plan</a>.</p>
+<p>Basically, this feature allows the implementor of an enhancement engine to define the correct position of the engine within a typical enhancement chain. This ensures that users who add the engine to an enhancer installation can immediately use it with the <a href="../chains/defaultchain.html">Default Chain</a>.</p>
+<p>However, the engine ordering is not the only possibility for users to control the execution order. Enhancement chain implementations such as the <a href="../chains/listchain.html">List Chain</a> and the <a href="../chains/graphchain.html">Graph Chain</a> also allow the order of execution to be defined directly. For these chains the ordering information provided by enhancement engines is ignored.</p>
 <h2 id="enhancement_engine_management">Enhancement Engine Management</h2>
-<p>This section describes how enhancement engines are managed by the Stanbol Enhancer and how they can be selected/accessed by the <a href="../enhancementjobmanager.html">EnhancementJobManager</a> execution a <a href="../chains/enhancementchain.html">Chain</a>.</p>
-<p>Enhancement engines are registered as OSGI services and managed by using the following service properties:</p>
+<p>This section describes how enhancement engines are managed by the Apache Stanbol Enhancer and how they can be selected/accessed through the <a href="../enhancementjobmanager.html">Enhancement Job Manager</a> and executed in an <a href="../chains/enhancementchain.html">Enhancement Chain</a>.</p>
+<p>Enhancement engines are registered as OSGi services and managed by using the following service properties:</p>
 <ul>
 <li><strong>Name:</strong> Defined by the value of the property "stanbol.enhancer.engine.name" it will be used to access Engines on the Stanbol RESTful interface</li>
 <li><strong>Service Ranking:</strong> The service ranking property defined by OSGI will be used to decide which engine to use in case several active enhancement engines do use the same name. In such cases only the Engine with the highest ranking will be used to enhance ContentItems.</li>
@@ -142,21 +141,21 @@
 _TODO:_ To correctly construct this graph the Engine needs to know this URL. This could e.g. be provided by some OSGI environment parameter set by the JerseyApplication. As an alternative we could also pass this URI as a parameter to the getEngineConfig method.
 -->
 
-<p>Other components such as enhancement Chains do refer to engines by their name. The actual enhancement engine instance is only looked up shortly before the execution.</p>
+<p>Other components such as enhancement chains do refer to engines by their name. The actual enhancement engine instance is only looked up shortly before the execution.</p>
 <h3 id="enhancement_engine_name_conflicts">Enhancement Engine Name Conflicts</h3>
-<p>As enhancement engines are identified by the value of the "stanbol.enhancer.engine.name" property - the name - there might be cases where multiple enhancement engine are registered for the same name. In such cases the normal OSGI procedure to select the default service instance of several possible matches is used. This means that</p>
+<p>As enhancement engines are identified by the value of the "stanbol.enhancer.engine.name" property - the name - there might be cases where multiple enhancement engines are registered for the same name. In such cases the normal OSGi procedure to select the default service instance of several possible matches is used. This means that</p>
 <ol>
 <li>the enhancement engine with the highest "service.ranking" and,</li>
 <li>among engines with the same ranking, the enhancement engine with the lowest "service.id"</li>
 </ol>
 <p>will be selected on requests for an enhancement engine with a given name. Requests on the RESTful service API will always answer with the enhancement engine selected as default. When using the Java API there are also means to retrieve all enhancement engines for a given name via the <a href="enhancementenginemanager.html">Enhancement Engine Manager</a> interface.</p>
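 <p>The selection rule above can be sketched as a comparator: prefer the highest "service.ranking" and break ties by the lowest "service.id". The classes <code>EngineRegistration</code> and <code>DefaultEngineSelection</code> are hypothetical stand-ins for illustration and not the OSGi or Apache Stanbol API.</p>

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical record of a registered engine; NOT an OSGi ServiceReference. */
final class EngineRegistration {
    final String name;
    final int serviceRanking;
    final long serviceId;

    EngineRegistration(String name, int serviceRanking, long serviceId) {
        this.name = name;
        this.serviceRanking = serviceRanking;
        this.serviceId = serviceId;
    }
}

class DefaultEngineSelection {

    /** Highest ranking first; for equal rankings the lowest service id first. */
    static final Comparator<EngineRegistration> DEFAULT_FIRST =
            Comparator.comparingInt((EngineRegistration r) -> r.serviceRanking)
                    .reversed()
                    .thenComparingLong(r -> r.serviceId);

    /** Returns the registration that would act as default for the given name. */
    static Optional<EngineRegistration> select(List<EngineRegistration> regs, String name) {
        return regs.stream()
                .filter(r -> r.name.equals(name))
                .min(DEFAULT_FIRST);
    }

    public static void main(String[] args) {
        List<EngineRegistration> regs = Arrays.asList(
                new EngineRegistration("geo", 0, 12L),   // e.g. a remote fallback
                new EngineRegistration("geo", 10, 20L),  // e.g. a local cache
                new EngineRegistration("geo", 10, 15L));
        select(regs, "geo").ifPresent(r -> System.out.println(r.serviceId));
    }
}
```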
-<p>Out of a user perspective there is one major use case for configuring multiple enhancement engines for the same name. This is to allow the definition of fallback engines if the main one becomes unavailable. e.g. lets assume that a user has a local cache of geonames.org loaded into the Entityhub and configures an <a href="keywordlinkingengine.html">Named Entity Linking</a> engine to perform semantic lifting of extracted locations. However Stanbol also provides the <a href="geonamesengine.html">geonames.org Engine</a> that provides a similar functionality by directly accessing <a href="http://geonames.org">geonames.org</a>. By configuring both engines for the same name, but specifying a higher service ranking for the one using the local cache one can ensure that the local cache is used for the enhancement under normal circumstances. However in case the local cache becomes unavailable the other engine using the remote service will be used for enhancement.</p>
+<p>From a user perspective there is one major use case for configuring multiple enhancement engines for the same name: defining fallback engines in case the main one becomes unavailable. For example, let's assume that a user has a local cache of geonames.org loaded into the <a href="../../entityhub/">Entity Hub</a> and configures a <a href="keywordlinkingengine.html">Named Entity Linking</a> engine to perform semantic lifting of extracted locations. However, Apache Stanbol also provides the <a href="geonamesengine.html">geonames.org Engine</a> that provides a similar functionality by directly accessing <a href="http://geonames.org">geonames.org</a>. By configuring both engines for the same name, but specifying a higher service ranking for the one using the local cache, one can ensure that the local cache is used for the enhancement under normal circumstances. However, in case the local cache becomes unavailable, the other engine using the remote service will be used for enhancement.</p>
 <h3 id="enhancement_engine_manager_interface">Enhancement Engine Manager Interface</h3>
 <p>The <a href="enhancementenginemanager.html">Enhancement Engine Manager</a> is the management interface for enhancement engines that can be used by components to look up enhancement engines based on their name. There is also an OSGi ServiceTracker-like implementation that can be used to track only enhancement engines registered for a specific set of names.</p>
 <h2 id="enhancement_engine_implementations">Enhancement Engine Implementations</h2>
 <p>A list of enhancement engine implementations maintained directly by the Apache Stanbol community can be found <a href="../../engines.html">here</a>.
However, the enhancement engine interface is designed so that advanced Apache Stanbol users can implement their own enhancement engines to fulfil their special needs.</p>
-<p>The Stanbol Community would be very happy if users decide to share thoughts about possible enhancement engines or even would like to contribute addition engines to the Apache Stanbol project.</p>
+<p>The Apache Stanbol community would be very happy if users decide to share thoughts about possible enhancement engines or even would like to contribute additional engines to the Apache Stanbol project.</p>
   </div>
   
   <div id="footer">