Posted to commits@stanbol.apache.org by bu...@apache.org on 2012/06/27 09:59:16 UTC

svn commit: r823432 - in /websites/staging/stanbol/trunk/content: ./ stanbol/ stanbol/docs/trunk/ stanbol/docs/trunk/enhancer/ stanbol/docs/trunk/images/

Author: buildbot
Date: Wed Jun 27 07:59:14 2012
New Revision: 823432

Log:
Staging update by buildbot for stanbol

Added:
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancer-overview.png   (with props)
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-architecture.png   (with props)
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-architecture.svg   (with props)
Removed:
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-components.svg
Modified:
    websites/staging/stanbol/trunk/content/   (props changed)
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/components.html
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/contentenhancement.html
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/index.html
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-components.png
    websites/staging/stanbol/trunk/content/stanbol/overview.html

Propchange: websites/staging/stanbol/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Wed Jun 27 07:59:14 2012
@@ -1 +1 @@
-1354350
+1354355

Modified: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/components.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/docs/trunk/components.html (original)
+++ websites/staging/stanbol/trunk/content/stanbol/docs/trunk/components.html Wed Jun 27 07:59:14 2012
@@ -82,17 +82,22 @@
   </div>
   <div id="content">
     <h1 class="title">Apache Stanbol Components</h1>
-    <p>Apache Stanbol is built as a modular set of components. Each component is accessible via its own RESTful web interface. From this viewpoint, all Apache Stanbol features can be used via RESTful service calls. The components are implemented as <a href="http://www2.osgi.org/Specifications/HomePage">OSGi</a> components based on <a href="http://felix.apache.org">Apache Felix</a>.</p>
-<p>This page gives an overview of the major features of various Apache Stanbol components. Figure 1 depicts the main Apache Stanbol components and their arrangement within the Apache Stanbol architecture. Additionally, we have documented some <a href="scenarios.html">usage scenarios</a>.</p>
-<p><img alt="Apache Stanbol Components" src="images/stanbol-components.png" title="Apache Stanbol Components" />
-<figcaption>Figure 1: Apache Stanbol Components</figcaption></p>
-<p>We will shortly describe the components from top to bottom and link to their detailed descriptions.</p>
+    <p style="text-align: center;"><img alt="Apache Stanbol Components" src="images/stanbol-components.png" title="Overview of the different components included in Apache Stanbol" />
+<figcaption>Figure 1: The Apache Stanbol Components</figcaption></p>
+
+<p>Apache Stanbol is built as a modular set of components. Each component is accessible via its own RESTful web interface. From this viewpoint, all Apache Stanbol features can be used via RESTful service calls. </p>
+<p>Components do not depend on each other. However, they can easily be combined if needed, as shown by the different <a href="scenarios.html">Usage Scenarios</a>. This ensures that the list of used components depends on the specific usage scenario and not on the Stanbol architecture.</p>
+<p>All components are implemented as <a href="http://www2.osgi.org/Specifications/HomePage">OSGi</a> bundles, components and services. By default, Apache Stanbol uses <a href="http://felix.apache.org">Apache Felix</a> as its OSGi environment. However, we generally try to avoid the use of Felix-specific features. If you need to run Stanbol in another OSGi environment and encounter problems, tell us by opening a <a href="https://issues.apache.org/jira/browse/STANBOL">JIRA issue</a> and/or asking about it on the Stanbol Developer <a href="mailinglists.html">mailing list</a>.</p>
+<p>For deployment, Stanbol uses the <a href="http://sling.apache.org">Apache Sling</a> launcher. While the Stanbol community maintains different launcher options, including runnable JARs and WAR files, we expect users to configure custom launchers optimized for their usage scenario. It is also possible to use Stanbol with other launchers (such as <a href="http://karaf.apache.org/">Apache Karaf</a>) or to add its bundles to any existing OSGi environment.</p>
+<p>Figure 2 depicts the main Apache Stanbol components and their arrangement within the Apache Stanbol architecture.
+<img alt="Apache Stanbol Components" src="images/stanbol-architecture.png" title="Apache Stanbol Components" />
+<figcaption>Figure 2: Apache Stanbol Architecture</figcaption></p>
 <ul>
 <li>
 <p>The <a href="enhancer/">Enhancer</a> component together with its <a href="enhancer/engines/list.html">Enhancement Engines</a> provides you with the ability to post content to Apache Stanbol and get suggestions for possible entity annotation in return. The enhancements are provided via natural language processing, metadata extraction and linking named entities to public or private entity repositories. Furthermore, Apache Stanbol provides a machinery to further process this data and add additional knowledge and links via applying rules and reasoning. Technically, the enhancements are stored in a triple-graph that is maintained by <a href="http://incubator.apache.org/clerezza">Apache Clerezza</a>.</p>
 </li>
 <li>
-<p>The 'Sparql endpoint' gives access to the semantic enhancements form the Apache Stanbol <a href="enhancer/">Enhancer</a>.</p>
+<p>The 'SPARQL endpoint' gives access to the RDF graphs of Apache Stanbol. In particular, this includes the graph with all enhancement results managed by the Stanbol <a href="contenthub/">Contenthub</a>.</p>
 </li>
 <li>
 <p>The 'EnhancerVIE' is a stateful interface to submit content to analyze and store the results on the server. It is then possible to browse the resulting enhanced content items.</p>

Modified: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/contentenhancement.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/docs/trunk/contentenhancement.html (original)
+++ websites/staging/stanbol/trunk/content/stanbol/docs/trunk/contentenhancement.html Wed Jun 27 07:59:14 2012
@@ -82,15 +82,23 @@
   </div>
   <div id="content">
     <h1 class="title">Basic Content Enhancement</h1>
-    <p>For enhancing content you simply post plain text content to the enhancement engines and you will get back enhancement data. The enhancement process is stateless, so neither your content item, nor the enhancements will be stored. </p>
-<p>You can test this via the <a href="http://localhost:8080/enhancer">Web interface</a> of the Apache Stanbol Enhancer - http://{host}:{port}/enhancer or from the console using the CURL command.</p>
+    <p>This usage scenario provides all the information you need to get started with the Stanbol Enhancer. This includes: </p>
+<ul>
+<li>Using the RESTful API of the Stanbol Enhancer</li>
+<li>An overview of the available Enhancement Engines</li>
+<li>Configuration of the Stanbol Enhancer</li>
+</ul>
+<h2 id="using-the-restful-enhancement-service">Using the RESTful Enhancement service</h2>
+<p>To enhance content, you simply post your content to the Stanbol Enhancer. The Enhancer uses a Chain of Enhancement Engines to process the submitted content and returns the extracted features as RDF, encoded using the Stanbol Enhancement Structure. The following figure provides an overview of that process.
+<p style="text-align: center;"><img alt="Enhancing Content with the Stanbol Enhancer" src="enhancer/enhancer-overview.png" title="The Stanbol Enhancer uses a Chain of Enhancement Engines to extract Entities from parsed Content and returns results as RDF." /></p></p>
+<p>If you have a <a href="tutorial.html">local Stanbol instance</a>, you can also test this via the <a href="http://localhost:8080/enhancer">web interface</a> of the Apache Stanbol Enhancer - http://{host}:{port}/enhancer - or from the command line using the curl command.</p>
 <div class="codehilite"><pre>curl -X POST -H <span class="s2">&quot;Accept: text/turtle&quot;</span> -H <span class="s2">&quot;Content-type: text/plain&quot;</span> <span class="se">\</span>
 --data <span class="s2">&quot;The Stanbol enhancer can detect famous cities such as Paris \</span>
 <span class="s2">and people such as Bob Marley.&quot;</span> http://localhost:8080/engines
 </pre></div>
 
 
-<p>The following script sends the contents of the text-examples folder to the Stanbol Enhancer.</p>
+<p>The following script sends the contents of the text-examples folder to the Stanbol Enhancer. It can also be used to process the contents of any other folder on the local file system. If you want to keep the enhancement results, you can redirect the output of the curl command (e.g. to files).</p>
 <div class="codehilite"><pre><span class="k">for</span> <span class="n">file</span> <span class="n">in</span> <span class="n">enhancer</span><span class="sr">/data/</span><span class="n">text</span><span class="o">-</span><span class="n">examples</span><span class="o">/*.*</span><span class="p">;</span>
 <span class="k">do</span>
     <span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="n">POST</span> <span class="o">-</span><span class="n">H</span> <span class="s">&quot;Accept: text/turtle&quot;</span> <span class="o">-</span><span class="n">H</span> <span class="s">&quot;Content-type: text/plain&quot;</span> <span class="o">\</span>
@@ -102,7 +110,13 @@
 <p>The Apache Stanbol Enhancer can also enhance non-plain-text files. In this case <a href="http://tika.apache.org">Apache Tika</a>, via the <a href="enhancer/engines/tikaengine.html">Tika Engine</a>, is used to extract the plain text from those files (see the <a href="http://tika.apache.org">Apache Tika</a> documentation for supported file formats).</p>
 <h2 id="configuring-and-using-enhancement-chains">Configuring and Using Enhancement Chains</h2>
 <p>The Apache Stanbol Enhancer supports multiple <a href="enhancer/chains">enhancement chains</a>. This feature allows you to configure and use multiple processing chains for submitted content within the same Apache Stanbol instance.</p>
-<p>Chains are build based on an <a href="enhancer/chains/executionpla.html">execution plan</a> referencing one or more <a href="enhancer/engines">enhancement engines</a> by there name. Users can create and modify enhancement chains by using the <a href="http://localhost:8080/system/console/configMgr">Configuration Tab</a> of the Apache Felix web console - http://{host}:{port}/system/console/configMgr. There are three different implementations: (1) the self sorting <a href="enhancer/chains/weightedchain.html">weighted chain</a>, (2) the <a href="enhancer/chains/listchain.html">list chain</a> and (3) the <a href="enhancer/chains/graphchain.html">graph chain</a> that allows the direct configuration of the execution graph. There is also a (4) <a href="enhancer/chains/defaultchain.html">default chain</a> that includes all currently active enhancement engines. While this engine is enabled by default most users might want to deactivate it as soon as they have configured there own chains.</p>
+<p>Chains are built based on an <a href="enhancer/chains/executionpla.html">execution plan</a> referencing one or more <a href="enhancer/engines">enhancement engines</a> by their name. Users can create and modify enhancement chains by using the <a href="http://localhost:8080/system/console/configMgr">Configuration Tab</a> of the Apache Felix web console - http://{host}:{port}/system/console/configMgr. There are three different implementations: </p>
+<ol>
+<li>the self-sorting <a href="enhancer/chains/weightedchain.html">weighted chain</a> </li>
+<li>the <a href="enhancer/chains/listchain.html">list chain</a></li>
+<li>the <a href="enhancer/chains/graphchain.html">graph chain</a>, which allows direct configuration of the execution graph and so lets advanced users optimize chain execution. </li>
+</ol>
+<p>In addition, the Stanbol Enhancer includes the so-called <a href="enhancer/chains/defaultchain.html">Default Chain</a>, which includes all currently active enhancement engines. While this chain is enabled by default, most users might want to deactivate it as soon as they have configured their own chains.</p>
 <p>To configure enhancement engines it is essential to understand the intention of the different <a href="enhancer/engines">enhancement engine</a> implementations. The <a href="enhancer/engines/list.html">list of available enhancement engines</a> managed by the Apache Stanbol community is available <a href="enhancer/engines/list.html">here</a>. See the documentation of the listed engines for detailed information.</p>
 <p>The list groups engines by category. Preprocessing engines typically perform operations on a content scope; this includes plain-text extraction, metadata extraction and language detection. They are followed by engines that analyze the submitted content. This category currently includes all Natural Language Processing (NLP) related engines, but would also include image, audio and video processing. The third category consists of engines that consume extracted features from the content and perform some kind of semantic lifting on them, e.g. linking extracted features with entities/concepts contained in controlled vocabularies. Finally, post-processing engines can be used to adjust rankings, filter out unwanted enhancements or perform other kinds of transformations on the enhancement results.</p>
 <p>A typical text processing enhancement chain might look like this:</p>

Added: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancer-overview.png
==============================================================================
Binary file - no diff available.

Propchange: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancer-overview.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Modified: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/index.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/index.html (original)
+++ websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/index.html Wed Jun 27 07:59:14 2012
@@ -95,13 +95,16 @@
 </li>
 <li><a href="#List_of_Engines">List of Available Enhancement Engines</a></li>
 </ul>
+<p>Readers should note that this is the technical documentation of the Stanbol Enhancer, intended for developers. For a more practical, use-case-oriented introduction to the Stanbol Enhancer as well as other components, please have a look at the available <a href="../scenarios.html">Usage Scenarios</a>.</p>
 <p><a name="Using_Stanbol_Enhancer"></a></p>
 <h2 id="using-the-stanbol-enhancer">Using the Stanbol Enhancer</h2>
-<p>The figure below provides an overview of the RESTful as well as the Java API provided by the Stanbol Enhancer</p>
-<p><img alt="Stanbol Enhancer Overview" src="enhanceroverview-s.png" title="Overview of RESTful Services and Java API provided by the Stanbol Enhancer" /></p>
+<p>The figure below provides an overview of the RESTful API as well as the Java API provided by the Stanbol Enhancer.
+<p style="text-align: center;"><img alt="Stanbol Enhancer Overview" src="enhanceroverview-s.png" title="Overview of RESTful Services and Java API provided by the Stanbol Enhancer" /></p></p>
 <p><a name="RESTful_API"></a></p>
 <h3 id="restful-api">RESTful API</h3>
-<p>The content to be analyzed should be sent in a POST request with the mime-type specified in the Content-type header. The response will hold the RDF enhancement serialized in the format specified in the Accept header:</p>
+<p>The content to be analyzed should be sent in a POST request with the mime-type specified in the Content-type header. The submitted content is then processed by the targeted <a href="chains">Enhancement Chain</a>. The response will hold the RDF enhancement serialized in the format specified in the Accept header. The following figure visualizes this process.
+<p style="text-align: center;"><img alt="Enhancing Content with the Stanbol Enhancer" src="enhancer/enhancer-overview.png" title="The Stanbol Enhancer uses a Chain of Enhancement Engines to extract Entities from parsed Content and returns results as RDF." /></p></p>
+<p>You can test that easily from the command line using the curl command:</p>
 <div class="codehilite"><pre>curl -X POST -H <span class="s2">&quot;Accept: text/turtle&quot;</span> -H <span class="s2">&quot;Content-type: text/plain&quot;</span> <span class="se">\</span>
     --data <span class="s2">&quot;The Stanbol enhancer can detect famous cities such as \</span>
 <span class="s2">            Paris and people such as Bob Marley.&quot;</span> <span class="se">\</span>

Added: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-architecture.png
==============================================================================
Binary file - no diff available.

Propchange: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-architecture.png
------------------------------------------------------------------------------
    svn:mime-type = image/png

Added: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-architecture.svg
==============================================================================
Binary file - no diff available.

Propchange: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-architecture.svg
------------------------------------------------------------------------------
    svn:mime-type = image/svg+xml

Modified: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/images/stanbol-components.png
==============================================================================
Binary files - no diff available.

Modified: websites/staging/stanbol/trunk/content/stanbol/overview.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/overview.html (original)
+++ websites/staging/stanbol/trunk/content/stanbol/overview.html Wed Jun 27 07:59:14 2012
@@ -84,8 +84,10 @@
     <h1 class="title">Overview about Apache Stanbol (incubating)</h1>
     <p>Apache Stanbol (currently <a href="http://incubator.apache.org">incubating</a>) provides a set of reusable components for semantic content management. For users it is important to note that Stanbol is NOT a semantic CMS on its own. It is designed to provide semantic services for existing content management systems.
 <p style="text-align: center;"><img alt="Apache Stanbol - The semantic engine" src="images/stanbol-semanticengine.png" title="Apache Stanbol is aimed to bring semantic technologies to current CMS Systems." /></p></p>
-<p>All the features described in the following sections are meant to be accessed over RESTful services. Typically they are use to extend traditional content management systems with semantic services. Other feasible use cases include: Direct usage from web applications (e.g. for Tag extraction/suggestion; or text completion in search fields), 'smart' Content workflows or email routing based on extracted Entities/Topics, ... </p>
+<p>While Apache Stanbol was built with CMSs in mind, it can also be used in other usage scenarios, including direct usage from web applications (e.g. for tag extraction/suggestion or text completion in search fields), 'smart' content workflows or email routing based on extracted entities/topics, ...</p>
+<p>The remainder of this document provides an overview of Apache Stanbol by describing typical usage scenarios.</p>
 <h3 id="content-enhancement">Content Enhancement</h3>
+<p>Extracting information from submitted content is the most common use of Apache Stanbol. </p>
 <p>The Stanbol Enhancer provides a <a href="docs/trunk/enhancer/enhancerrest.html">RESTful API</a> that allows you to <a href="docs/trunk/contentenhancement.html">extract semantic information</a> from submitted content. 
 <p style="text-align: center;">
 <img alt="Content Enhancement with the Stanbol Enhancer" src="images/stanbol-feature-enhance.png" title="Extract semantic information from parsed Content" />