Posted to commits@stanbol.apache.org by bu...@apache.org on 2012/02/15 11:25:29 UTC

svn commit: r804901 - in /websites/staging/stanbol/trunk: ./ content/stanbol/docs/trunk/enhancer/index.html

Author: buildbot
Date: Wed Feb 15 10:25:29 2012
New Revision: 804901

Log:
Staging update by buildbot for stanbol

Modified:
    websites/staging/stanbol/trunk/   (props changed)
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/index.html

Propchange: websites/staging/stanbol/trunk/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Wed Feb 15 10:25:29 2012
@@ -1 +1 @@
-1244417
+1244426

Modified: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/index.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/index.html (original)
+++ websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/index.html Wed Feb 15 10:25:29 2012
@@ -20,7 +20,7 @@
 -->
 
   <link href="/stanbol/css/stanbol.css" rel="stylesheet" type="text/css">
-  <title>Apache Stanbol - Enhancer</title>
+  <title>Apache Stanbol - Stanbol Enhancer</title>
   <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
   <link rel="icon" type="image/png" href="/stanbol/images/stanbol-logo/stanbol-favicon.png"/>
 </head>
@@ -56,15 +56,61 @@
   </div>
   
   <div id="content">
-    <h1 class="title">Enhancer</h1>
-    <p>This stateless interface allows the caller to submit content to the Apache Stanbol <a href="engines/">enhancer engines</a> and get the resulting enhancements formatted as RDF at once without storing anything on the server-side.</p>
+    <h1 class="title">Stanbol Enhancer</h1>
+    <p>The Apache Stanbol Enhancer provides both a RESTful and a Java API that allow callers to extract features from the passed content. In more detail, the passed content is processed by <a href="engines">enhancement engines</a> as defined by the called <a href="chains/enhancementchain.html">Enhancement Chain</a>.</p>
+<h2 id="using_the_stanbol_enhancer">Using the Stanbol Enhancer</h2>
+<p>The following figure provides an overview of the RESTful and Java APIs provided by the Stanbol Enhancer.</p>
+<p><img alt="Stanbol Enhancer Overview" src="enhanceroverview.png" title="Overview of the RESTful Services and Java API provided by the Stanbol Enhancer" /></p>
+<h3 id="restful_service">RESTful service:</h3>
 <p>The content to analyze should be sent in a POST request with the mimetype specified in the Content-type header. The response will hold the RDF enhancement serialized in the format specified in the Accept header:</p>
 <div class="codehilite"><pre><span class="n">curl</span> <span class="o">-</span><span class="n">X</span> <span class="n">POST</span> <span class="o">-</span><span class="n">H</span> <span class="s">&quot;Accept: text/turtle&quot;</span> <span class="o">-</span><span class="n">H</span> <span class="s">&quot;Content-type: text/plain&quot;</span> <span class="o">\</span>
-<span class="o">--</span><span class="n">data</span> <span class="s">&quot;John Smith was born in London.&quot;</span> <span class="n">http:</span><span class="sr">//</span><span class="n">localhost:8080</span><span class="o">/</span><span class="n">engines</span>
+<span class="o">--</span><span class="n">data</span> <span class="s">&quot;John Smith was born in London.&quot;</span> <span class="n">http:</span><span class="sr">//</span><span class="n">localhost:8080</span><span class="o">/</span><span class="n">enhancer</span>
+</pre></div>
+
+
+<p>For more details, see the documentation provided by the Stanbol Web UI (e.g. "http://localhost:8080/enhancer", assuming that Apache Stanbol runs on localhost:8080).</p>
+<h3 id="java_api">Java API:</h3>
+<p>Using the Java API requires the following OSGi services:</p>
+<div class="codehilite"><pre><span class="nv">@Reference</span>
+<span class="n">EnhancementJobManager</span> <span class="n">jobManager</span><span class="p">;</span>
+<span class="nv">@Reference</span>
+<span class="n">EnhancementChainManager</span> <span class="n">chainManager</span><span class="p">;</span>
+</pre></div>
+
+
+<p>Provided these services are available, the following code snippet shows how to enhance a content item:</p>
+<div class="codehilite"><pre><span class="n">InputStream</span> <span class="n">content</span><span class="p">;</span> <span class="sr">//</span><span class="n">the</span> <span class="n">content</span> <span class="p">(</span><span class="n">assuming</span> <span class="n">an</span> <span class="n">HTML</span> <span class="n">document</span><span class="p">)</span>
+<span class="n">String</span> <span class="n">chainName</span><span class="p">;</span> <span class="sr">//</span><span class="n">the</span> <span class="n">name</span> <span class="n">of</span> <span class="n">the</span> <span class="n">chain</span> <span class="ow">or</span> <span class="n">null</span> <span class="n">to</span> <span class="k">use</span> <span class="n">the</span> <span class="n">default</span>
+<span class="n">ContentItem</span> <span class="n">contentItem</span> <span class="o">=</span> <span class="k">new</span> <span class="n">InMemoryContentItem</span><span class="p">(</span>
+    <span class="n">IOUtils</span><span class="o">.</span><span class="n">toByteArray</span><span class="p">(</span><span class="n">content</span><span class="p">),</span> <span class="s">&quot;text/html; charset=UTF-8&quot;</span><span class="p">);</span>
+<span class="sr">//g</span><span class="n">et</span> <span class="n">the</span> <span class="n">EnhancementChain</span>
+<span class="n">Chain</span> <span class="n">enhancementChain</span><span class="p">;</span>
+<span class="k">if</span><span class="p">(</span><span class="n">chainName</span> <span class="o">==</span> <span class="n">null</span><span class="p">){</span>
+    <span class="n">enhancementChain</span> <span class="o">=</span> <span class="n">chainManager</span><span class="o">.</span><span class="n">getDefault</span><span class="p">();</span>
+<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
+    <span class="n">enhancementChain</span> <span class="o">=</span> <span class="n">chainManager</span><span class="o">.</span><span class="n">getChain</span><span class="p">(</span><span class="n">chainName</span><span class="p">);</span>
+<span class="p">}</span>
+<span class="n">try</span> <span class="p">{</span> <span class="sr">//</span><span class="n">enhance</span> <span class="n">the</span> <span class="n">content</span>
+    <span class="n">jobManager</span><span class="o">.</span><span class="n">enhanceContent</span><span class="p">(</span><span class="n">contentItem</span><span class="p">,</span> <span class="n">enhancementChain</span><span class="p">);</span>
+<span class="p">}</span> <span class="n">catch</span> <span class="p">(</span><span class="n">EnhancementException</span> <span class="n">e</span><span class="p">)</span> <span class="p">{}</span>
+
+<span class="sr">//</span><span class="n">Get</span> <span class="n">the</span> <span class="n">enhancement</span> <span class="n">Results</span>
+<span class="n">MGraph</span> <span class="n">enhancements</span> <span class="o">=</span> <span class="n">contentItem</span><span class="o">.</span><span class="n">getMetadata</span><span class="p">();</span>
+</pre></div>
+
+
+<p>However, depending on the executed <a href="engines">Enhancement Engines</a>, the ContentItem may also provide additional information. The following snippet shows how to retrieve the plain text version of the parsed HTML content.</p>
+<div class="codehilite"><pre><span class="n">Entry</span><span class="sr">&lt;UriRef,Blob&gt;</span> <span class="n">textContentPart</span> <span class="o">=</span> 
+        <span class="n">ContentItemHelper</span><span class="o">.</span><span class="n">getBlob</span><span class="p">(</span><span class="n">contentItem</span><span class="p">,</span> 
+            <span class="n">Collections</span><span class="o">.</span><span class="n">singleton</span><span class="p">(</span><span class="s">&quot;text/plain&quot;</span><span class="p">));</span>
+<span class="n">Blob</span> <span class="n">textBlob</span> <span class="o">=</span> <span class="n">textContentPart</span><span class="o">.</span><span class="n">getValue</span><span class="p">();</span>
+<span class="n">String</span> <span class="n">charset</span> <span class="o">=</span> <span class="n">textBlob</span><span class="o">.</span><span class="n">getParameter</span><span class="p">()</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s">&quot;charset&quot;</span><span class="p">);</span>
+<span class="n">String</span> <span class="n">plainText</span> <span class="o">=</span> <span class="n">IOUtils</span><span class="o">.</span><span class="n">toString</span><span class="p">(</span>
+    <span class="n">textBlob</span><span class="o">.</span><span class="n">getStream</span><span class="p">(),</span>
+    <span class="n">charset</span> <span class="o">==</span> <span class="n">null</span> <span class="p">?</span> <span class="s">&quot;UTF-8&quot;</span> <span class="p">:</span> <span class="n">charset</span><span class="p">);</span>
 </pre></div>
 
 
-<p>The list of mimetypes accepted as inputs depends on the deployed engines. By default only text/plain content will be analyzed.</p>
 <h2 id="list_of_available_enhancement_engines">List of Available Enhancement Engines</h2>
 <p>Apache Stanbol comes with a <a href="engines/list.html">list of predefined enhancement engines</a>. These engines are supported by the Apache Stanbol community. If you would like to implement your own enhancement engine, you should go on reading this documentation.</p>
 <h2 id="main_interfaces_and_utilities">Main Interfaces and Utilities</h2>