Posted to commits@stanbol.apache.org by bu...@apache.org on 2012/02/17 12:09:22 UTC

svn commit: r805169 - in /websites/staging/stanbol/trunk: ./ content/stanbol/docs/trunk/enhancer/enhancerrest.html

Author: buildbot
Date: Fri Feb 17 11:09:22 2012
New Revision: 805169

Log:
Staging update by buildbot for stanbol

Modified:
    websites/staging/stanbol/trunk/   (props changed)
    websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancerrest.html

Propchange: websites/staging/stanbol/trunk/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Feb 17 11:09:22 2012
@@ -1 +1 @@
-1245386
+1245391

Modified: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancerrest.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancerrest.html (original)
+++ websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancerrest.html Fri Feb 17 11:09:22 2012
@@ -70,7 +70,7 @@
 <p>The content to analyze should be sent in a POST request with the mimetype specified in
 the <code>Content-type</code> header. The response will hold the RDF enhancement serialized in the format specified in the <code>Accept</code> header:</p>
 <div class="codehilite"><pre>curl -X POST -H <span class="s2">&quot;Accept: text/turtle&quot;</span> -H <span class="s2">&quot;Content-type: text/plain&quot;</span> <span class="se">\</span>
-     --data <span class="s2">&quot;John Smith was born in London.&quot;</span> <span class="k">${</span><span class="nv">it</span><span class="p">.serviceUrl</span><span class="k">}</span>
+     --data <span class="s2">&quot;John Smith was born in London.&quot;</span> http://localhost:8080/enhancer
 </pre></div>
 
 
@@ -89,17 +89,16 @@ text/rdf+nt (N-TRIPLES)
 <li><strong>uri={content-item-uri}:</strong> By default the URI of the content item being enhanced is a local, non-dereferenceable URI automatically built from a hash digest of the binary content. Sometimes it might be helpful to provide the URI of the <a href="contentitem.html">ContentItem</a> to be used in the enhancements RDF graph.</li>
 <li><strong>executionmetadata=true/false:</strong> Enables the inclusion of <a href="executionmetadata.html">execution metadata</a> in the enhancement metadata of the response. This data also includes the <a href="chains/executionplan.html">execution plan</a> used to enhance the parsed content. This information is typically only useful to clients that want to know how the parsed content was processed by the enhancer. NOTE that the execution metadata can also be requested by using the multi-part content item API described below.</li>
 </ul>
-<p>The following example shows how to send an enhancement request with a
-custom content item URI that will include the execution metadata in the
-response.</p>
+<p>The following example shows how to send an enhancement request with a custom content item URI that will include the execution metadata in the response.
+In addition, this request is directed to the <a href="chains">Enhancement Chain</a> with the name "dbpedia-keyword".</p>
 <div class="codehilite"><pre>curl -X POST -H <span class="s2">&quot;Accept: text/turtle&quot;</span> -H <span class="s2">&quot;Content-type: text/plain&quot;</span> <span class="se">\</span>
     --data <span class="s2">&quot;John Smith was born in London.&quot;</span> <span class="se">\</span>
-    <span class="s2">&quot;${it.serviceUrl}?uri=urn:fise-example-content-item&amp;executionmetadata=true&quot;</span>
+    <span class="s2">&quot;http://localhost:8080/enhancer/chain/dbpedia-keyword?uri=urn:fise-example-content-item&amp;executionmetadata=true&quot;</span>
 </pre></div>
 
 
 <h2 id="multi-part_contentitem_support">Multi-part ContentItem support</h2>
-<p>The multi-part ContentItem extensions to the RESTful API (introduced by <a href="https://issues.apache.org/jira/browse/STANBOL-481">STANBOL-481</a>) are considered an advanced usage of the Stanbol Enhancer. </p>
+<p>The multi-part <code>ContentItem</code> extensions to the RESTful API (introduced by <a href="https://issues.apache.org/jira/browse/STANBOL-481">STANBOL-481</a>) are considered an advanced usage of the Stanbol Enhancer. </p>
 <p>Users will want to use these extensions if they need to:</p>
 <ul>
 <li>pass multiple versions of the content: Most CMSs already support converting content to plain text. This API allows passing both the original AND multiple transcoded versions of the content to the Enhancer.</li>
@@ -131,7 +130,7 @@ Requests that use an <code>Accept: {mime
 </li>
 </ul>
 <h3 id="parsing_multiple_contentparts">Parsing multiple ContentParts</h3>
-<p>Requests to the Stanbol Enhancer with the <code>Content-Type: multipart/form-data</code> are considered to contain a ContentItem serialized as MultiPart MIME. The exact specification of the <a href="contentitem.html#multipart_mime_serialization">MultiPart MIME format for ContentItems</a> is provided by the documentation of the ContentItem.</p>
+<p>Requests to the Stanbol Enhancer with the <code>Content-Type: multipart/form-data</code> are considered to contain a <code>ContentItem</code> serialized as MultiPart MIME. The exact specification of the <a href="contentitem.html#multipart_mime_serialization">MultiPart MIME format for ContentItems</a> is provided by the documentation of the <code>ContentItem</code>.</p>
 <p>The combination of <code>multipart/form-data</code> encoded requests with QueryParameters as described above allows for the usage of the <a href="contentitem.html#multipart_mime_serialization">MultiPart MIME format for ContentItems</a> for both request and response.</p>
 <h2 id="using_the_multi-part_content_item_restful_api_extensions">Using the multi-part content item RESTful API extensions</h2>
 <p>The following examples show typical usage scenarios of the multi-part content item RESTful API. Note that for better readability the values of the query parameters are not URLEncoded.</p>
@@ -173,8 +172,8 @@ John Smith was born in London.
 </pre></div>
 
 
-<p>See also the formal specification of the <a href="contentitem.html#multipart_mime_serialization">MultiPart MIME format for ContentItems</a> for ContentItems.</p>
-<h4 id="example_2_directly_return_the_plain_text_version_of_parsed_content">Example 2: Directly return the plain text version of parsed content</h4>
+<p>See also the formal specification of the <a href="contentitem.html#multipart_mime_serialization">MultiPart MIME format for ContentItems</a> for <code>ContentItem</code>s.</p>
+<h3 id="example_2_directly_return_the_plain_text_version_of_parsed_content">Example 2: Directly return the plain text version of parsed content</h3>
 <p>By using <code>omitMetadata=true</code> together with an "Accept: {requested-content-type}" header, the multi-part content API allows directly requesting the transcoded version of the content in the format {requested-content-type}. </p>
 <div class="codehilite"><pre>curl -v -X POST -H &quot;Accept: text/plain&quot; \
     -H &quot;Content-type: text/html; charset=UTF-8&quot; \
@@ -190,7 +189,7 @@ John Smith was born in London.
 
 <p>To make this work the requested <a href="chains">Enhancement Chain</a> will need to include an engine (e.g. <a href="engines/metaxaengine.html">Metaxa</a>) that supports transcoding the parsed content. If no content of the requested type is available, the request will be answered with a "<code>404 NOT FOUND</code>". </p>
 <p>Note also that because the metadata are omitted by responses to such requests it is also recommended to configure/use a chain that does no further processing on the transcoded content. </p>
-<h4 id="example_3_parse_multiple_content_versions">Example 3: Parse multiple content versions</h4>
+<h3 id="example_3_parse_multiple_content_versions">Example 3: Parse multiple content versions</h3>
 <p>This example will use the "httpmime" module of Apache HttpComponents to create the Multipart MIME message sent to the Stanbol Enhancer.</p>
 <div class="codehilite"><pre><span class="nt">&lt;dependency&gt;</span>
     <span class="nt">&lt;groupId&gt;</span>org.apache.httpcomponents<span class="nt">&lt;/groupId&gt;</span>
@@ -243,10 +242,10 @@ John Smith was born in London.
 
 
 <p>Note that for such requests <a href="engines/metaxaengine.html">Metaxa</a> will still try to extract metadata of the parsed MS Word document, but all other engines will use the plain text version as parsed by the request for processing.</p>
-<h4 id="example_4_parse_existing_free_text_annotations">Example 4: Parse existing free text annotations</h4>
+<h3 id="example_4_parse_existing_free_text_annotations">Example 4: Parse existing free text annotations</h3>
 <p>This example shows how the multi-part content item API can be used to pass already existing tags for the parsed content to the Stanbol Enhancer. For this example it is important to understand that parsed metadata needs to conform to the Stanbol Enhancement Structure. Because of that, this example consists of two main steps:</p>
 <ol>
-<li>Convert user tags to TextAnnotations</li>
+<li>Convert user tags to <code>TextAnnotation</code>s</li>
 <li>Send existing Metadata along with the Content to the Stanbol Enhancer</li>
 </ol>
 <p>Also note that the code snippets use utilities provided by the "org.apache.stanbol.enhancer.servicesapi" module. Clerezza is used as the RDF framework. Both dependencies are easily replaceable.</p>
@@ -273,7 +272,7 @@ John Smith was born in London.
 </pre></div>
 
 
-<p>Now we can convert the information to TextAnnoations</p>
+<p>Now we can convert the User Tags to <code>TextAnnotation</code>s</p>
 <div class="codehilite"><pre><span class="c1">//first create a URI for the text annotation. Here we use a random URN</span>
 <span class="c1">//If you can create a meaningful URI this would be better!</span>
 <span class="n">UriRef</span> <span class="n">ta</span> <span class="o">=</span> <span class="k">new</span> <span class="n">UriRef</span><span class="o">(</span><span class="s">&quot;urn:user-annotation:&quot;</span><span class="o">+</span><span class="n">EnhancementEngineHelper</span><span class="o">.</span><span class="na">randomUUID</span><span class="o">());</span>
@@ -296,7 +295,7 @@ John Smith was born in London.
 </pre></div>
 
 
-<p>Now the 'graph' contains a valid TextAnnotation for the given user tag. This should be done for all tags of the current content.</p>
+<p>Now the 'graph' contains a valid <code>TextAnnotation</code> for the given user tag. This should be done for all tags of the current content.</p>
 <p>In the next step we need to serialize the RDF data. Again I will use Clerezza as the API here, but any RDF framework will provide similar functionality.</p>
 <div class="codehilite"><pre><span class="n">ByteArrayOutputStream</span> <span class="n">out</span> <span class="o">=</span> <span class="k">new</span> <span class="n">ByteArrayOutputStream</span><span class="o">();</span>
 <span class="c1">//this tells the Serializer to create &quot;application/rdf+xml&quot;</span>
@@ -323,7 +322,7 @@ John Smith was born in London.
 </pre></div>
 
 
-<p>Note that because the <code>StringBody</code> class provided by the "httpmime" framework does not set a Filename we need to override this method and return the URI of the content item. This is essential, because we need to ensure that the URI of the ContentItem is the same as the URI (variable '<code>ciUri</code>') as used when creating the TextAnnotations for the user tags.</p>
+<p>Note that because the <code>StringBody</code> class provided by the "httpmime" framework does not set a Filename we need to override this method and return the URI of the content item. This is essential, because we need to ensure that the URI of the <code>ContentItem</code> is the same as the URI (variable '<code>ciUri</code>') as used when creating the <code>TextAnnotation</code>s for the user tags.</p>
 <p>For the following code snippet note that we can directly add the content to the content item container. Only if we needed to send multiple alternate content versions (as shown in 'Example 3') would the usage of a <code>'multipart/alternate'</code> container be required.</p>
 <div class="codehilite"><pre><span class="c1">//Add the content as second mime part</span>
 <span class="n">contentItem</span><span class="o">.</span><span class="na">addPart</span><span class="o">(</span>