Posted to commits@marmotta.apache.org by bu...@apache.org on 2013/02/27 14:00:35 UTC

svn commit: r852221 [7/7] - in /websites/staging/marmotta/trunk/content: ./ apidocs/ apidocs/org/apache/marmotta/kiwi/caching/ apidocs/org/apache/marmotta/kiwi/caching/class-use/ apidocs/org/apache/marmotta/kiwi/config/ apidocs/org/apache/marmotta/kiwi...

Modified: websites/staging/marmotta/trunk/content/ldclient/modules.html
==============================================================================
--- websites/staging/marmotta/trunk/content/ldclient/modules.html (original)
+++ websites/staging/marmotta/trunk/content/ldclient/modules.html Wed Feb 27 13:00:12 2013
@@ -298,7 +298,18 @@
     <artifactId>ldclient-provider-rdf</artifactId>
     <version>3.0.0-incubating-SNAPSHOT</version>
 </dependency>
-</pre></div><p>The RDF module will issue a direct HTTP request for the requested resource with the Accept header set to the supported RDF formats. It will parse the RDF response into an in-memory Sesame repository and **filter out triples where the subject does not match the requested resource**. The reason for this restriction is that all other triples do not actually represent the requested resource in a Linked Data style and also clutter your local repository unnecessarily (i.e. you only get what you requested and nothing more).</p><p>The RDF module auto-registers an endpoint for all HTTP resources with low priority, i.e. in case no other endpoint configuration matches first, the LDClient library will always try to issue a Linked Data request by default once the module is on the classpath.</p></div><div class="section"><h2>RDFa Module<a name="RDFa_Module"></a></h2><p>The RDFa module (ldclient-provider-rdfa) offers support to parse and retrieve RDF triples contained in an HTML document in RDFa format. The data provider uses the <a class="externalLink" href="http://code.google.com/p/rdfa-core-java/">RDFa Core Java</a> library for parsing and therefore only supports well-formed XHTML resources for parsing. To include the RDFa module in your project, add the following dependency to your project build:</p>
+</pre></div><p>The RDF module will issue a direct HTTP request for the requested resource with the Accept header set to the supported RDF formats. It will parse the RDF response into an in-memory Sesame repository and **filter out triples where the subject does not match the requested resource**. The reason for this restriction is that all other triples do not actually represent the requested resource in a Linked Data style and also clutter your local repository unnecessarily (i.e. you only get what you requested and nothing more).</p><p>The RDF module auto-registers an endpoint for all HTTP resources with low priority, i.e. in case no other endpoint configuration matches first, the LDClient library will always try to issue a Linked Data request by default once the module is on the classpath.</p><p>In addition to this default behaviour, the RDF module also offers pre-defined endpoint classes that can be used for configuring special cases:</p>
+<ul>
+  <li>SPARQLEndpoint allows redirecting the request of all resources matching a regular expression pattern to a SPARQL service; the provider will then issue a SPARQL query of the form <tt>SELECT ?p ?o WHERE { &lt;{uri}&gt; ?p ?o }</tt> to retrieve all triples having the requested resource as subject.</li>
+  <li>StanbolEndpoint allows redirecting the request to an Apache Stanbol Entityhub (which can be configured as a local cache of frequently-used Linked Data resources).</li>
+</ul><p>For example, to redirect requests to DBPedia resources to the (more reliable) SPARQL endpoint offered by the service, you can use the following statements:</p>
+<div class="source"><pre class="prettyprint">ClientConfiguration config = new ClientConfiguration();
+config.addEndpoint(
+    new SPARQLEndpoint(&quot;DBPedia (SPARQL)&quot;,&quot;http://dbpedia.org/sparql&quot;,&quot;^http://dbpedia\\.org/resource/.*&quot;)
+);
+
+LDClientService ldclient = new LDClient(config);
+</pre></div></div><div class="section"><h2>RDFa Module<a name="RDFa_Module"></a></h2><p>The RDFa module (ldclient-provider-rdfa) offers support to parse and retrieve RDF triples contained in an HTML document in RDFa format. The data provider uses the <a class="externalLink" href="http://code.google.com/p/rdfa-core-java/">RDFa Core Java</a> library for parsing and therefore only supports well-formed XHTML resources for parsing. To include the RDFa module in your project, add the following dependency to your project build:</p>
 <div class="source"><pre class="prettyprint">&lt;dependency&gt;
     &lt;groupId&gt;org.apache.marmotta&lt;/groupId&gt;
     &lt;artifactId&gt;ldclient-provider-rdfa&lt;/artifactId&gt;
@@ -312,27 +323,27 @@
 &lt;/dependency&gt;
 </pre></div><p>The library will automatically register data providers for videos, playlists, and channels and add default endpoint configurations for the most common YouTube URLs (<a class="externalLink" href="http://www.youtube.com/...">http://www.youtube.com/...</a>, <a class="externalLink" href="http://gdata.youtube.com/...">http://gdata.youtube.com/...</a>, <a class="externalLink" href="http://youtu.be/...)">http://youtu.be/...)</a>. This allows you to request any YouTube URL as soon as you have the library on your classpath.</p><p>Note that to avoid ambiguities, the actual video metadata will only be returned when requesting the video using the URI of the video starting with <a class="externalLink" href="http://youtu.be/">http://youtu.be/</a>. All other URIs of the video will only return a triple linking to the youtu.be video with a <tt>foaf:primaryTopic</tt> relation.</p><p>Using the <a class="externalLink" href="http://www.w3.org/TR/mediaont-10/">Ontology for Media Resources</a>, the module will create the following triples when requesting a video resource:</p>
 <ul>
-  <li><tt>rdf:type</tt> will be <tt>ma:MediaResource</tt> and <tt>ma:VideoTrack</tt></li>
-  <li><tt>ma:title</tt> will be the title of the video on YouTube</li>
-  <li><tt>ma:locator</tt> will be the URL of the actual video on the platform</li>
-  <li><tt>ma:hasCreator</tt> and <tt>ma:hasPublisher</tt> will point to the resource URI of the user who uploaded the video</li>
-  <li><tt>ma:date</tt> will be set to the publication date of the video</li>
-  <li><tt>ma:locationLatitude</tt> and <tt>ma:locationLongitude</tt> will be set to the geo-coordinates of the video (if available)</li>
-  <li><tt>ma:description</tt> will contain the textual description provided by the publisher</li>
-  <li><tt>ma:hasKeyword</tt> will contain the names of all YouTube keywords associated with the video</li>
-  <li><tt>ma:hasGenre</tt> will point to the resources identifying the YouTube categories of the video</li>
-  <li><tt>ma:hasRating</tt> will contain the average user rating of the video at the time of retrieval</li>
-  <li><tt>ma:copyright</tt> will describe the license terms of the video</li>
+  <li><tt>rdf:type</tt> is <tt>ma:MediaResource</tt> and <tt>ma:VideoTrack</tt></li>
+  <li><tt>ma:title</tt> is the title of the video on YouTube</li>
+  <li><tt>ma:locator</tt> is the URL of the actual video on the platform</li>
+  <li><tt>ma:hasCreator</tt> and <tt>ma:hasPublisher</tt> point to the resource URI of the user who uploaded the video</li>
+  <li><tt>ma:date</tt> is set to the publication date of the video</li>
+  <li><tt>ma:locationLatitude</tt> and <tt>ma:locationLongitude</tt> are set to the geo-coordinates of the video (if available)</li>
+  <li><tt>ma:description</tt> contains the textual description provided by the publisher</li>
+  <li><tt>ma:hasKeyword</tt> contains the names of all YouTube keywords associated with the video</li>
+  <li><tt>ma:hasGenre</tt> points to the resources identifying the YouTube categories of the video</li>
+  <li><tt>ma:hasRating</tt> contains the average user rating of the video at the time of retrieval</li>
+  <li><tt>ma:copyright</tt> describes the license terms of the video</li>
   <li><tt>ma:hasCompression</tt> and <tt>ma:format</tt> will describe the content format used for storing the video</li>
-  <li><tt>ma:duration</tt> will contain the duration of the video in seconds</li>
-  <li><tt>foaf:thumbnail</tt> will contain references to thumbnail images of the video</li>
-  <li><tt>sioc:num_views</tt> will contain the number of views of the video at the time of retrieval</li>
+  <li><tt>ma:duration</tt> contains the duration of the video in seconds</li>
+  <li><tt>foaf:thumbnail</tt> contains references to thumbnail images of the video</li>
+  <li><tt>sioc:num_views</tt> contains the number of views of the video at the time of retrieval</li>
 </ul><p>Similarly, channels and playlists are represented using the Media Collections offered by the Ontology for Media Resources:</p>
 <ul>
-  <li><tt>rdf:type</tt> will be <tt>ma:Collection</tt></li>
-  <li><tt>ma:collectionName</tt> will be the title of the channel or playlist</li>
+  <li><tt>rdf:type</tt> is <tt>ma:Collection</tt></li>
+  <li><tt>ma:collectionName</tt> is the title of the channel or playlist</li>
   <li><tt>ma:hasMember</tt> points to the resource URIs of all videos contained in the channel or playlist</li>
-  <li><tt>ma:hasCreator</tt> and <tt>ma:hasPublisher</tt> will point to the resource URI of the user who uploaded the video</li>
+  <li><tt>ma:hasCreator</tt> and <tt>ma:hasPublisher</tt> point to the resource URI of the user who uploaded the video</li>
 </ul></div><div class="section"><h2>Vimeo Module<a name="Vimeo_Module"></a></h2><p>Similar to the YouTube module, the Vimeo module allows you to access Vimeo videos and groups (including channels) as if they were Linked Data resources. When requesting metadata about a resource, it redirects the request to the Vimeo API, processes the result, and maps the proprietary properties to RDF using the <a class="externalLink" href="http://www.w3.org/TR/mediaont-10/">Ontology for Media Resources</a>. It is thus possible to access Vimeo and YouTube videos in the same way. To use the Vimeo module in your project, add the following dependency to your build:</p>
 <div class="source"><pre class="prettyprint">&lt;dependency&gt;
     &lt;groupId&gt;org.apache.marmotta&lt;/groupId&gt;
@@ -341,20 +352,20 @@
 &lt;/dependency&gt;
 </pre></div><p>The library will automatically register endpoints with default configurations for Vimeo videos, groups and channels accessible via the main Vimeo website (<a class="externalLink" href="http://vimeo.com)">http://vimeo.com)</a>. With this configuration you can directly retrieve metadata about Vimeo resources using the LDClient; no further configuration is needed.</p><p>Using the <a class="externalLink" href="http://www.w3.org/TR/mediaont-10/">Ontology for Media Resources</a>, the module will create the following triples when requesting a video resource. Note that the Vimeo API provides significantly fewer details than the YouTube API:</p>
 <ul>
-  <li><tt>rdf:type</tt> will be <tt>ma:MediaResource</tt> and <tt>ma:VideoTrack</tt></li>
-  <li><tt>ma:title</tt> will be the title of the video on Vimeo</li>
-  <li><tt>ma:locator</tt> will be the URL of the actual video on the platform</li>
-  <li><tt>ma:hasCreator</tt> and <tt>ma:hasPublisher</tt> will point to the resource URI of the user who uploaded the video</li>
-  <li><tt>ma:date</tt> will be set to the publication date of the video</li>
-  <li><tt>ma:description</tt> will contain the textual description provided by the publisher</li>
-  <li><tt>ma:hasKeyword</tt> will contain the keywords the publisher associated with the video</li>
-  <li><tt>ma:duration</tt> will contain the duration of the video in seconds</li>
-  <li><tt>foaf:thumbnail</tt> will contain references to thumbnail images of the video</li>
-  <li><tt>sioc:num_views</tt> will contain the number of views of the video at the time of retrieval</li>
-  <li><tt>sioc:num_replies</tt> will contain the number of comments on the video at the time of retrieval</li>
+  <li><tt>rdf:type</tt> is <tt>ma:MediaResource</tt> and <tt>ma:VideoTrack</tt></li>
+  <li><tt>ma:title</tt> is the title of the video on Vimeo</li>
+  <li><tt>ma:locator</tt> is the URL of the actual video on the platform</li>
+  <li><tt>ma:hasCreator</tt> and <tt>ma:hasPublisher</tt> point to the resource URI of the user who uploaded the video</li>
+  <li><tt>ma:date</tt> is set to the publication date of the video</li>
+  <li><tt>ma:description</tt> contains the textual description provided by the publisher</li>
+  <li><tt>ma:hasKeyword</tt> contains the keywords the publisher associated with the video</li>
+  <li><tt>ma:duration</tt> contains the duration of the video in seconds</li>
+  <li><tt>foaf:thumbnail</tt> contains references to thumbnail images of the video</li>
+  <li><tt>sioc:num_views</tt> contains the number of views of the video at the time of retrieval</li>
+  <li><tt>sioc:num_replies</tt> contains the number of comments on the video at the time of retrieval</li>
 </ul><p>Similarly, channels and playlists are represented using the Media Collections offered by the Ontology for Media Resources:</p>
 <ul>
-  <li><tt>rdf:type</tt> will be <tt>ma:Collection</tt></li>
+  <li><tt>rdf:type</tt> is <tt>ma:Collection</tt></li>
   <li><tt>ma:hasMember</tt> points to the resource URIs of all videos contained in the channel or group</li>
 </ul></div><div class="section"><h2>MediaWiki Module<a name="MediaWiki_Module"></a></h2><p>The MediaWiki module allows accessing content and metadata of wiki articles managed by a <a class="externalLink" href="http://www.mediawiki.org">MediaWiki</a> system (the most prominent being Wikipedia). When requesting a resource that represents a wiki article, this module will instead retrieve the article from the MediaWiki API offered by the wiki installation and provide access to many metadata properties otherwise not accessible. To use the MediaWiki module in your own project, add the following dependency to your project build:</p>
 <div class="source"><pre class="prettyprint">&lt;dependency&gt;
@@ -364,29 +375,71 @@
 &lt;/dependency&gt;
 </pre></div><p>The module auto-registers endpoints that map requests to all language versions of Wikipedia to the respective API service endpoint, so if you want to request only Wikipedia articles, no further action is needed. For any other MediaWiki installation, it is necessary to create an endpoint configuration and add it to the LDClient instance you are using (see the WikipediaPageEndpoint source code as an example).</p><p>The MediaWiki provider creates triples for a wiki page using SIOC and Dublin Core as follows:</p>
 <ul>
-  <li><tt>rdf:type</tt> will be <tt>sioc:WikiArticle</tt></li>
-  <li><tt>dct:title</tt> will be the title of the wiki page</li>
-  <li><tt>dct:identifier</tt> will be the page identifier in the MediaWiki database</li>
-  <li><tt>dct:modified</tt> will be the last modification date of the page</li>
-  <li><tt>dct:created</tt> will be the creation date of the page</li>
-  <li><tt>content:encoded</tt> will be the content of the page in MediaWiki syntax</li>
-  <li><tt>sioc:topic</tt> will be references to all resource URIs of MediaWiki categories used by the wiki page</li>
-  <li><tt>sioc:links_to</tt> will be references to all resource URIs of wiki pages the page links to</li>
-</ul></div><div class="section"><h2>PHPBB Module<a name="PHPBB_Module"></a></h2><p>The PHPBB module tries to parse the HTML pages generated by a PHPBB discussion forum and extract posts and threads from the content. Since PHPBB does not offer an RSS feed or service API, this extraction is unreliable at best, because it depends heavily on the layout and theming of the page. We tried to make it as generic as possible, though. If you want to try out the module, include the following dependency to your project build:</p>
+  <li><tt>rdf:type</tt> is <tt>sioc:WikiArticle</tt></li>
+  <li><tt>dct:title</tt> is the title of the wiki page</li>
+  <li><tt>dct:identifier</tt> is the page identifier in the MediaWiki database</li>
+  <li><tt>dct:modified</tt> is the last modification date of the page</li>
+  <li><tt>dct:created</tt> is the creation date of the page</li>
+  <li><tt>content:encoded</tt> is the content of the page in MediaWiki syntax</li>
+  <li><tt>sioc:topic</tt> contains references to all resource URIs of MediaWiki categories used by the wiki page</li>
+  <li><tt>sioc:links_to</tt> contains references to all resource URIs of wiki pages the page links to</li>
+</ul></div><div class="section"><h2>PHPBB Module (Experimental)<a name="PHPBB_Module_Experimental"></a></h2><p>The PHPBB module tries to parse the HTML pages generated by a PHPBB discussion forum and extract posts and threads from the content. Since PHPBB does not offer an RSS feed or service API, this extraction is unreliable at best, because it depends heavily on the layout and theming of the page. We tried to make it as generic as possible, though. If you want to try out the module, add the following dependency to your project build:</p>
 <div class="source"><pre class="prettyprint">&lt;dependency&gt;
     &lt;groupId&gt;org.apache.marmotta&lt;/groupId&gt;
     &lt;artifactId&gt;ldclient-provider-phpbb&lt;/artifactId&gt;
     &lt;version&gt;3.0.0-incubating-SNAPSHOT&lt;/version&gt;
 &lt;/dependency&gt;
 </pre></div><p>The module does not auto-register any endpoints, but it offers simplified endpoint classes that allow a more convenient configuration of endpoints. If you want to configure all endpoints for a PHPBB site, simply use</p>
-<div class="source"><pre class="prettyprint">PHPBBEndpoints.getEndpoints(&quot;http://www.carving-ski.de/phpBB/,&quot;Carving Ski Forum&quot;);
-</pre></div><p>where the first argument is the URL of the PHPBB installation and the second argument gives a human-readable name to the endpoint configuration. The method returns a set of endpoints that can be directly added to an LDClient instance.</p></div><div class="section"><h2>LDAP Module<a name="LDAP_Module"></a></h2>
+<div class="source"><pre class="prettyprint">ClientConfiguration config = new ClientConfiguration();
+config.getEndpoints().addAll(PHPBBEndpoints.getEndpoints(&quot;http://www.carving-ski.de/phpBB/&quot;, &quot;Carving Ski Forum&quot;));
+
+LDClient ldclient = new LDClient(config);
+</pre></div><p>where the first argument is the URL of the PHPBB installation and the second argument gives a human-readable name to the endpoint configuration. The method returns a set of endpoints that can be directly added to an LDClient instance.</p><p>The PHPBB module will map information from posts to RDF using the SIOC and Dublin Core vocabulary as follows:</p>
+<ul>
+  <li><tt>rdf:type</tt> is <tt>sioc:Post</tt>, <tt>sioc-types:BoardPost</tt> and <tt>foaf:Document</tt></li>
+  <li><tt>dc:title</tt> is the post title; some cleanup will be performed to remove common Re: and similar prefixes</li>
+  <li><tt>dc:creator</tt> is the name of the user as found in the post</li>
+  <li><tt>dc:description</tt> is the content of the post</li>
+  <li><tt>dc:date</tt> is the date of the post</li>
+</ul><p>A topic or thread in a PHPBB forum will be mapped to RDF as follows:</p>
+<ul>
+  <li><tt>rdf:type</tt> is <tt>sioc:Thread</tt>, <tt>sioc:Collection</tt> and <tt>foaf:Document</tt></li>
+  <li><tt>dc:title</tt> is the thread or topic title</li>
+  <li><tt>dc:creator</tt> is the name of the user who started the thread/topic</li>
+  <li><tt>dc:date</tt> is the date when the topic/thread was started</li>
+  <li><tt>sioc:has_container</tt> points to the resource URI of the forum this thread belongs to</li>
+  <li><tt>sioc:container_of</tt> points to all resource URIs of the posts contained in the thread</li>
+</ul><p>Finally, a forum in a PHPBB installation will be mapped to RDF as follows:</p>
+<ul>
+  <li><tt>rdf:type</tt> is <tt>sioc:Forum</tt>, <tt>sioc:Collection</tt> and <tt>foaf:Document</tt></li>
+  <li><tt>dc:title</tt> is the forum title</li>
+  <li><tt>dc:description</tt> contains a short description of the forum (as given on the webpage)</li>
+  <li><tt>sioc:container_of</tt> points to all resource URIs of the threads/topics contained in the forum</li>
+  <li><tt>sioc:parent_of</tt> points to all resource URIs of subforums contained in the forum</li>
+</ul></div><div class="section"><h2>LDAP Module (Experimental)<a name="LDAP_Module_Experimental"></a></h2><p>The LDAP module allows accessing an LDAP directory to get information about users. LDAP properties will be mapped to the FOAF vocabulary. Endpoint configurations need to provide the two properties loginDN and loginPW to support LDAP directories with access control. In order to use the LDAP module in your project, add the following dependency to your project build:</p>
 <div class="source"><pre class="prettyprint">&lt;dependency&gt;
     &lt;groupId&gt;org.apache.marmotta&lt;/groupId&gt;
     &lt;artifactId&gt;ldclient-provider-ldap&lt;/artifactId&gt;
     &lt;version&gt;3.0.0-incubating-SNAPSHOT&lt;/version&gt;
 &lt;/dependency&gt;
-</pre></div></div>
+</pre></div><p>The module does not auto-register any endpoints. You need to create an endpoint configuration for each LDAP directory you want to access. In most cases the LDAP directory will require a login to access the user data. The credentials can be configured using the loginDN and loginPW properties:</p>
+<div class="source"><pre class="prettyprint">Endpoint endpoint = new Endpoint(&quot;mydirectory&quot;, &quot;LdapFoafProvider&quot;, &quot;ldap://mydirectory.com:389/SRFG/USERS/.*&quot;, &quot;ldap://mydirectory.com:389/dc=salzburgresearch,dc=at&quot;, null, 86400L);
+endpoint.setProperty(&quot;loginDN&quot;,&quot;login&quot;);
+endpoint.setProperty(&quot;loginPW&quot;,&quot;password&quot;);
+
+ClientConfiguration config = new ClientConfiguration();
+config.addEndpoint(endpoint);
+
+LDClient ldclient = new LDClient(config);
+</pre></div><p>The LDAP module will map LDAP properties to FOAF properties as follows:</p>
+<ul>
+  <li>the <tt>rdf:type</tt> is <tt>foaf:Person</tt></li>
+  <li><tt>distinguishedName</tt> is mapped to <tt>dct:identifier</tt></li>
+  <li><tt>name</tt> is mapped to <tt>foaf:name</tt></li>
+  <li><tt>givenName</tt> is mapped to <tt>foaf:firstName</tt></li>
+  <li><tt>sn</tt> is mapped to <tt>foaf:surname</tt></li>
+  <li><tt>mail</tt> is mapped to <tt>foaf:mbox</tt></li>
+</ul></div>
                   </div>
           </div>
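
Note on the LDAP-to-FOAF mapping listed in the last hunk: the attribute table can be read as a simple lookup from LDAP attribute names to RDF property names. The following is a minimal, self-contained sketch of that reading and is not the code of the LdapFoafProvider. The class name LdapFoafMapping, the method toRdf, and the use of plain string maps are assumptions made purely for illustration. Attributes that are not in the list are dropped, and multi-valued attributes are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of the Marmotta LDClient API.
final class LdapFoafMapping {

    // LDAP attribute -> RDF property, copied from the list in the diff above.
    private static final Map<String, String> LDAP_TO_RDF = new LinkedHashMap<>();
    static {
        LDAP_TO_RDF.put("distinguishedName", "dct:identifier");
        LDAP_TO_RDF.put("name", "foaf:name");
        LDAP_TO_RDF.put("givenName", "foaf:firstName");
        LDAP_TO_RDF.put("sn", "foaf:surname");
        LDAP_TO_RDF.put("mail", "foaf:mbox");
    }

    // Translates one LDAP entry (attribute -> value) into property -> value pairs.
    // Values are copied unchanged; how the real provider encodes them
    // (for example as literals or URIs) is not stated in the source.
    static Map<String, String> toRdf(Map<String, String> ldapEntry) {
        Map<String, String> result = new LinkedHashMap<>();
        result.put("rdf:type", "foaf:Person"); // every mapped entry is typed as a person
        for (Map.Entry<String, String> attribute : ldapEntry.entrySet()) {
            String property = LDAP_TO_RDF.get(attribute.getKey());
            if (property != null) {            // attributes outside the list are ignored
                result.put(property, attribute.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> entry = new LinkedHashMap<>();
        entry.put("name", "Jane Doe");           // made-up sample data
        entry.put("sn", "Doe");
        entry.put("telephoneNumber", "555-0100"); // not in the mapping, so it is dropped
        toRdf(entry).forEach((property, value) -> System.out.println(property + " = " + value));
    }
}
```

Keeping the table in one map means that supporting a further attribute would be a one-line change in this sketch; whether the actual provider is organised this way is not confirmed by the commit.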