Posted to commits@marmotta.apache.org by ss...@apache.org on 2013/02/27 22:54:54 UTC

svn commit: r1450986 - in /incubator/marmotta/site/trunk/content: markdown/ldcache/backends.md.vm markdown/ldcache/introduction.md.vm markdown/ldcache/usage.md.vm markdown/ldcache/wrappers.md.vm markdown/ldpath/language.md.vm site.xml

Author: sschaffert
Date: Wed Feb 27 21:54:53 2013
New Revision: 1450986

URL: http://svn.apache.org/r1450986
Log:
started with documentation for LDCache and LDPath

Added:
    incubator/marmotta/site/trunk/content/markdown/ldcache/backends.md.vm
    incubator/marmotta/site/trunk/content/markdown/ldcache/usage.md.vm
    incubator/marmotta/site/trunk/content/markdown/ldcache/wrappers.md.vm
Modified:
    incubator/marmotta/site/trunk/content/markdown/ldcache/introduction.md.vm
    incubator/marmotta/site/trunk/content/markdown/ldpath/language.md.vm
    incubator/marmotta/site/trunk/content/site.xml

Added: incubator/marmotta/site/trunk/content/markdown/ldcache/backends.md.vm
URL: http://svn.apache.org/viewvc/incubator/marmotta/site/trunk/content/markdown/ldcache/backends.md.vm?rev=1450986&view=auto
==============================================================================
--- incubator/marmotta/site/trunk/content/markdown/ldcache/backends.md.vm (added)
+++ incubator/marmotta/site/trunk/content/markdown/ldcache/backends.md.vm Wed Feb 27 21:54:53 2013
@@ -0,0 +1,3 @@
+# LDCache Backends
+
+@TODO@
\ No newline at end of file

Modified: incubator/marmotta/site/trunk/content/markdown/ldcache/introduction.md.vm
URL: http://svn.apache.org/viewvc/incubator/marmotta/site/trunk/content/markdown/ldcache/introduction.md.vm?rev=1450986&r1=1450985&r2=1450986&view=diff
==============================================================================
--- incubator/marmotta/site/trunk/content/markdown/ldcache/introduction.md.vm (original)
+++ incubator/marmotta/site/trunk/content/markdown/ldcache/introduction.md.vm Wed Feb 27 21:54:53 2013
@@ -1,2 +1,60 @@
-# Linked Data Caching #
+# Linked Data Caching
 
+The Linked Data Caching library offers caching facilities for Linked Data resources. In particular, it provides the
+following two major functionalities:
+
+* it caches the triple results of [LDClient](../ldclient/introduction.html) requests, taking into account expiry times,
+  and dynamically refreshes results if needed
+* it offers transparent access to resources in the Linked Data Cloud by wrapping the repository connection,
+  fetching remote triples as needed (i.e. if the subject of a call to `getStatements(...)` is a resource that is
+  considered external), and including them in the repository results
+
+The Linked Data Caching library consists of a number of modules that can be combined as needed by the project using it.
+In particular, it offers different caching backends and different implementations for hooking into the repository
+connection as needed.
+
+Core Modules
+------------
+
+Like all Apache Marmotta libraries, the core of LDCache consists of two modules:
+
+* **ldcache-api**: contains the interfaces and model classes needed for implementing backends and wrappers
+* **ldcache-core**: contains the main implementation of the caching functionality
+
+Both modules need to be included in order to use the library (see [usage](usage.html)).
+
+Caching Backends
+----------------
+
+Caching backends provide different means for storing the cached triples and corresponding metadata like expiry time.
+Currently, LDCache offers the following caching backends:
+
+* **ldcache-backend-kiwi**: caches the triples in the underlying KiWi triple store in a separate named graph, and
+  stores the caching metadata into a new database table; this implementation is used by the Apache Marmotta platform
+  and provides very efficient transparent caching
+* **ldcache-backend-ehcache**: caches the triples and caching metadata in a (volatile) [EHCache](http://ehcache.org/)
+  cache; since the cache is purely in-memory (the EHCache Open Source edition does not offer persistent caching),
+  the cache will be lost when the system is stopped
+  (**Note:** this cache backend is not yet completed.)
+* **ldcache-backend-mapdb**: caches the triples and caching metadata in a (persistent) [MapDB](http://www.mapdb.org/)
+  cache; the cache is persisted to disk and will be restored when the system is restarted
+  (**Note:** this cache backend is not yet completed.)
+
+
+Connection Wrappers
+-------------------
+
+Connection wrappers override the `getStatements(...)` method of a repository connection and include the triples from
+the Linked Data cache in the result.  Currently, LDCache offers the following connection wrappers:
+
+* **ldcache-sail-generic**: implements a connection wrapper for any kind of Sesame Sail repository and any kind of
+  backend; in case `getStatements(...)` is called, it also retrieves the triples from the cache and merges the
+  results before returning (using a UnionIteration)
+  (**Note:** this connection wrapper is not yet completed.)
+* **ldcache-sail-kiwi**: implements a connection wrapper for a KiWi repository using a KiWi LDCache backend; this is a
+  special case that allows certain optimisations that are not possible for generic wrappers, as the cached triples are
+  already stored in the same triple store; it merely triggers a refresh of the cache when `getStatements(...)` is called
+
+Since it is not possible to query the whole Linked Data Cloud, connection wrappers will typically only work in case
+the subject parameter of `getStatements(...)` is a URI resource and not a wildcard. While this seems a strong restriction,
+it actually gives rise to many interesting use cases, e.g. with the query language [LDPath](../ldpath/introduction.html).

Added: incubator/marmotta/site/trunk/content/markdown/ldcache/usage.md.vm
URL: http://svn.apache.org/viewvc/incubator/marmotta/site/trunk/content/markdown/ldcache/usage.md.vm?rev=1450986&view=auto
==============================================================================
--- incubator/marmotta/site/trunk/content/markdown/ldcache/usage.md.vm (added)
+++ incubator/marmotta/site/trunk/content/markdown/ldcache/usage.md.vm Wed Feb 27 21:54:53 2013
@@ -0,0 +1,4 @@
+# LDCache Usage
+
+@TODO@
+

Added: incubator/marmotta/site/trunk/content/markdown/ldcache/wrappers.md.vm
URL: http://svn.apache.org/viewvc/incubator/marmotta/site/trunk/content/markdown/ldcache/wrappers.md.vm?rev=1450986&view=auto
==============================================================================
--- incubator/marmotta/site/trunk/content/markdown/ldcache/wrappers.md.vm (added)
+++ incubator/marmotta/site/trunk/content/markdown/ldcache/wrappers.md.vm Wed Feb 27 21:54:53 2013
@@ -0,0 +1,3 @@
+# LDCache Connection Wrappers
+
+@TODO@
\ No newline at end of file

Modified: incubator/marmotta/site/trunk/content/markdown/ldpath/language.md.vm
URL: http://svn.apache.org/viewvc/incubator/marmotta/site/trunk/content/markdown/ldpath/language.md.vm?rev=1450986&r1=1450985&r2=1450986&view=diff
==============================================================================
--- incubator/marmotta/site/trunk/content/markdown/ldpath/language.md.vm (original)
+++ incubator/marmotta/site/trunk/content/markdown/ldpath/language.md.vm Wed Feb 27 21:54:53 2013
@@ -1,3 +1,416 @@
-# LDPath Language Reference #
+# LDPath Language Reference
 
-@TODO@
\ No newline at end of file
+Introduction
+------------
+
+LDPath (formerly known as RDF Path) is a path language with a syntax similar to XPath that can be used for selecting
+the values of properties. It was originally developed for configuring the semantic search component of the
+[Linked Media Framework](http://code.google.com/p/lmf/), but has since been moved into a separate, generic module.
+It still contains some syntactical constructs that are only relevant for configuring semantic search; these are marked
+explicitly below.
+
+LDPath Syntax
+-------------
+
+The path language supports the following constructs:
+
+<h3> Namespace Definitions </h3>
+
+Define shortcut names for URI prefixes, like in SPARQL or N3.
+
+**Syntax:**
+
+    @prefix PREFIX : <URI> ;
+
+where PREFIX is the shortcut name for the namespace URI.
+
+**Examples:**
+
+Define `foaf` as the shortcut name for the namespace `http://xmlns.com/foaf/0.1/`:
+
+    @prefix foaf : <http://xmlns.com/foaf/0.1/>;
+
+**Defaults:**
+
+Some common prefixes are predefined and always available without being specified. These are:
+
+    rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
+    rdfs: <http://www.w3.org/2000/01/rdf-schema#>
+    owl: <http://www.w3.org/2002/07/owl#>
+    skos: <http://www.w3.org/2004/02/skos/core#>
+    dc: <http://purl.org/dc/elements/1.1/>
+    xsd: <http://www.w3.org/2001/XMLSchema#>           (LMF base index datatypes/XML Schema)
+    lmf: <http://www.newmedialab.at/lmf/types/1.0/>    (LMF extended index datatypes)
+    fn: <http://www.newmedialab.at/lmf/functions/1.0/> (LMF index functions)
+
+<h3> Search specific constructs </h3>
+
+<h4> Filter Definition (Search Indexing only) </h4>
+
+    @filter TEST
+
+The search index will only contain nodes (documents) that pass the defined filter. See Value Testing below.
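+
+**Example:**
+
+A minimal sketch, assuming the indexed resources are typed with FOAF classes and that a filter definition is
+terminated by `;` like the other definitions. Index only resources of type `foaf:Person`:
+
+    @prefix foaf : <http://xmlns.com/foaf/0.1/> ;
+    @filter rdf:type is foaf:Person ;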
+
+<h4> Boost Definition (Search Indexing only) </h4>
+
+    @boost PATH
+
+Apply a document boost based on the given `PATH`. The path must evaluate to a float value; the default boost is `1.0`.
+
+<h3> Field Definitions </h3>
+
+Define fields in the search index to map to path definitions.
+
+    FIELDNAME = PATH :: FIELDTYPE FIELDCONF
+
+where `PATH` is an `RDF` path and `FIELDTYPE` is one of the available field types. `FIELDCONF` is an optional field
+configuration.
+
+**Example:**
+
+Select the name of the person
+
+    title = foaf:name :: xsd:string ;
+
+<h3> Path Selectors </h3>
+
+The path language supports a number of path selectors that start at the current "context node" and return a collection
+of nodes when applied. The following is a short overview of the different selectors; detailed documentation follows
+below:
+
+* Property Selections (`URI` or `prefix:local`): select the values of a property
+* Reverse Property Selections (`^URI` or `^prefix:local`)
+* Wildcard Selections (`*`): select the values of all properties
+* Self Selector (`.`): select the current context node
+* Path Traversal (`/`): follow a path of selectors recursively
+* Unions (`|`): join the results of two selections in one collection
+* Intersections (`&`): build the intersection of the results of two selections
+* Recursive Selections (`(PATH)+`)
+* Tests (`[...]`): filter the collection based on test criteria
+    * Language Test (`@language`): only literals of a certain language
+    * Type Test (`^^xsdtype`): only literals of a certain type
+    * Path Value Test (`is`): only resources with a subpath yielding a given value
+    * Path Existence Test (`PATH`): only resources where a subpath yields some value
+* Functions (`f(...)`): apply a function on the values of the selections passed as argument
+
+<h4> Property Selections </h4>
+
+A path definition selecting the value of a property. A property is given either as a URI enclosed in `<>` or as a
+namespace prefix and a local name separated by `:`
+
+    <URI> | PREFIX:LOCAL
+
+**Example:**
+
+Select the foaf:name property using namespace prefix and local name:
+
+    title = foaf:name :: xsd:string ;
+
+Select the foaf:name property using the fully qualified URI
+
+    title = <http://xmlns.com/foaf/0.1/name> :: xsd:string;
+
+<h4> Reverse Property Selections </h4>
+
+This is the reverse/inverse operation of the normal *Property Selection*.
+
+Select all nodes connected to the current node via an incoming link, i.e. follow the specified link "backwards":
+
+    ^<URI>
+
+**Example:**
+
+Get the names of all persons that link to the current person via `foaf:knows`:
+
+    friends = ^foaf:knows / foaf:name :: xsd:string;
+
+Note, however, that this kind of navigation is not completely supported
+over the Linked Data Cloud. While RDF itself allows links to be followed in
+either direction, Linked Data typically only publishes the outgoing links of
+a resource, so in most cases it is not possible to know the incoming links
+from external datasets to a resource.
+
+<h4> Wildcard Selections </h4>
+
+Wildcard selections select the values of all properties of a node. They are expressed by a `*`:
+
+    *
+
+**Example:**
+
+Select all properties of the resource:
+
+    all = * :: xsd:string ;
+
+<h4> Path Traversal </h4>
+
+Traverse a path by following several edges in the RDF graph. Each step is separated by a `/`.
+
+    PATH / PATH
+
+Where PATH is an arbitrary path selector (e.g. a property selection or test)
+
+**Example:**
+
+Select the names of all friends:
+
+    friend = foaf:knows/foaf:name :: xsd:string;
+
+<h4> Unions </h4>
+
+Several alternative paths can be merged by using a union `|` between path elements
+
+    PATH | PATH
+
+Where PATH is an arbitrary path selector.
+
+**Example:**
+
+Select the labels or names of all friends:
+
+    friend = foaf:knows/foaf:name | foaf:knows/rdfs:label :: xsd:string;
+
+<h4> Intersections </h4>
+
+The intersection of several paths can be computed by using an intersection `&` between path elements
+
+    PATH & PATH
+
+Where PATH is an arbitrary path selector.
+
+**Example:**
+
+Select values that are both defined for `foaf:interest` and `foaf:topic_interest`:
+
+    topic_interests = foaf:interest & foaf:topic_interest :: xsd:anyURI;
+
+<h4> Recursive Selections </h4>
+
+Recursive selections apply a selector recursively. The recursion can be limited with lower and upper bounds.
+
+    (<SELECTOR>)*
+    (<SELECTOR>)+
+    (<SELECTOR>){n,m}
+    (<SELECTOR>){n,}
+    (<SELECTOR>){,m}
+
+The specified selector is applied *zero or more times* (`*`), *one or more times* (`+`), *n to m times* (`{n,m}`),
+*n or more times* (`{n,}`), or *up to m times* (`{,m}`).
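+
+**Example:**
+
+A minimal sketch, assuming the data is a SKOS thesaurus that links concepts with `skos:broader`. Select the
+preferred labels of all direct and transitive broader concepts:
+
+    ancestors = (skos:broader)+ / skos:prefLabel :: xsd:string ;
+
+Select only the broader concepts that are one or two steps away:
+
+    near_ancestors = (skos:broader){1,2} :: xsd:anyURI ;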
+
+<h4> Groupings </h4>
+
+Path expressions can be grouped to change precedence or to improve readability by enclosing them in parentheses:
+
+    ( PATH )
+
+where PATH is an arbitrary path selector.
+
+**Example:**
+
+Select the labels or names of friends:
+
+    friend = foaf:knows/(foaf:name | rdfs:label) :: xsd:string ;
+
+<h4> Value Testing </h4>
+
+The values of selections can be tested and filtered by adding test conditions in square brackets `[]` after a path selection:
+
+    PATH [TEST]
+
+where PATH is an arbitrary path selector and TEST is a test condition (see below).
+
+
+<h5> Literal Language Test </h5>
+
+Literal language tests select only literal values of the specified language. They are expressed by `@`
+followed by the ISO language tag or the special value `none` to select literals without a language definition.
+
+    @LANGUAGE
+
+where LANGUAGE is the ISO language tag or the value "none".
+
+**Example:**
+
+Select labels with either German language or no defined language:
+
+    title = rdfs:label[@de] | rdfs:label[@none] :: xsd:string ;
+
+<h5> Literal Type Test </h5>
+
+Literal type tests select only literals of a specified type, e.g. to ensure that only decimal values are indexed:
+
+    ^^TYPE
+
+where TYPE is the XML Schema type to select.
+
+**Example:**
+
+Select all literal values of type xsd:decimal:
+
+    decimals = *[^^xsd:decimal] :: xsd:decimal ;
+
+<h5> Resource Path Value Tests </h5>
+
+Resource path value tests only allow resources where a subpath selection matches a certain value condition:
+
+    PATH is VALUE
+
+where PATH is an arbitrary path selection and VALUE is a URI, prefix:local, or literal value definition.
+
+**Example:**
+
+Select all interests of type ex:Food (path condition):
+
+    food = foaf:interest[rdf:type is ex:Food] :: xsd:anyURI;
+
+<h5> Resource Path Existence Tests </h5>
+
+Resource path existence tests only allow resources where a subpath selection selects at least one value:
+
+    PATH
+
+where PATH is an arbitrary path selection.
+
+**Example:**
+
+Select all friends with a foaf:name defined:
+
+    friends = foaf:knows[foaf:name] :: xsd:anyURI;
+
+<h5> Test Conjunction and Disjunction </h5>
+
+Several tests can be connected using `&` (for conjunction/and) or `|` (for disjunction/or).
+
+**Example:**
+
+Select all interests of type ex:Food or type ex:Drink:
+
+    foodstuff = foaf:interest[rdf:type is ex:Food | rdf:type is ex:Drink] :: xsd:anyURI ;
+
+Select all interests of type ex:Food and type ex:Drink:
+
+    fluidfood = foaf:interest[rdf:type is ex:Food & rdf:type is ex:Drink] :: xsd:anyURI ;
+
+<h5> Combinations of Tests </h5>
+
+A path traversal can contain several tests.
+
+**Example:**
+
+Select the Spanish label of all interests of type ex:Food
+
+    foodstuff = foaf:interest[rdf:type is ex:Food]/rdfs:label[@es] :: xsd:string ;
+
+<h3> Functions </h3>
+
+Functions can be used inside the path to transform the results of path queries. A function takes the form
+
+    f(PARAMETERS)
+
+where PARAMETERS is a list of path expressions or direct literals. A function selector returns no result if any of its parameters returns no result.
+
+<h4> Builtin: String Concatenation </h4>
+
+String concatenation is a very frequently needed function and is already built in:
+
+Concatenate the foaf:given and foaf:surname of all friends:
+
+    friends = foaf:knows/fn:concat(foaf:given," ",foaf:surname) :: xsd:string ;
+
+Combination of concatenation and union:
+
+    friends = foaf:knows/(fn:concat(foaf:given," ",foaf:surname) | foaf:name) :: xsd:string ;
+
+<h4> Builtin: List Operation (First and Last) </h4>
+
+`fn:first` selects the first non-empty argument; `fn:last` selects the last non-empty argument.
+
+    label = fn:first(skos:prefLabel[@de], skos:prefLabel) :: xsd:string ;
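+
+As a sketch of the counterpart, and assuming `fn:last` accepts the same kind of arguments as `fn:first`, the
+following would use `skos:prefLabel` when it yields a value and fall back to `rdfs:label` otherwise:
+
+    label = fn:last(rdfs:label, skos:prefLabel) :: xsd:string ;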
+
+<h4> Builtin: String Cleansing </h4>
+
+Removes all html/xml tags from the string representations of the arguments.
+
+    content = fn:removeTags(ex:hasHtmlContent) :: xsd:string ;
+
+<h4> Builtin: XPath </h4>
+
+The nodes are transformed into their string representation and interpreted as XML. The given XPath expression is
+then evaluated on these XML documents.
+
+    title = fn:xpath("//head/title/text()", ex:hasHtmlContent) :: xsd:string ;
+
+<h4> Builtin: Resource Content </h4>
+
+In some cases you will want to retrieve the human-readable content (e.g. in HTML) that is associated with a resource,
+e.g. a website. LDPath has a builtin function to achieve this goal:
+
+    content = fn:content(.) :: lmf:text_en ;
+
+would retrieve the content of the current resource and index it as English text. Arbitrary paths can be used, so this
+function can also be used to retrieve the content of resources selected by more complex path selections:
+
+    homepage_content = fn:content(foaf:homepage) :: lmf:text_en ;
+
+would retrieve the homepage of a foaf:Person and index it as English text.
+
+Index Types (Search Indexing only)
+----------------------------------
+
+All XML Schema built-in datatypes (http://www.w3.org/TR/xmlschema-2/) are supported as field types in the SOLR index.
+In particular, the following field types are frequently used:
+
+* xsd:anyURI - URI values, indexed without further processing like tokenization
+* xsd:string - String values, indexed without further processing like tokenization
+* xsd:decimal - decimal numbers, indexed as long values with range querying
+* xsd:double - double (floating point) numbers, indexed as double values with range querying
+* xsd:dateTime - dates and timestamps, indexed as dates with range querying
+
+In addition to the XML Schema base types, the following extended field types are supported:
+
+* lmf:text_en - English language text, indexed with tokenization and English stopwords, lowercase and stemming
+* lmf:text_de - German language text, indexed with tokenization and German stopwords, lowercase and stemming
+* lmf:text_es - Spanish language text, indexed with tokenization and Spanish stopwords, lowercase and stemming
+* lmf:text_fr - French language text, indexed with tokenization and French stopwords, lowercase and stemming
+* lmf:text_it - Italian language text, indexed with tokenization and Italian stopwords, lowercase and stemming
+* lmf:text_cz - Czech language text, indexed with tokenization and Czech stopwords, lowercase and stemming
+* lmf:location - convert node into [solr.LatLonType](http://wiki.apache.org/solr/SpatialSearch#LatLonType), using `wgs84:lat,wgs84:long` as properties
+* lmf:geohash - convert node into [solr.GeoHashField](http://wiki.apache.org/solr/SpatialSearchDev#Geohash), using `wgs84:lat,wgs84:long` as properties
+* lmf:location_s - interpret evaluated string as [solr.LatLonType](http://wiki.apache.org/solr/SpatialSearch#LatLonType)
+* lmf:geohash_s - interpret evaluated string as [solr.GeoHashField](http://wiki.apache.org/solr/SpatialSearchDev#Geohash)
+
+Field Configuration (Search Indexing only)
+------------------------------------------
+
+Field Configuration allows additional index configuration for each field.
+
+    ( PROP="VAL", ... )
+
+`PROP`s are currently used for the generation of the solr schema.xml (see also [SOLR Field options](http://wiki.apache.org/solr/SchemaXml#Common_field_options)).
+The following properties are interpreted by the LMF:
+
+* `indexed` - (default: `true`) True if this field should be "indexed", required for searching, sorting, and faceting
+* `stored` - (default: `true`) True if the value of the field should be retrievable during a search
+* `compressed` - (default: `false`) True if the value of the field should be stored using gzip compression
+* `compressThreshold`
+* `multiValued` - (default: `true`, except for `LatLonType`s) True if this field may contain multiple values
+* `omitNorms`
+* `omitTermFreqAndPositions`
+* `termVectors`
+* `termPositions`
+* `termOffsets`
+* `copy` - (default: `text_all`) Comma-separated list of fields to copy this field's value to (see [SOLR Copy Fields](http://wiki.apache.org/solr/SchemaXml#Copy_Fields))
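+
+**Example:**
+
+A minimal sketch combining a field definition with a field configuration; the property values are chosen for
+illustration only. Index the name as a single-valued field that is searchable but not stored:
+
+    title = foaf:name :: xsd:string (stored="false", multiValued="false") ;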
+
+
+Examples
+--------
+
+
+<h3> FOAF </h3>
+
+    @prefix foaf : <http://xmlns.com/foaf/0.1/> ;
+    @prefix geo : <http://www.w3.org/2003/01/geo/wgs84_pos#> ;
+    title      = foaf:name | fn:concat(foaf:givename," ",foaf:surname) :: xsd:string ;
+    summary    = dc:description :: lmf:text ;
+    geo        = foaf:based_near :: lmf:location;
+    interest   = foaf:interest / (rdfs:label[@en] | rdfs:label[@none] | <http://rdf.freebase.com/ns/type.object.name>[@en]) :: xsd:string;
+    friends    = foaf:knows / (foaf:name | fn:concat(foaf:givename," ",foaf:surname)) :: xsd:string;
+    countrycode = foaf:based_near / <http://www.geonames.org/ontology#countryCode> :: xsd:string ;
+    type       = rdf:type :: xsd:anyURI ;

Modified: incubator/marmotta/site/trunk/content/site.xml
URL: http://svn.apache.org/viewvc/incubator/marmotta/site/trunk/content/site.xml?rev=1450986&r1=1450985&r2=1450986&view=diff
==============================================================================
--- incubator/marmotta/site/trunk/content/site.xml (original)
+++ incubator/marmotta/site/trunk/content/site.xml Wed Feb 27 21:54:53 2013
@@ -96,6 +96,9 @@
         </menu>
         <menu name="LDCache">
             <item name="Introduction" href="ldcache/introduction.html"/>
+            <item name="Usage" href="ldcache/usage.html"/>
+            <item name="Backends" href="ldcache/backends.html"/>
+            <item name="Wrappers" href="ldcache/wrappers.html"/>
         </menu>
         <menu name="LDPath">
             <item name="Introduction" href="ldpath/introduction.html"/>