Posted to commits@jena.apache.org by bu...@apache.org on 2019/02/22 21:31:32 UTC

svn commit: r1040794 - in /websites/staging/jena/trunk/content: ./ documentation/query/text-query.html

Author: buildbot
Date: Fri Feb 22 21:31:32 2019
New Revision: 1040794

Log:
Staging update by buildbot for jena

Modified:
    websites/staging/jena/trunk/content/   (props changed)
    websites/staging/jena/trunk/content/documentation/query/text-query.html

Propchange: websites/staging/jena/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Feb 22 21:31:32 2019
@@ -1 +1 @@
-1854173
+1854174

Modified: websites/staging/jena/trunk/content/documentation/query/text-query.html
==============================================================================
--- websites/staging/jena/trunk/content/documentation/query/text-query.html (original)
+++ websites/staging/jena/trunk/content/documentation/query/text-query.html Fri Feb 22 21:31:32 2019
@@ -856,54 +856,93 @@ itself.</p>
 <p>For simple RDF use, there will be one field, mapping a property to a text
 index field. More complex setups, with multiple properties per entity
 (URI) are possible.</p>
+<p>The assembler file can be either the default configuration file (.../run/config.ttl)
+or a custom file in the ...run/configuration folder. Note that you can use several files
+simultaneously.</p>
+<p>You have to edit the file (see the comments in the assembler code below):</p>
+<ol>
+<li>provide values for paths and a fixed URI for tdb:DatasetTDB</li>
+<li>modify the entity map: add the fields you want to index and the desired options (filters, tokenizers...)</li>
+</ol>
+<p>If your assembler file is run/config.ttl, you can index the dataset with this command:</p>
+<div class="codehilite"><pre><span class="n">java</span> <span class="o">-</span><span class="n">cp</span> <span class="o">./</span><span class="n">fuseki</span><span class="o">-</span><span class="n">server</span><span class="p">.</span><span class="n">jar</span> <span class="n">jena</span><span class="p">.</span><span class="n">textindexer</span> <span class="o">--</span><span class="n">desc</span><span class="p">=</span><span class="n">run</span><span class="o">/</span><span class="n">config</span><span class="p">.</span><span class="n">ttl</span>
+</pre></div>
+
+
 <p>Once configured, any data added to the text dataset is automatically
-indexed as well.</p>
+indexed as well. See <a href="#building-a-text-index">Building a Text Index</a>.</p>
 <h3 id="text-dataset-assembler">Text Dataset Assembler<a class="headerlink" href="#text-dataset-assembler" title="Permanent link">&para;</a></h3>
-<p>The following is an example of a TDB dataset with a text index.</p>
-<div class="codehilite"><pre><span class="n">PREFIX</span> <span class="p">:</span>        <span class="o">&lt;</span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">localhost</span><span class="o">/</span><span class="n">jena_example</span><span class="o">/</span>#<span class="o">&gt;</span>
-<span class="n">PREFIX</span> <span class="n">rdf</span><span class="p">:</span>     <span class="o">&lt;</span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">www</span><span class="p">.</span><span class="n">w3</span><span class="p">.</span><span class="n">org</span><span class="o">/</span>1999<span class="o">/</span>02<span class="o">/</span>22<span class="o">-</span><span class="n">rdf</span><span class="o">-</span><span class="n">syntax</span><span class="o">-</span><span class="n">ns</span>#<span class="o">&gt;</span>
-<span class="n">PREFIX</span> <span class="n">rdfs</span><span class="p">:</span>    <span class="o">&lt;</span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">www</span><span class="p">.</span><span class="n">w3</span><span class="p">.</span><span class="n">org</span><span class="o">/</span>2000<span class="o">/</span>01<span class="o">/</span><span class="n">rdf</span><span class="o">-</span><span class="n">schema</span>#<span class="o">&gt;</span>
-<span class="n">PREFIX</span> <span class="n">tdb</span><span class="p">:</span>     <span class="o">&lt;</span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">jena</span><span class="p">.</span><span class="n">hpl</span><span class="p">.</span><span class="n">hp</span><span class="p">.</span><span class="n">com</span><span class="o">/</span>2008<span class="o">/</span><span class="n">tdb</span>#<span class="o">&gt;</span>
-<span class="n">PREFIX</span> <span class="n">ja</span><span class="p">:</span>      <span class="o">&lt;</span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">jena</span><span class="p">.</span><span class="n">hpl</span><span class="p">.</span><span class="n">hp</span><span class="p">.</span><span class="n">com</span><span class="o">/</span>2005<span class="o">/</span>11<span class="o">/</span><span class="n">Assembler</span>#<span class="o">&gt;</span>
-<span class="n">PREFIX</span> <span class="n">text</span><span class="p">:</span>    <span class="o">&lt;</span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">jena</span><span class="p">.</span><span class="n">apache</span><span class="p">.</span><span class="n">org</span><span class="o">/</span><span class="n">text</span>#<span class="o">&gt;</span>
-
-## <span class="n">Example</span> <span class="n">of</span> <span class="n">a</span> <span class="n">TDB</span> <span class="n">dataset</span> <span class="n">and</span> <span class="n">text</span> <span class="n">index</span>
-
-# <span class="n">A</span> <span class="n">TextDataset</span> <span class="n">is</span> <span class="n">a</span> <span class="n">regular</span> <span class="n">dataset</span> <span class="n">with</span> <span class="n">a</span> <span class="n">text</span> <span class="n">index</span><span class="p">.</span>
-<span class="n">text</span><span class="p">:</span><span class="n">TextDataset</span>      <span class="n">rdfs</span><span class="p">:</span><span class="n">subClassOf</span>   <span class="n">ja</span><span class="p">:</span><span class="n">RDFDataset</span> <span class="p">.</span>
-# <span class="n">Lucene</span> <span class="n">index</span>
-<span class="n">text</span><span class="p">:</span><span class="n">TextIndexLucene</span>  <span class="n">rdfs</span><span class="p">:</span><span class="n">subClassOf</span>   <span class="n">text</span><span class="p">:</span><span class="n">TextIndex</span> <span class="p">.</span>
-# <span class="n">Elasticsearch</span> <span class="n">index</span>
-<span class="n">text</span><span class="p">:</span><span class="n">TextIndexES</span>    <span class="n">rdfs</span><span class="p">:</span><span class="n">subClassOf</span>   <span class="n">text</span><span class="p">:</span><span class="n">TextIndex</span> <span class="p">.</span>
-
-## <span class="o">---------------------------------------------------------------</span>
-## <span class="n">This</span> <span class="n">URI</span> <span class="n">must</span> <span class="n">be</span> <span class="n">fixed</span> <span class="o">-</span> <span class="n">it</span><span class="o">&#39;</span><span class="n">s</span> <span class="n">used</span> <span class="n">to</span> <span class="n">assemble</span> <span class="n">the</span> <span class="n">text</span> <span class="n">dataset</span><span class="p">.</span>
-
-<span class="p">:</span><span class="n">text_dataset</span> <span class="n">rdf</span><span class="p">:</span><span class="n">type</span>     <span class="n">text</span><span class="p">:</span><span class="n">TextDataset</span> <span class="p">;</span>
-    <span class="n">text</span><span class="p">:</span><span class="n">dataset</span>   <span class="o">&lt;</span>#<span class="n">dataset</span><span class="o">&gt;</span> <span class="p">;</span>
-    <span class="n">text</span><span class="p">:</span><span class="n">index</span>     <span class="o">&lt;</span>#<span class="n">indexLucene</span><span class="o">&gt;</span> <span class="p">;</span>
-    <span class="p">.</span>
-
-# <span class="n">A</span> <span class="n">TDB</span> <span class="n">dataset</span> <span class="n">used</span> <span class="k">for</span> <span class="n">RDF</span> <span class="n">storage</span>
-<span class="o">&lt;</span>#<span class="n">dataset</span><span class="o">&gt;</span> <span class="n">rdf</span><span class="p">:</span><span class="n">type</span>      <span class="n">tdb</span><span class="p">:</span><span class="n">DatasetTDB</span> <span class="p">;</span>
-    <span class="n">tdb</span><span class="p">:</span><span class="n">location</span> &quot;<span class="n">DB</span>&quot; <span class="p">;</span>
-    <span class="n">tdb</span><span class="p">:</span><span class="n">unionDefaultGraph</span> <span class="n">true</span> <span class="p">;</span> # <span class="n">Optional</span>
-    <span class="p">.</span>
-
-# <span class="n">Text</span> <span class="n">index</span> <span class="n">description</span>
-<span class="o">&lt;</span>#<span class="n">indexLucene</span><span class="o">&gt;</span> <span class="n">a</span> <span class="n">text</span><span class="p">:</span><span class="n">TextIndexLucene</span> <span class="p">;</span>
-    <span class="n">text</span><span class="p">:</span><span class="n">directory</span> <span class="o">&lt;</span><span class="n">file</span><span class="p">:</span><span class="o">/</span><span class="n">some</span><span class="o">/</span><span class="n">path</span><span class="o">/</span><span class="n">lucene</span><span class="o">-</span><span class="n">index</span><span class="o">&gt;</span> <span class="p">;</span>
-    <span class="n">text</span><span class="p">:</span><span class="n">entityMap</span> <span class="o">&lt;</span>#<span class="n">entMap</span><span class="o">&gt;</span> <span class="p">;</span>
-    <span class="n">text</span><span class="p">:</span><span class="n">storeValues</span> <span class="n">true</span> <span class="p">;</span> 
-    <span class="n">text</span><span class="p">:</span><span class="n">analyzer</span> <span class="p">[</span> <span class="n">a</span> <span class="n">text</span><span class="p">:</span><span class="n">StandardAnalyzer</span> <span class="p">]</span> <span class="p">;</span>
-    <span class="n">text</span><span class="p">:</span><span class="n">queryAnalyzer</span> <span class="p">[</span> <span class="n">a</span> <span class="n">text</span><span class="p">:</span><span class="n">KeywordAnalyzer</span> <span class="p">]</span> <span class="p">;</span>
-    <span class="n">text</span><span class="p">:</span><span class="n">queryParser</span> <span class="n">text</span><span class="p">:</span><span class="n">AnalyzingQueryParser</span> <span class="p">;</span>
-    <span class="n">text</span><span class="p">:</span><span class="n">defineAnalyzers</span> <span class="p">[</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">]</span> <span class="p">;</span>
-    <span class="n">text</span><span class="p">:</span><span class="n">multilingualSupport</span> <span class="n">true</span> <span class="p">;</span>
- <span class="p">.</span>
+<p>The following is an example of an assembler file defining a TDB dataset with a Lucene text index.</p>
+<div class="codehilite"><pre><span class="c1">######## Example of a TDB dataset and text index #########################</span>
+<span class="c1"># The main doc sources are:</span>
+<span class="c1">#  - https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html</span>
+<span class="c1">#  - https://jena.apache.org/documentation/assembler/assembler-howto.html</span>
+<span class="c1">#  - https://jena.apache.org/documentation/assembler/assembler.ttl</span>
+<span class="c1"># See https://jena.apache.org/documentation/fuseki2/fuseki-layout.html for the destination of this file.</span>
+<span class="c1">#########################################################################</span>
+
+<span class="p">@</span>prefix :        <span class="o">&lt;</span>http:<span class="o">//</span>localhost<span class="o">/</span>jena_example<span class="o">/</span><span class="c1">#&gt; .</span>
+<span class="p">@</span>prefix rdf:     <span class="o">&lt;</span>http:<span class="o">//</span>www.w3.org<span class="o">/</span><span class="m">1999</span><span class="o">/</span><span class="m">02</span><span class="o">/</span><span class="m">22</span><span class="o">-</span>rdf<span class="o">-</span>syntax<span class="o">-</span>ns<span class="c1">#&gt; .</span>
+<span class="p">@</span>prefix rdfs:    <span class="o">&lt;</span>http:<span class="o">//</span>www.w3.org<span class="o">/</span><span class="m">2000</span><span class="o">/</span><span class="m">01</span><span class="o">/</span>rdf<span class="o">-</span>schema<span class="c1">#&gt; .</span>
+<span class="p">@</span>prefix tdb:     <span class="o">&lt;</span>http:<span class="o">//</span>jena.hpl.hp.com<span class="o">/</span><span class="m">2008</span><span class="o">/</span>tdb<span class="c1">#&gt; .</span>
+<span class="p">@</span>prefix text:    <span class="o">&lt;</span>http:<span class="o">//</span>jena.apache.org<span class="o">/</span>text<span class="c1">#&gt; .</span>
+<span class="p">@</span>prefix skos:    <span class="o">&lt;</span>http:<span class="o">//</span>www.w3.org<span class="o">/</span><span class="m">2004</span><span class="o">/</span><span class="m">02</span><span class="o">/</span>skos<span class="o">/</span>core<span class="c1">#&gt; .</span>
+<span class="p">@</span>prefix fuseki:  <span class="o">&lt;</span>http:<span class="o">//</span>jena.apache.org<span class="o">/</span>fuseki<span class="c1">#&gt; .</span>
+
+<span class="p">[]</span> rdf:type fuseki:Server <span class="p">;</span>
+   fuseki:services <span class="p">(</span>
+     :myservice
+   <span class="p">)</span> <span class="m">.</span>
+
+:myservice rdf:type fuseki:Service <span class="p">;</span>
+    fuseki:name                       <span class="s">&quot;myds&quot;</span> <span class="p">;</span>     <span class="c1"># e.g.: `s-query --service=http://localhost:3030/myds &quot;select * ...&quot;`</span>
+    fuseki:serviceQuery               <span class="s">&quot;query&quot;</span> <span class="p">;</span>    <span class="c1"># SPARQL query service</span>
+    fuseki:serviceUpdate              <span class="s">&quot;update&quot;</span> <span class="p">;</span>   <span class="c1"># SPARQL update service</span>
+    fuseki:serviceUpload              <span class="s">&quot;upload&quot;</span> <span class="p">;</span>   <span class="c1"># Non-SPARQL upload service</span>
+    fuseki:serviceReadWriteGraphStore <span class="s">&quot;data&quot;</span> <span class="p">;</span>     <span class="c1"># SPARQL Graph store protocol (read and write)</span>
+    fuseki:dataset                    :text_dataset <span class="p">;</span>
+    <span class="m">.</span>
+
+<span class="c1">## ---------------------------------------------------------------</span>
+
+<span class="c1"># A TextDataset is a regular dataset with a text index.</span>
+:text_dataset rdf:type     text:TextDataset <span class="p">;</span>
+    text:dataset   :mydataset <span class="p">;</span> <span class="c1"># &lt;-- replace `:mydataset` with the desired URI</span>
+    text:index     <span class="o">&lt;</span><span class="c1">#indexLucene&gt; ;</span>
+<span class="m">.</span>
+
+<span class="c1"># A TDB dataset used for RDF storage</span>
+:mydataset rdf:type      tdb:DatasetTDB <span class="p">;</span> <span class="c1"># &lt;-- replace `:mydataset` with the desired URI, as above</span>
+    tdb:location <span class="s">&quot;DB&quot;</span> <span class="p">;</span>
+    tdb:unionDefaultGraph true <span class="p">;</span> <span class="c1"># Optional</span>
+<span class="m">.</span>
+
+<span class="c1"># Text index description</span>
+<span class="o">&lt;</span><span class="c1">#indexLucene&gt; a text:TextIndexLucene ;</span>
+    text:directory <span class="o">&lt;</span>file:path<span class="o">&gt;</span> <span class="p">;</span>  <span class="c1"># &lt;-- replace `&lt;file:path&gt;` with your path (e.g., `&lt;file:/.../fuseki/run/databases/MY_INDEX&gt;`)</span>
+    text:entityMap <span class="o">&lt;</span><span class="c1">#entMap&gt; ;</span>
+    text:storeValues true <span class="p">;</span> 
+    text:analyzer <span class="p">[</span> a text:StandardAnalyzer <span class="p">]</span> <span class="p">;</span>
+    text:queryAnalyzer <span class="p">[</span> a text:KeywordAnalyzer <span class="p">]</span> <span class="p">;</span>
+    text:queryParser text:AnalyzingQueryParser <span class="p">;</span>
+    text:defineAnalyzers <span class="p">[</span> <span class="m">.</span> <span class="m">.</span> <span class="m">.</span> <span class="p">]</span> <span class="p">;</span>
+    text:multilingualSupport true <span class="p">;</span> <span class="c1"># optional</span>
+<span class="m">.</span>
+<span class="c1"># Entity map (see documentation for other options)</span>
+<span class="o">&lt;</span><span class="c1">#entMap&gt; a text:EntityMap ;</span>
+    text:defaultField     <span class="s">&quot;label&quot;</span> <span class="p">;</span>
+    text:entityField      <span class="s">&quot;uri&quot;</span> <span class="p">;</span>
+    text:uidField         <span class="s">&quot;uid&quot;</span> <span class="p">;</span>
+    text:langField        <span class="s">&quot;lang&quot;</span> <span class="p">;</span>
+    text:graphField       <span class="s">&quot;graph&quot;</span> <span class="p">;</span>
+    text:map <span class="p">(</span>
+        <span class="p">[</span> text:field <span class="s">&quot;label&quot;</span> <span class="p">;</span> 
+          text:predicate skos:prefLabel <span class="p">]</span>
+    <span class="p">)</span> <span class="m">.</span>
 </pre></div>
 
 
+<p>See below for <a href="#entity-map-definition">more on defining an entity map</a>.</p>
 <p>The <code>text:TextDataset</code> has two properties:</p>
 <ul>
 <li>