Posted to commits@jena.apache.org by bu...@apache.org on 2019/02/22 21:31:32 UTC
svn commit: r1040794 - in /websites/staging/jena/trunk/content: ./
documentation/query/text-query.html
Author: buildbot
Date: Fri Feb 22 21:31:32 2019
New Revision: 1040794
Log:
Staging update by buildbot for jena
Modified:
websites/staging/jena/trunk/content/ (props changed)
websites/staging/jena/trunk/content/documentation/query/text-query.html
Propchange: websites/staging/jena/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Feb 22 21:31:32 2019
@@ -1 +1 @@
-1854173
+1854174
Modified: websites/staging/jena/trunk/content/documentation/query/text-query.html
==============================================================================
--- websites/staging/jena/trunk/content/documentation/query/text-query.html (original)
+++ websites/staging/jena/trunk/content/documentation/query/text-query.html Fri Feb 22 21:31:32 2019
@@ -856,54 +856,93 @@ itself.</p>
<p>For simple RDF use, there will be one field, mapping a property to a text
index field. More complex setups, with multiple properties per entity
(URI) are possible.</p>
+<p>The assembler file can be either the default configuration file (.../run/config.ttl)
+or a custom file in the ...run/configuration folder. Note that you can use several files
+simultaneously.</p>
+<p>You have to edit the file (see comments in the assembler code below):</p>
+<ol>
+<li>provide values for paths and a fixed URI for tdb:DatasetTDB</li>
+<li>modify the entity map: add the fields you want to index and any desired options (filters, tokenizers...)</li>
+</ol>
+<p>If your assembler file is run/config.ttl, you can index the dataset with this command:</p>
+<div class="codehilite"><pre><span class="n">java</span> <span class="o">-</span><span class="n">cp</span> <span class="o">./</span><span class="n">fuseki</span><span class="o">-</span><span class="n">server</span><span class="p">.</span><span class="n">jar</span> <span class="n">jena</span><span class="p">.</span><span class="n">textindexer</span> <span class="o">--</span><span class="n">desc</span><span class="p">=</span><span class="n">run</span><span class="o">/</span><span class="n">config</span><span class="p">.</span><span class="n">ttl</span>
+</pre></div>
+
+
<p>Once configured, any data added to the text dataset is automatically
-indexed as well.</p>
+indexed as well (see <a href="#building-a-text-index">Building a Text Index</a>).</p>
<h3 id="text-dataset-assembler">Text Dataset Assembler<a class="headerlink" href="#text-dataset-assembler" title="Permanent link">¶</a></h3>
-<p>The following is an example of a TDB dataset with a text index.</p>
-<div class="codehilite"><pre><span class="n">PREFIX</span> <span class="p">:</span> <span class="o"><</span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">localhost</span><span class="o">/</span><span class="n">jena_example</span><span class="o">/</span>#<span class="o">></span>
-<span class="n">PREFIX</span> <span class="n">rdf</span><span class="p">:</span> <span class="o"><</span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">www</span><span class="p">.</span><span class="n">w3</span><span class="p">.</span><span class="n">org</span><span class="o">/</span>1999<span class="o">/</span>02<span class="o">/</span>22<span class="o">-</span><span class="n">rdf</span><span class="o">-</span><span class="n">syntax</span><span class="o">-</span><span class="n">ns</span>#<span class="o">></span>
-<span class="n">PREFIX</span> <span class="n">rdfs</span><span class="p">:</span> <span class="o"><</span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">www</span><span class="p">.</span><span class="n">w3</span><span class="p">.</span><span class="n">org</span><span class="o">/</span>2000<span class="o">/</span>01<span class="o">/</span><span class="n">rdf</span><span class="o">-</span><span class="n">schema</span>#<span class="o">></span>
-<span class="n">PREFIX</span> <span class="n">tdb</span><span class="p">:</span> <span class="o"><</span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">jena</span><span class="p">.</span><span class="n">hpl</span><span class="p">.</span><span class="n">hp</span><span class="p">.</span><span class="n">com</span><span class="o">/</span>2008<span class="o">/</span><span class="n">tdb</span>#<span class="o">></span>
-<span class="n">PREFIX</span> <span class="n">ja</span><span class="p">:</span> <span class="o"><</span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">jena</span><span class="p">.</span><span class="n">hpl</span><span class="p">.</span><span class="n">hp</span><span class="p">.</span><span class="n">com</span><span class="o">/</span>2005<span class="o">/</span>11<span class="o">/</span><span class="n">Assembler</span>#<span class="o">></span>
-<span class="n">PREFIX</span> <span class="n">text</span><span class="p">:</span> <span class="o"><</span><span class="n">http</span><span class="p">:</span><span class="o">//</span><span class="n">jena</span><span class="p">.</span><span class="n">apache</span><span class="p">.</span><span class="n">org</span><span class="o">/</span><span class="n">text</span>#<span class="o">></span>
-
-## <span class="n">Example</span> <span class="n">of</span> <span class="n">a</span> <span class="n">TDB</span> <span class="n">dataset</span> <span class="n">and</span> <span class="n">text</span> <span class="n">index</span>
-
-# <span class="n">A</span> <span class="n">TextDataset</span> <span class="n">is</span> <span class="n">a</span> <span class="n">regular</span> <span class="n">dataset</span> <span class="n">with</span> <span class="n">a</span> <span class="n">text</span> <span class="n">index</span><span class="p">.</span>
-<span class="n">text</span><span class="p">:</span><span class="n">TextDataset</span> <span class="n">rdfs</span><span class="p">:</span><span class="n">subClassOf</span> <span class="n">ja</span><span class="p">:</span><span class="n">RDFDataset</span> <span class="p">.</span>
-# <span class="n">Lucene</span> <span class="n">index</span>
-<span class="n">text</span><span class="p">:</span><span class="n">TextIndexLucene</span> <span class="n">rdfs</span><span class="p">:</span><span class="n">subClassOf</span> <span class="n">text</span><span class="p">:</span><span class="n">TextIndex</span> <span class="p">.</span>
-# <span class="n">Elasticsearch</span> <span class="n">index</span>
-<span class="n">text</span><span class="p">:</span><span class="n">TextIndexES</span> <span class="n">rdfs</span><span class="p">:</span><span class="n">subClassOf</span> <span class="n">text</span><span class="p">:</span><span class="n">TextIndex</span> <span class="p">.</span>
-
-## <span class="o">---------------------------------------------------------------</span>
-## <span class="n">This</span> <span class="n">URI</span> <span class="n">must</span> <span class="n">be</span> <span class="n">fixed</span> <span class="o">-</span> <span class="n">it</span><span class="o">'</span><span class="n">s</span> <span class="n">used</span> <span class="n">to</span> <span class="n">assemble</span> <span class="n">the</span> <span class="n">text</span> <span class="n">dataset</span><span class="p">.</span>
-
-<span class="p">:</span><span class="n">text_dataset</span> <span class="n">rdf</span><span class="p">:</span><span class="n">type</span> <span class="n">text</span><span class="p">:</span><span class="n">TextDataset</span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span class="n">dataset</span> <span class="o"><</span>#<span class="n">dataset</span><span class="o">></span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span class="n">index</span> <span class="o"><</span>#<span class="n">indexLucene</span><span class="o">></span> <span class="p">;</span>
- <span class="p">.</span>
-
-# <span class="n">A</span> <span class="n">TDB</span> <span class="n">dataset</span> <span class="n">used</span> <span class="k">for</span> <span class="n">RDF</span> <span class="n">storage</span>
-<span class="o"><</span>#<span class="n">dataset</span><span class="o">></span> <span class="n">rdf</span><span class="p">:</span><span class="n">type</span> <span class="n">tdb</span><span class="p">:</span><span class="n">DatasetTDB</span> <span class="p">;</span>
- <span class="n">tdb</span><span class="p">:</span><span class="n">location</span> "<span class="n">DB</span>" <span class="p">;</span>
- <span class="n">tdb</span><span class="p">:</span><span class="n">unionDefaultGraph</span> <span class="n">true</span> <span class="p">;</span> # <span class="n">Optional</span>
- <span class="p">.</span>
-
-# <span class="n">Text</span> <span class="n">index</span> <span class="n">description</span>
-<span class="o"><</span>#<span class="n">indexLucene</span><span class="o">></span> <span class="n">a</span> <span class="n">text</span><span class="p">:</span><span class="n">TextIndexLucene</span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span class="n">directory</span> <span class="o"><</span><span class="n">file</span><span class="p">:</span><span class="o">/</span><span class="n">some</span><span class="o">/</span><span class="n">path</span><span class="o">/</span><span class="n">lucene</span><span class="o">-</span><span class="n">index</span><span class="o">></span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span class="n">entityMap</span> <span class="o"><</span>#<span class="n">entMap</span><span class="o">></span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span class="n">storeValues</span> <span class="n">true</span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span class="n">analyzer</span> <span class="p">[</span> <span class="n">a</span> <span class="n">text</span><span class="p">:</span><span class="n">StandardAnalyzer</span> <span class="p">]</span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span class="n">queryAnalyzer</span> <span class="p">[</span> <span class="n">a</span> <span class="n">text</span><span class="p">:</span><span class="n">KeywordAnalyzer</span> <span class="p">]</span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span class="n">queryParser</span> <span class="n">text</span><span class="p">:</span><span class="n">AnalyzingQueryParser</span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span class="n">defineAnalyzers</span> <span class="p">[</span> <span class="p">.</span> <span class="p">.</span> <span class="p">.</span> <span class="p">]</span> <span class="p">;</span>
- <span class="n">text</span><span class="p">:</span><span class="n">multilingualSupport</span> <span class="n">true</span> <span class="p">;</span>
- <span class="p">.</span>
+<p>The following is an example of an assembler file defining a TDB dataset with a Lucene text index.</p>
+<div class="codehilite"><pre><span class="c1">######## Example of a TDB dataset and text index#########################</span>
+<span class="c1"># The main doc sources are:</span>
+<span class="c1"># - https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html</span>
+<span class="c1"># - https://jena.apache.org/documentation/assembler/assembler-howto.html</span>
+<span class="c1"># - https://jena.apache.org/documentation/assembler/assembler.ttl</span>
+<span class="c1"># See https://jena.apache.org/documentation/fuseki2/fuseki-layout.html for the destination of this file.</span>
+<span class="c1">#########################################################################</span>
+
+<span class="p">@</span>prefix : <span class="o"><</span>http:<span class="o">//</span>localhost<span class="o">/</span>jena_example<span class="o">/</span><span class="c1">#> .</span>
+<span class="p">@</span>prefix rdf: <span class="o"><</span>http:<span class="o">//</span>www.w3.org<span class="o">/</span><span class="m">1999</span><span class="o">/</span><span class="m">02</span><span class="o">/</span><span class="m">22</span><span class="o">-</span>rdf<span class="o">-</span>syntax<span class="o">-</span>ns<span class="c1">#> .</span>
+<span class="p">@</span>prefix rdfs: <span class="o"><</span>http:<span class="o">//</span>www.w3.org<span class="o">/</span><span class="m">2000</span><span class="o">/</span><span class="m">01</span><span class="o">/</span>rdf<span class="o">-</span>schema<span class="c1">#> .</span>
+<span class="p">@</span>prefix tdb: <span class="o"><</span>http:<span class="o">//</span>jena.hpl.hp.com<span class="o">/</span><span class="m">2008</span><span class="o">/</span>tdb<span class="c1">#> .</span>
+<span class="p">@</span>prefix text: <span class="o"><</span>http:<span class="o">//</span>jena.apache.org<span class="o">/</span>text<span class="c1">#> .</span>
+<span class="p">@</span>prefix skos: <span class="o"><</span>http:<span class="o">//</span>www.w3.org<span class="o">/</span><span class="m">2004</span><span class="o">/</span><span class="m">02</span><span class="o">/</span>skos<span class="o">/</span>core<span class="c1">#> .</span>
+<span class="p">@</span>prefix fuseki: <span class="o"><</span>http:<span class="o">//</span>jena.apache.org<span class="o">/</span>fuseki<span class="c1">#> .</span>
+
+<span class="p">[]</span> rdf:type fuseki:Server <span class="p">;</span>
+ fuseki:services <span class="p">(</span>
+ :myservice
+ <span class="p">)</span> <span class="m">.</span>
+
+:myservice rdf:type fuseki:Service <span class="p">;</span>
+ fuseki:name <span class="s">"myds"</span> <span class="p">;</span> <span class="c1"># e.g., `s-query --service=http://localhost:3030/myds "select * ..."`</span>
+ fuseki:serviceQuery <span class="s">"query"</span> <span class="p">;</span> <span class="c1"># SPARQL query service</span>
+ fuseki:serviceUpdate <span class="s">"update"</span> <span class="p">;</span> <span class="c1"># SPARQL update service</span>
+ fuseki:serviceUpload <span class="s">"upload"</span> <span class="p">;</span> <span class="c1"># Non-SPARQL upload service</span>
+ fuseki:serviceReadWriteGraphStore <span class="s">"data"</span> <span class="p">;</span> <span class="c1"># SPARQL Graph store protocol (read and write)</span>
+ fuseki:dataset :text_dataset <span class="p">;</span>
+ <span class="m">.</span>
+
+<span class="c1">## ---------------------------------------------------------------</span>
+
+<span class="c1"># A TextDataset is a regular dataset with a text index.</span>
+:text_dataset rdf:type text:TextDataset <span class="p">;</span>
+ text:dataset :mydataset <span class="p">;</span> <span class="c1"># <-- replace `:mydataset` with the desired URI</span>
+ text:index <span class="o"><</span><span class="c1">#indexLucene> ;</span>
+<span class="m">.</span>
+
+<span class="c1"># A TDB dataset used for RDF storage</span>
+:mydataset rdf:type tdb:DatasetTDB <span class="p">;</span> <span class="c1"># <-- replace `:mydataset` with the desired URI - as above</span>
+ tdb:location <span class="s">"DB"</span> <span class="p">;</span>
+ tdb:unionDefaultGraph true <span class="p">;</span> <span class="c1"># Optional</span>
+<span class="m">.</span>
+
+<span class="c1"># Text index description</span>
+<span class="o"><</span><span class="c1">#indexLucene> a text:TextIndexLucene ;</span>
+ text:directory <span class="o"><</span>file:path<span class="o">></span> <span class="p">;</span> <span class="c1"># <-- replace `<file:path>` with your path (e.g., `<file:/.../fuseki/run/databases/MY_INDEX>`)</span>
+ text:entityMap <span class="o"><</span><span class="c1">#entMap> ;</span>
+ text:storeValues true <span class="p">;</span>
+ text:analyzer <span class="p">[</span> a text:StandardAnalyzer <span class="p">]</span> <span class="p">;</span>
+ text:queryAnalyzer <span class="p">[</span> a text:KeywordAnalyzer <span class="p">]</span> <span class="p">;</span>
+ text:queryParser text:AnalyzingQueryParser <span class="p">;</span>
+ text:defineAnalyzers <span class="p">[</span> <span class="m">.</span> <span class="m">.</span> <span class="m">.</span> <span class="p">]</span> <span class="p">;</span>
+ text:multilingualSupport true <span class="p">;</span> <span class="c1"># optional</span>
+<span class="m">.</span>
+<span class="c1"># Entity map (see documentation for other options)</span>
+<span class="o"><</span><span class="c1">#entMap> a text:EntityMap ;</span>
+ text:defaultField <span class="s">"label"</span> <span class="p">;</span>
+ text:entityField <span class="s">"uri"</span> <span class="p">;</span>
+ text:uidField <span class="s">"uid"</span> <span class="p">;</span>
+ text:langField <span class="s">"lang"</span> <span class="p">;</span>
+ text:graphField <span class="s">"graph"</span> <span class="p">;</span>
+ text:map <span class="p">(</span>
+ <span class="p">[</span> text:field <span class="s">"label"</span> <span class="p">;</span>
+ text:predicate skos:prefLabel <span class="p">]</span>
+ <span class="p">)</span> <span class="m">.</span>
</pre></div>
+<p>See below for <a href="#entity-map-definition">more on defining an entity map</a>.</p>
<p>The <code>text:TextDataset</code> has two properties:</p>
<ul>
<li>