Posted to commits@jena.apache.org by bu...@apache.org on 2018/04/21 22:36:11 UTC
svn commit: r1028757 - in /websites/staging/jena/trunk/content: ./ documentation/query/text-query.html
Author: buildbot
Date: Sat Apr 21 22:36:11 2018
New Revision: 1028757
Log:
Staging update by buildbot for jena
Modified:
websites/staging/jena/trunk/content/ (props changed)
websites/staging/jena/trunk/content/documentation/query/text-query.html
Propchange: websites/staging/jena/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Sat Apr 21 22:36:11 2018
@@ -1 +1 @@
-1829755
+1829756
Modified: websites/staging/jena/trunk/content/documentation/query/text-query.html
==============================================================================
--- websites/staging/jena/trunk/content/documentation/query/text-query.html (original)
+++ websites/staging/jena/trunk/content/documentation/query/text-query.html Sat Apr 21 22:36:11 2018
@@ -1456,6 +1456,36 @@ and filters, as in:</p>
</pre></div>
+<p>After 3.7.1, users can also use the JenaText custom filter <code>SelectiveFoldingFilter</code>.
+This filter is not part of Apache Lucene; it is a custom implementation provided
+for JenaText users.</p>
+<p>It is based on Apache Lucene's <code>ASCIIFoldingFilter</code>, with the addition of a
+whitelist of characters that must not be replaced. This is especially useful for languages
+in which certain special characters and diacritical marks are significant when searching.</p>
+<p>Here's an example that defines the filter with <code>ç</code> and <code>ä</code> whitelisted:</p>
+<div class="codehilite"><pre><span class="n">text</span><span class="o">:</span><span class="n">defineAnalyzers</span> <span class="o">(</span>
+ <span class="o">[</span> <span class="n">text</span><span class="o">:</span><span class="n">defineAnalyzer</span> <span class="o">:</span><span class="n">configuredAnalyzer</span> <span class="o">;</span>
+ <span class="n">text</span><span class="o">:</span><span class="n">analyzer</span> <span class="o">[</span>
+ <span class="n">a</span> <span class="n">text</span><span class="o">:</span><span class="n">ConfigurableAnalyzer</span> <span class="o">;</span>
+ <span class="n">text</span><span class="o">:</span><span class="n">tokenizer</span> <span class="o">:</span><span class="n">tokenizer</span> <span class="o">;</span>
+ <span class="n">text</span><span class="o">:</span><span class="n">filters</span> <span class="o">(</span> <span class="o">:</span><span class="n">selectiveFoldingFilter</span> <span class="n">text</span><span class="o">:</span><span class="n">LowerCaseFilter</span> <span class="o">)</span> <span class="o">]</span> <span class="o">]</span>
+ <span class="o">[</span> <span class="n">text</span><span class="o">:</span><span class="n">defineTokenizer</span> <span class="o">:</span><span class="n">tokenizer</span> <span class="o">;</span>
+ <span class="n">text</span><span class="o">:</span><span class="n">tokenizer</span> <span class="o">[</span>
+ <span class="n">a</span> <span class="n">text</span><span class="o">:</span><span class="n">GenericTokenizer</span> <span class="o">;</span>
+ <span class="n">text</span><span class="o">:</span><span class="kd">class</span> <span class="s2">"org.apache.lucene.analysis.core.LowerCaseTokenizer"</span> <span class="o">]</span> <span class="o">]</span>
+ <span class="o">[</span> <span class="n">text</span><span class="o">:</span><span class="n">defineFilter</span> <span class="o">:</span><span class="n">selectiveFoldingFilter</span> <span class="o">;</span>
+ <span class="n">text</span><span class="o">:</span><span class="n">filter</span> <span class="o">[</span>
+ <span class="n">a</span> <span class="n">text</span><span class="o">:</span><span class="n">GenericFilter</span> <span class="o">;</span>
+ <span class="n">text</span><span class="o">:</span><span class="kd">class</span> <span class="s2">"org.apache.jena.query.text.filter.SelectiveFoldingFilter"</span> <span class="o">;</span>
+ <span class="n">text</span><span class="o">:</span><span class="n">params</span> <span class="o">(</span>
+ <span class="o">[</span> <span class="n">text</span><span class="o">:</span><span class="n">paramName</span> <span class="s2">"whitelisted"</span> <span class="o">;</span>
+ <span class="n">text</span><span class="o">:</span><span class="n">paramType</span> <span class="n">text</span><span class="o">:</span><span class="n">TypeSet</span> <span class="o">;</span>
+ <span class="n">text</span><span class="o">:</span><span class="n">paramValue</span> <span class="o">(</span><span class="s2">"ç"</span> <span class="s2">"ä"</span><span class="o">)</span> <span class="o">]</span>
+ <span class="o">)</span> <span class="o">]</span> <span class="o">]</span>
+ <span class="o">)</span> <span class="o">;</span>
+</pre></div>
+
+
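To make the whitelist idea concrete, here is a minimal Java sketch of selective folding. It is not the actual `SelectiveFoldingFilter` code and does not use Lucene: the class `SelectiveFoldingSketch` and the method `foldExcept` are hypothetical names, and the folding step is approximated with the JDK's `java.text.Normalizer` (strip combining marks), which covers fewer characters than Lucene's `ASCIIFoldingFilter`.

```java
import java.text.Normalizer;
import java.util.Set;

public class SelectiveFoldingSketch {

    /**
     * Removes diacritical marks from every character of the input,
     * except for characters contained in the whitelist, which are
     * copied through unchanged. Assumes precomposed (NFC) input and
     * works one UTF-16 char at a time, so it is only an illustration.
     */
    public static String foldExcept(String input, Set<Character> whitelist) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (whitelist.contains(c)) {
                out.append(c); // whitelisted: keep as-is
                continue;
            }
            // Decompose into base character plus combining marks,
            // then keep everything that is not a combining mark.
            String decomposed =
                Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFD);
            for (int j = 0; j < decomposed.length(); j++) {
                char d = decomposed.charAt(j);
                if (Character.getType(d) != Character.NON_SPACING_MARK) {
                    out.append(d);
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Whitelist: c with cedilla and a with diaeresis, as in the
        // configuration example (written as escapes to avoid
        // source-encoding issues).
        Set<Character> whitelist = Set.of('\u00E7', '\u00E4');
        // Input: "facade" with cedilla, then a/o/u with diaeresis,
        // then "cafe" with an acute accent.
        String input = "fa\u00E7ade \u00E4\u00F6\u00FC caf\u00E9";
        System.out.println(foldExcept(input, whitelist));
    }
}
```

With the two whitelisted characters, the cedilla and the a-diaeresis survive while the other accented letters are reduced to their base letters. With an empty whitelist, every accented letter in the sample is folded.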
<h5 id="extending-multilingual-support">Extending multilingual support<a class="headerlink" href="#extending-multilingual-support" title="Permanent link">¶</a></h5>
<p>The <a href="#multilingual-support">Multilingual Support</a> described above allows for a limited set of
ISO 2-letter codes to be used to select from among built-in analyzers using the nullary constructor