Posted to commits@jena.apache.org by ki...@apache.org on 2018/04/21 22:35:49 UTC
svn commit: r1829756 -
/jena/site/trunk/content/documentation/query/text-query.mdtext
Author: kinow
Date: Sat Apr 21 22:35:49 2018
New Revision: 1829756
URL: http://svn.apache.org/viewvc?rev=1829756&view=rev
Log:
JENA-1488: add selective folding filter documentation
Modified:
jena/site/trunk/content/documentation/query/text-query.mdtext
Modified: jena/site/trunk/content/documentation/query/text-query.mdtext
URL: http://svn.apache.org/viewvc/jena/site/trunk/content/documentation/query/text-query.mdtext?rev=1829756&r1=1829755&r2=1829756&view=diff
==============================================================================
--- jena/site/trunk/content/documentation/query/text-query.mdtext (original)
+++ jena/site/trunk/content/documentation/query/text-query.mdtext Sat Apr 21 22:35:49 2018
@@ -1246,6 +1246,37 @@ and filters, as in:
) ] ]
) ;
+Versions after 3.7.1 also provide the JenaText custom filter `SelectiveFoldingFilter`.
+This filter is not part of Apache Lucene; it is a custom implementation available
+to JenaText users.
+
+It is based on Apache Lucene's `ASCIIFoldingFilter`, with the addition of a
+whitelist of characters that must not be replaced. This is especially useful for languages
+in which certain special characters and diacritical marks are significant when searching.
+
+Here's an example:
+
+    text:defineAnalyzers (
+         [ text:defineAnalyzer :configuredAnalyzer ;
+           text:analyzer [
+                a text:ConfigurableAnalyzer ;
+                text:tokenizer :tokenizer ;
+                text:filters ( :selectiveFoldingFilter text:LowerCaseFilter ) ] ]
+         [ text:defineTokenizer :tokenizer ;
+           text:tokenizer [
+                a text:GenericTokenizer ;
+                text:class "org.apache.lucene.analysis.core.LowerCaseTokenizer" ] ]
+         [ text:defineFilter :selectiveFoldingFilter ;
+           text:filter [
+                a text:GenericFilter ;
+                text:class "org.apache.jena.query.text.filter.SelectiveFoldingFilter" ;
+                text:params (
+                     [ text:paramName "whitelisted" ;
+                       text:paramType text:TypeSet ;
+                       text:paramValue ("ç" "ä") ]
+                     ) ] ]
+       ) ;
+
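Reviewer note (not part of the committed file): the effect of a whitelist such as `("ç" "ä")` can be illustrated with a small standalone sketch. It is not the Jena or Lucene implementation. It approximates folding with Unicode NFD decomposition and removal of combining marks, whereas `ASCIIFoldingFilter` uses its own mapping table that also covers cases such as "ß" and "æ". The function name `selective_fold` is hypothetical.

```python
import unicodedata


def selective_fold(text, whitelisted):
    """Strip diacritics from every character except those in `whitelisted`.

    Approximation only: folding is done via NFD decomposition, not via
    Lucene's ASCIIFoldingFilter mapping table.
    """
    # Compare in NFC so precomposed and decomposed input behave the same.
    keep = {unicodedata.normalize("NFC", c) for c in whitelisted}
    out = []
    for ch in unicodedata.normalize("NFC", text):
        if ch in keep:
            # Whitelisted characters pass through unchanged.
            out.append(ch)
            continue
        # Decompose, then drop combining marks (accents, umlauts, cedillas).
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


if __name__ == "__main__":
    whitelist = {"ç", "ä"}
    for word in ("façade", "café", "Mädchen über"):
        print(word, "->", selective_fold(word, whitelist))
```

With this whitelist, "ç" and "ä" survive while other accented letters such as "é" and "ü" are reduced to their base letters, which is the behaviour the documentation describes for languages where only some diacritics matter for search.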
##### Extending multilingual support
The [Multilingual Support](#multilingual-support) described above allows for a limited set of