Posted to commits@jena.apache.org by os...@apache.org on 2015/11/17 11:14:08 UTC

svn commit: r1714748 - /jena/site/trunk/content/documentation/query/text-query.mdtext

Author: osma
Date: Tue Nov 17 10:14:08 2015
New Revision: 1714748

URL: http://svn.apache.org/viewvc?rev=1714748&view=rev
Log:
update jena-text documentation for JENA-1062 (ConfigurableAnalyzer)

Modified:
    jena/site/trunk/content/documentation/query/text-query.mdtext

Modified: jena/site/trunk/content/documentation/query/text-query.mdtext
URL: http://svn.apache.org/viewvc/jena/site/trunk/content/documentation/query/text-query.mdtext?rev=1714748&r1=1714747&r2=1714748&view=diff
==============================================================================
--- jena/site/trunk/content/documentation/query/text-query.mdtext (original)
+++ jena/site/trunk/content/documentation/query/text-query.mdtext Tue Nov 17 10:14:08 2015
@@ -276,18 +276,47 @@ Lucene index.  For example:
 will configure the index to analyze values of the 'text' field
 using a `StandardAnalyzer` with the given list of stop words.
 
-Other analyzer types that may be specified are `SimpleAnalyzer` and `KeywordAnalyzer`,
-neither of which has any configuration parameters. See the Lucene documentation
-for details of what these analyzers do. 
-In addition, Jena provides `LowerCaseKeywordAnalyzer`,
-which is a case-insensitive version of `KeywordAnalyzer`.
-
-In Jena 3.0.0:
-
-Support for the new `LocalizedAnalyzer` has been introduced to deal with Lucene 
-language specific analyzers. 
-See [Linguistic Support with Lucene Index](#linguistic-support-with-lucene-index)
-part for details.
+Other analyzer types that may be specified are `SimpleAnalyzer` and
+`KeywordAnalyzer`, neither of which has any configuration parameters. See
+the Lucene documentation for details of what these analyzers do. Jena also
+provides `LowerCaseKeywordAnalyzer`, which is a case-insensitive version of
+`KeywordAnalyzer`, and `ConfigurableAnalyzer` (see below).
+
+`LocalizedAnalyzer`, introduced in Jena 3.0.0, adds support for Lucene's
+language-specific analyzers. See the [Linguistic Support with Lucene
+Index](#linguistic-support-with-lucene-index) section for details.
+
+#### ConfigurableAnalyzer
+
+`ConfigurableAnalyzer` was introduced in Jena 3.0.1. It allows more detailed
+configuration of text analysis parameters by independently selecting a
+`Tokenizer` and zero or more `TokenFilter`s which are applied in order after
+tokenization. See the Lucene documentation for details on what each
+tokenizer and token filter does.
+
+The available `Tokenizer` implementations are:
+
+* `StandardTokenizer`
+* `KeywordTokenizer`
+* `WhitespaceTokenizer`
+* `LetterTokenizer`
+
+The available `TokenFilter` implementations are:
+
+* `StandardFilter`
+* `LowerCaseFilter`
+* `ASCIIFoldingFilter`
+
+The analyzer is configured in the Jena assembler file like this:
+
+    text:analyzer [
+      a text:ConfigurableAnalyzer ;
+      text:tokenizer text:KeywordTokenizer ;
+      text:filters (text:ASCIIFoldingFilter text:LowerCaseFilter)
+    ]
+
+Here, `text:tokenizer` must be one of the four tokenizers listed above and
+the optional `text:filters` property specifies a list of token filters.
 
 #### Analyzer for Query
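
The analysis chain described in the ConfigurableAnalyzer section of this change (one tokenizer, then zero or more token filters applied in order) can be sketched in plain Python. This is only an illustration of the ordering, not Jena or Lucene code. The function names are hypothetical stand-ins. The ASCII folding is a rough approximation based on Unicode decomposition and covers fewer characters than Lucene's `ASCIIFoldingFilter`.

```python
import unicodedata


def keyword_tokenizer(text):
    # Stand-in for a keyword-style tokenizer: the whole input is one token.
    return [text]


def whitespace_tokenizer(text):
    # Stand-in for a whitespace tokenizer: split on runs of whitespace.
    return text.split()


def ascii_folding_filter(tokens):
    # Approximation only: decompose characters and drop combining marks.
    folded = []
    for token in tokens:
        decomposed = unicodedata.normalize("NFKD", token)
        folded.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return folded


def lower_case_filter(tokens):
    # Lower-case every token.
    return [token.lower() for token in tokens]


def analyze(text, tokenizer, filters=()):
    # Tokenize first, then apply each filter in the order given.
    tokens = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


if __name__ == "__main__":
    chain = (ascii_folding_filter, lower_case_filter)
    print(analyze("Crème Brûlée", keyword_tokenizer, chain))
    print(analyze("Crème Brûlée", whitespace_tokenizer, chain))
```

Under these assumptions, the keyword-style tokenizer keeps the whole value as a single token. A chain like the one in the assembler example therefore acts as an exact match that ignores case and accents.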