Posted to commits@lucene.apache.org by tf...@apache.org on 2017/05/12 23:38:52 UTC

[25/58] [abbrv] lucene-solr:jira/solr-10233: squash merge jira/solr-10290 into master

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95968c69/solr/solr-ref-guide/src/filter-descriptions.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/filter-descriptions.adoc b/solr/solr-ref-guide/src/filter-descriptions.adoc
new file mode 100644
index 0000000..f4c8e25
--- /dev/null
+++ b/solr/solr-ref-guide/src/filter-descriptions.adoc
@@ -0,0 +1,1779 @@
+= Filter Descriptions
+:page-shortname: filter-descriptions
+:page-permalink: filter-descriptions.html
+
+Filters examine a stream of tokens and keep them, transform them or discard them, depending on the filter type being used.
+
+You configure each filter with a `<filter>` element in `schema.xml` as a child of `<analyzer>`, following the `<tokenizer>` element. Filter definitions should follow a tokenizer or another filter definition because they take a `TokenStream` as input. For example:
+
+[source,xml]
+----
+<fieldType name="text" class="solr.TextField">
+  <analyzer type="index">
+    <tokenizer class="solr.StandardTokenizerFactory"/>
+    <filter class="solr.LowerCaseFilterFactory"/>...
+  </analyzer>
+</fieldType>
+----
+
+The `class` attribute names a factory class that will instantiate a filter object as needed. Filter factory classes must implement the `org.apache.solr.analysis.TokenFilterFactory` interface. Like tokenizers, filters are also instances of `TokenStream` and thus are producers of tokens. Unlike tokenizers, filters also consume tokens from a `TokenStream`. This allows you to mix and match filters, in any order you prefer, downstream of a tokenizer.
+
+Arguments may be passed to filter factories to modify their behavior by setting attributes on the `<filter>` element. For example:
+
+[source,xml]
+----
+<fieldType name="semicolonDelimited" class="solr.TextField">
+  <analyzer type="query">
+    <tokenizer class="solr.PatternTokenizerFactory" pattern="; " />
+    <filter class="solr.LengthFilterFactory" min="2" max="7"/>
+  </analyzer>
+</fieldType>
+----
+
+The following sections describe the filter factories that are included in this release of Solr.
+
+For user tips about Solr's filters, see http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters.
+
+[[FilterDescriptions-ASCIIFoldingFilter]]
+== ASCII Folding Filter
+
+This filter converts alphabetic, numeric, and symbolic Unicode characters which are not in the Basic Latin Unicode block (the first 128 ASCII characters) to their ASCII equivalents, if one exists. This filter converts characters from the following Unicode blocks:
+
+* http://www.unicode.org/charts/PDF/U0080.pdf[C1 Controls and Latin-1 Supplement] (PDF)
+* http://www.unicode.org/charts/PDF/U0100.pdf[Latin Extended-A] (PDF)
+* http://www.unicode.org/charts/PDF/U0180.pdf[Latin Extended-B] (PDF)
+* http://www.unicode.org/charts/PDF/U1E00.pdf[Latin Extended Additional] (PDF)
+* http://www.unicode.org/charts/PDF/U2C60.pdf[Latin Extended-C] (PDF)
+* http://www.unicode.org/charts/PDF/UA720.pdf[Latin Extended-D] (PDF)
+* http://www.unicode.org/charts/PDF/U0250.pdf[IPA Extensions] (PDF)
+* http://www.unicode.org/charts/PDF/U1D00.pdf[Phonetic Extensions] (PDF)
+* http://www.unicode.org/charts/PDF/U1D80.pdf[Phonetic Extensions Supplement] (PDF)
+* http://www.unicode.org/charts/PDF/U2000.pdf[General Punctuation] (PDF)
+* http://www.unicode.org/charts/PDF/U2070.pdf[Superscripts and Subscripts] (PDF)
+* http://www.unicode.org/charts/PDF/U2460.pdf[Enclosed Alphanumerics] (PDF)
+* http://www.unicode.org/charts/PDF/U2700.pdf[Dingbats] (PDF)
+* http://www.unicode.org/charts/PDF/U2E00.pdf[Supplemental Punctuation] (PDF)
+* http://www.unicode.org/charts/PDF/UFB00.pdf[Alphabetic Presentation Forms] (PDF)
+* http://www.unicode.org/charts/PDF/UFF00.pdf[Halfwidth and Fullwidth Forms] (PDF)
+
+*Factory class:* `solr.ASCIIFoldingFilterFactory`
+
+*Arguments:*
+
+`preserveOriginal`:: (boolean, default false) If true, the original token is preserved: "thé" -> "the", "thé"
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.WhitespaceTokenizer"/>
+  <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="false" />
+</analyzer>
+----
+
+*In:* "á" (Unicode character 00E1)
+
+*Out:* "a" (ASCII character 97)
+
+[[FilterDescriptions-Beider-MorseFilter]]
+== Beider-Morse Filter
+
+Implements the Beider-Morse Phonetic Matching (BMPM) algorithm, which allows identification of similar names, even if they are spelled differently or in different languages. More information about how this works is available in the section on <<phonetic-matching.adoc#PhoneticMatching-Beider-MorsePhoneticMatching_BMPM_,Phonetic Matching>>.
+
+[IMPORTANT]
+====
+BeiderMorseFilter changed its behavior in Solr 5.0 due to an update to version 3.04 of the BMPM algorithm. Older versions of Solr implemented BMPM version 3.00 (see http://stevemorse.org/phoneticinfo.htm). Any index built using this filter with earlier versions of Solr will need to be rebuilt.
+====
+
+*Factory class:* `solr.BeiderMorseFilterFactory`
+
+*Arguments:*
+
+`nameType`:: Types of names. Valid values are GENERIC, ASHKENAZI, or SEPHARDIC. If not processing Ashkenazi or Sephardic names, use GENERIC.
+
+`ruleType`:: Types of rules to apply. Valid values are APPROX or EXACT.
+
+`concat`:: Defines if multiple possible matches should be combined with a pipe ("|").
+
+`languageSet`:: The language set to use. The value "auto" will allow the filter to identify the language, or a comma-separated list can be supplied.
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.BeiderMorseFilterFactory" nameType="GENERIC" ruleType="APPROX" concat="true" languageSet="auto">
+  </filter>
+</analyzer>
+----
+
+[[FilterDescriptions-ClassicFilter]]
+== Classic Filter
+
+This filter takes the output of the <<tokenizers.adoc#Tokenizers-ClassicTokenizer,Classic Tokenizer>> and strips periods from acronyms and "'s" from possessives.
+
+*Factory class:* `solr.ClassicFilterFactory`
+
+*Arguments:* None
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.ClassicTokenizerFactory"/>
+  <filter class="solr.ClassicFilterFactory"/>
+</analyzer>
+----
+
+*In:* "I.B.M. cat's can't"
+
+*Tokenizer to Filter:* "I.B.M", "cat's", "can't"
+
+*Out:* "IBM", "cat", "can't"
+
+[[FilterDescriptions-CommonGramsFilter]]
+== Common Grams Filter
+
+This filter creates word shingles by combining common tokens such as stop words with regular tokens. This is useful for creating phrase queries containing common words, such as "the cat." Solr normally ignores stop words in queried phrases, so searching for "the cat" would return all matches for the word "cat."
+
+*Factory class:* `solr.CommonGramsFilterFactory`
+
+*Arguments:*
+
+`words`:: (a common word file in .txt format) Provide the name of a common word file, such as `stopwords.txt`.
+
+`format`:: (optional) If the stopwords list has been formatted for Snowball, you can specify `format="snowball"` so Solr can read the stopwords file.
+
+`ignoreCase`:: (boolean) If true, the filter ignores the case of words when comparing them to the common word file. The default is false.
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true"/>
+</analyzer>
+----
+
+*In:* "the Cat"
+
+*Tokenizer to Filter:* "the", "Cat"
+
+*Out:* "the_cat"
+
+[[FilterDescriptions-CollationKeyFilter]]
+== Collation Key Filter
+
+Collation allows sorting of text in a language-sensitive way. It is usually used for sorting, but can also be used with advanced searches. We've covered this in much more detail in the section on <<language-analysis.adoc#LanguageAnalysis-UnicodeCollation,Unicode Collation>>.
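+
+As a quick illustration, here is a minimal sketch of a collation-based field type for locale-aware sorting, mirroring the configuration shown in the Unicode Collation section (the `language` and `strength` values here are illustrative):
+
+[source,xml]
+----
+<fieldType name="collatedEnglish" class="solr.TextField">
+  <analyzer>
+    <tokenizer class="solr.KeywordTokenizerFactory"/>
+    <filter class="solr.CollationKeyFilterFactory" language="en" strength="primary"/>
+  </analyzer>
+</fieldType>
+----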
+
+[[FilterDescriptions-Daitch-MokotoffSoundexFilter]]
+== Daitch-Mokotoff Soundex Filter
+
+Implements the Daitch-Mokotoff Soundex algorithm, which allows identification of similar names, even if they are spelled differently. More information about how this works is available in the section on <<phonetic-matching.adoc#phonetic-matching,Phonetic Matching>>.
+
+*Factory class:* `solr.DaitchMokotoffSoundexFilterFactory`
+
+*Arguments:*
+
+`inject` :: (true/false) If true (the default), then new phonetic tokens are added to the stream. Otherwise, tokens are replaced with the phonetic equivalent. Setting this to false will enable phonetic matching, but the exact spelling of the target word may not match.
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.DaitchMokotoffSoundexFilterFactory" inject="true"/>
+</analyzer>
+----
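+
+An illustrative sketch of the default (`inject="true"`) behavior, assuming the standard Daitch-Mokotoff coding in which "Miller" encodes to "689000":
+
+*In:* "Miller"
+
+*Tokenizer to Filter:* "Miller"(1)
+
+*Out:* "Miller"(1), "689000"(1)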
+
+[[FilterDescriptions-DoubleMetaphoneFilter]]
+== Double Metaphone Filter
+
+This filter creates tokens using the http://commons.apache.org/codec/apidocs/org/apache/commons/codec/language/DoubleMetaphone.html[`DoubleMetaphone`] encoding algorithm from commons-codec. For more information, see the <<phonetic-matching.adoc#phonetic-matching,Phonetic Matching>> section.
+
+*Factory class:* `solr.DoubleMetaphoneFilterFactory`
+
+*Arguments:*
+
+`inject`:: (true/false) If true (the default), then new phonetic tokens are added to the stream. Otherwise, tokens are replaced with the phonetic equivalent. Setting this to false will enable phonetic matching, but the exact spelling of the target word may not match.
+
+`maxCodeLength`:: (integer) The maximum length of the code to be generated.
+
+*Example:*
+
+Default behavior for inject (true): keep the original token and add phonetic token(s) at the same position.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.DoubleMetaphoneFilterFactory"/>
+</analyzer>
+----
+
+*In:* "four score and Kuczewski"
+
+*Tokenizer to Filter:* "four"(1), "score"(2), "and"(3), "Kuczewski"(4)
+
+*Out:* "four"(1), "FR"(1), "score"(2), "SKR"(2), "and"(3), "ANT"(3), "Kuczewski"(4), "KSSK"(4), "KXFS"(4)
+
+The phonetic tokens have a position increment of 0, which indicates that they are at the same position as the token they were derived from (immediately preceding). Note that "Kuczewski" has two encodings, which are added at the same position.
+
+*Example:*
+
+Discard original token (`inject="false"`).
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.DoubleMetaphoneFilterFactory" inject="false"/>
+</analyzer>
+----
+
+*In:* "four score and Kuczewski"
+
+*Tokenizer to Filter:* "four"(1), "score"(2), "and"(3), "Kuczewski"(4)
+
+*Out:* "FR"(1), "SKR"(2), "ANT"(3), "KSSK"(4), "KXFS"(4)
+
+Note that "Kuczewski" has two encodings, which are added at the same position.
+
+[[FilterDescriptions-EdgeN-GramFilter]]
+== Edge N-Gram Filter
+
+This filter generates edge n-gram tokens of sizes within the given range.
+
+*Factory class:* `solr.EdgeNGramFilterFactory`
+
+*Arguments:*
+
+`minGramSize`:: (integer, default 1) The minimum gram size.
+
+`maxGramSize`:: (integer, default 1) The maximum gram size.
+
+*Example:*
+
+Default behavior.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.EdgeNGramFilterFactory"/>
+</analyzer>
+----
+
+*In:* "four score and twenty"
+
+*Tokenizer to Filter:* "four", "score", "and", "twenty"
+
+*Out:* "f", "s", "a", "t"
+
+*Example:*
+
+A range of 1 to 4.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="4"/>
+</analyzer>
+----
+
+*In:* "four score"
+
+*Tokenizer to Filter:* "four", "score"
+
+*Out:* "f", "fo", "fou", "four", "s", "sc", "sco", "scor"
+
+*Example:*
+
+A range of 4 to 6.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.EdgeNGramFilterFactory" minGramSize="4" maxGramSize="6"/>
+</analyzer>
+----
+
+*In:* "four score and twenty"
+
+*Tokenizer to Filter:* "four", "score", "and", "twenty"
+
+*Out:* "four", "scor", "score", "twen", "twent", "twenty"
+
+[[FilterDescriptions-EnglishMinimalStemFilter]]
+== English Minimal Stem Filter
+
+This filter stems plural English words to their singular form.
+
+*Factory class:* `solr.EnglishMinimalStemFilterFactory`
+
+*Arguments:* None
+
+*Example:*
+
+[source,xml]
+----
+<analyzer type="index">
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.EnglishMinimalStemFilterFactory"/>
+</analyzer>
+----
+
+*In:* "dogs cats"
+
+*Tokenizer to Filter:* "dogs", "cats"
+
+*Out:* "dog", "cat"
+
+[[FilterDescriptions-EnglishPossessiveFilter]]
+== English Possessive Filter
+
+This filter removes singular possessives (trailing *'s*) from words. Note that plural possessives, e.g. the *s'* in "divers' snorkels", are not removed by this filter.
+
+*Factory class:* `solr.EnglishPossessiveFilterFactory`
+
+*Arguments:* None
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.EnglishPossessiveFilterFactory"/>
+</analyzer>
+----
+
+*In:* "Man's dog bites dogs' man"
+
+*Tokenizer to Filter:* "Man's", "dog", "bites", "dogs'", "man"
+
+*Out:* "Man", "dog", "bites", "dogs'", "man"
+
+[[FilterDescriptions-FingerprintFilter]]
+== Fingerprint Filter
+
+This filter outputs a single token which is a concatenation of the sorted and de-duplicated set of input tokens. This can be useful for clustering/linking use cases.
+
+*Factory class:* `solr.FingerprintFilterFactory`
+
+*Arguments:*
+
+`separator`:: The character used to separate tokens combined into the single output token. Defaults to " " (a space character).
+
+`maxOutputTokenSize`:: The maximum length of the summarized output token. If exceeded, no output token is emitted. Defaults to 1024.
+
+*Example:*
+
+[source,xml]
+----
+<analyzer type="index">
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.FingerprintFilterFactory" separator="_" />
+</analyzer>
+----
+
+*In:* "the quick brown fox jumped over the lazy dog"
+
+*Tokenizer to Filter:* "the", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "dog"
+
+*Out:* "brown_dog_fox_jumped_lazy_over_quick_the"
+
+[[FilterDescriptions-FlattenGraphFilter]]
+== Flatten Graph Filter
+
+This filter must be included in index-time analyzer specifications that include at least one graph-aware filter, such as the Synonym Graph Filter or the Word Delimiter Graph Filter.
+
+*Factory class:* `solr.FlattenGraphFilterFactory`
+
+*Arguments:* None
+
+See the examples below for <<Synonym Graph Filter>> and <<Word Delimiter Graph Filter>>.
+
+[[FilterDescriptions-HunspellStemFilter]]
+== Hunspell Stem Filter
+
+The Hunspell Stem Filter provides support for several languages. You must provide the dictionary (`.dic`) and rules (`.aff`) files for each language you wish to use with the Hunspell Stem Filter. You can download those language files http://wiki.services.openoffice.org/wiki/Dictionaries[here].
+
+Be aware that your results will vary widely based on the quality of the provided dictionary and rules files. For example, some languages have only a minimal word list with no morphological information. On the other hand, for languages that have no stemmer but do have an extensive dictionary file, the Hunspell stemmer may be a good choice.
+
+*Factory class:* `solr.HunspellStemFilterFactory`
+
+*Arguments:*
+
+`dictionary`:: (required) The path of a dictionary file.
+
+`affix`:: (required) The path of a rules file.
+
+`ignoreCase`:: (boolean) controls whether matching is case sensitive or not. The default is false.
+
+`strictAffixParsing`:: (boolean) controls whether the affix parsing is strict or not. If true, an error while reading an affix rule causes a ParseException; otherwise it is ignored. The default is true.
+
+*Example:*
+
+[source,xml]
+----
+<analyzer type="index">
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.HunspellStemFilterFactory"
+    dictionary="en_GB.dic"
+    affix="en_GB.aff"
+    ignoreCase="true"
+    strictAffixParsing="true" />
+</analyzer>
+----
+
+*In:* "jump jumping jumped"
+
+*Tokenizer to Filter:* "jump", "jumping", "jumped"
+
+*Out:* "jump", "jump", "jump"
+
+[[FilterDescriptions-HyphenatedWordsFilter]]
+== Hyphenated Words Filter
+
+This filter reconstructs hyphenated words that have been tokenized as two tokens because of a line break or other intervening whitespace in the field text. If a token ends with a hyphen, it is joined with the following token and the hyphen is discarded.
+
+Note that for this filter to work properly, the upstream tokenizer must not remove trailing hyphen characters. This filter is generally only useful at index time.
+
+*Factory class:* `solr.HyphenatedWordsFilterFactory`
+
+*Arguments:* None
+
+*Example:*
+
+[source,xml]
+----
+<analyzer type="index">
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.HyphenatedWordsFilterFactory"/>
+</analyzer>
+----
+
+*In:* "A hyphen- ated word"
+
+*Tokenizer to Filter:* "A", "hyphen-", "ated", "word"
+
+*Out:* "A", "hyphenated", "word"
+
+[[FilterDescriptions-ICUFoldingFilter]]
+== ICU Folding Filter
+
+This filter is a custom Unicode normalization form that applies the foldings specified in http://www.unicode.org/reports/tr30/tr30-4.html[Unicode Technical Report 30] in addition to the `NFKC_Casefold` normalization form as described in <<FilterDescriptions-ICUNormalizer2Filter,ICU Normalizer 2 Filter>>. This filter is a better substitute for the combined behavior of the <<FilterDescriptions-ASCIIFoldingFilter,ASCII Folding Filter>>, <<FilterDescriptions-LowerCaseFilter,Lower Case Filter>>, and <<FilterDescriptions-ICUNormalizer2Filter,ICU Normalizer 2 Filter>>.
+
+To use this filter, see `solr/contrib/analysis-extras/README.txt` for instructions on which jars you need to add to your `solr_home/lib`. For more information about adding jars, see the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in Solrconfig>>.
+
+*Factory class:* `solr.ICUFoldingFilterFactory`
+
+*Arguments:* None
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.ICUFoldingFilterFactory"/>
+</analyzer>
+----
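+
+An illustrative example (ICU folding removes accents and folds case):
+
+*In:* "Résumé"
+
+*Tokenizer to Filter:* "Résumé"
+
+*Out:* "resume"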
+
+For detailed information on this normalization form, see http://www.unicode.org/reports/tr30/tr30-4.html.
+
+[[FilterDescriptions-ICUNormalizer2Filter]]
+== ICU Normalizer 2 Filter
+
+This filter factory normalizes text according to one of five Unicode Normalization Forms as described in http://unicode.org/reports/tr15/[Unicode Standard Annex #15]:
+
+* NFC: (name="nfc" mode="compose") Normalization Form C, canonical decomposition, followed by canonical composition
+* NFD: (name="nfc" mode="decompose") Normalization Form D, canonical decomposition
+* NFKC: (name="nfkc" mode="compose") Normalization Form KC, compatibility decomposition, followed by canonical composition
+* NFKD: (name="nfkc" mode="decompose") Normalization Form KD, compatibility decomposition
+* NFKC_Casefold: (name="nfkc_cf" mode="compose") Normalization Form KC, with additional Unicode case folding. Using the ICU Normalizer 2 Filter is a better-performing substitute for the <<Lower Case Filter>> and NFKC normalization.
+
+*Factory class:* `solr.ICUNormalizer2FilterFactory`
+
+*Arguments:*
+
+`name`:: (string) The name of the normalization form; `nfc`, `nfd`, `nfkc`, `nfkd`, `nfkc_cf`
+
+`mode`:: (string) The mode of Unicode character composition and decomposition; `compose` or `decompose`
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.ICUNormalizer2FilterFactory" name="nfkc_cf" mode="compose"/>
+</analyzer>
+----
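+
+An illustrative example with `nfkc_cf` (compatibility characters such as the "ﬁ" ligature are decomposed, and case is folded; note that accents are kept, unlike with the ICU Folding Filter):
+
+*In:* "Résumé ﬁne"
+
+*Tokenizer to Filter:* "Résumé", "ﬁne"
+
+*Out:* "résumé", "fine"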
+
+For detailed information about these Unicode Normalization Forms, see http://unicode.org/reports/tr15/.
+
+To use this filter, see `solr/contrib/analysis-extras/README.txt` for instructions on which jars you need to add to your `solr_home/lib`.
+
+[[FilterDescriptions-ICUTransformFilter]]
+== ICU Transform Filter
+
+This filter applies http://userguide.icu-project.org/transforms/general[ICU Transforms] to text. This filter supports only ICU System Transforms. Custom rule sets are not supported.
+
+*Factory class:* `solr.ICUTransformFilterFactory`
+
+*Arguments:*
+
+`id`:: (string) The identifier for the ICU System Transform you wish to apply with this filter. For a full list of ICU System Transforms, see http://demo.icu-project.org/icu-bin/translit?TEMPLATE_FILE=data/translit_rule_main.html.
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.ICUTransformFilterFactory" id="Traditional-Simplified"/>
+</analyzer>
+----
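+
+An illustrative example for the `Traditional-Simplified` transform (the Standard Tokenizer emits each CJK character as a separate token):
+
+*In:* "簡化字"
+
+*Tokenizer to Filter:* "簡", "化", "字"
+
+*Out:* "简", "化", "字"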
+
+For detailed information about ICU Transforms, see http://userguide.icu-project.org/transforms/general.
+
+To use this filter, see `solr/contrib/analysis-extras/README.txt` for instructions on which jars you need to add to your `solr_home/lib`.
+
+[[FilterDescriptions-KeepWordFilter]]
+== Keep Word Filter
+
+This filter discards all tokens except those that are listed in the given word list. This is the inverse of the Stop Words Filter. This filter can be useful for building specialized indices for a constrained set of terms.
+
+*Factory class:* `solr.KeepWordFilterFactory`
+
+*Arguments:*
+
+`words`:: (required) Path of a text file containing the list of keep words, one per line. Blank lines and lines that begin with "#" are ignored. This may be an absolute path, or a simple filename in the Solr `conf` directory.
+
+`ignoreCase`:: (true/false) If *true* then comparisons are done case-insensitively. If this argument is true, then the words file is assumed to contain only lowercase words. The default is *false*.
+
+`enablePositionIncrements`:: if `luceneMatchVersion` is `4.3` or earlier and `enablePositionIncrements="false"`, no position holes will be left by this filter when it removes tokens. *This argument is invalid if `luceneMatchVersion` is `5.0` or later.*
+
+*Example:*
+
+Where `keepwords.txt` contains:
+
+[source,text]
+----
+happy
+funny
+silly
+----
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.KeepWordFilterFactory" words="keepwords.txt"/>
+</analyzer>
+----
+
+*In:* "Happy, sad or funny"
+
+*Tokenizer to Filter:* "Happy", "sad", "or", "funny"
+
+*Out:* "funny"
+
+*Example:*
+
+Same `keepwords.txt`, case insensitive:
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.KeepWordFilterFactory" words="keepwords.txt" ignoreCase="true"/>
+</analyzer>
+----
+
+*In:* "Happy, sad or funny"
+
+*Tokenizer to Filter:* "Happy", "sad", "or", "funny"
+
+*Out:* "Happy", "funny"
+
+*Example:*
+
+Using LowerCaseFilterFactory before filtering for keep words, no `ignoreCase` flag.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.LowerCaseFilterFactory"/>
+  <filter class="solr.KeepWordFilterFactory" words="keepwords.txt"/>
+</analyzer>
+----
+
+*In:* "Happy, sad or funny"
+
+*Tokenizer to Filter:* "Happy", "sad", "or", "funny"
+
+*Filter to Filter:* "happy", "sad", "or", "funny"
+
+*Out:* "happy", "funny"
+
+[[FilterDescriptions-KStemFilter]]
+== KStem Filter
+
+KStem is an alternative to the Porter Stem Filter for developers looking for a less aggressive stemmer. KStem was written by Bob Krovetz and ported to Lucene by Sergio Guzman-Lara (UMass Amherst). This stemmer is only appropriate for English language text.
+
+*Factory class:* `solr.KStemFilterFactory`
+
+*Arguments:* None
+
+*Example:*
+
+[source,xml]
+----
+<analyzer type="index">
+  <tokenizer class="solr.StandardTokenizerFactory "/>
+  <filter class="solr.KStemFilterFactory"/>
+</analyzer>
+----
+
+*In:* "jump jumping jumped"
+
+*Tokenizer to Filter:* "jump", "jumping", "jumped"
+
+*Out:* "jump", "jump", "jump"
+
+[[FilterDescriptions-LengthFilter]]
+== Length Filter
+
+This filter passes tokens whose length falls within the min/max limit specified. All other tokens are discarded.
+
+*Factory class:* `solr.LengthFilterFactory`
+
+*Arguments:*
+
+`min`:: (integer, required) Minimum token length. Tokens shorter than this are discarded.
+
+`max`:: (integer, required, must be >= min) Maximum token length. Tokens longer than this are discarded.
+
+`enablePositionIncrements`:: if `luceneMatchVersion` is `4.3` or earlier and `enablePositionIncrements="false"`, no position holes will be left by this filter when it removes tokens. *This argument is invalid if `luceneMatchVersion` is `5.0` or later.*
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.LengthFilterFactory" min="3" max="7"/>
+</analyzer>
+----
+
+*In:* "turn right at Albuquerque"
+
+*Tokenizer to Filter:* "turn", "right", "at", "Albuquerque"
+
+*Out:* "turn", "right"
+
+[[FilterDescriptions-LimitTokenCountFilter]]
+== Limit Token Count Filter
+
+This filter limits the number of accepted tokens, typically useful for index analysis.
+
+By default, this filter ignores any tokens in the wrapped `TokenStream` once the limit has been reached, which can result in `reset()` being called prior to `incrementToken()` returning `false`. For most `TokenStream` implementations this should be acceptable, and faster than consuming the full stream. If you are wrapping a `TokenStream` which requires that the full stream of tokens be exhausted in order to function properly, use the `consumeAllTokens="true"` option.
+
+*Factory class:* `solr.LimitTokenCountFilterFactory`
+
+*Arguments:*
+
+`maxTokenCount`:: (integer, required) Maximum token count. After this limit has been reached, tokens are discarded.
+
+`consumeAllTokens`:: (boolean, defaults to false) Whether to consume (and discard) previous token filters' tokens after the maximum token count has been reached. See description above.
+
+*Example:*
+
+[source,xml]
+----
+<analyzer type="index">
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10"
+          consumeAllTokens="false" />
+</analyzer>
+----
+
+*In:* "1 2 3 4 5 6 7 8 9 10 11 12"
+
+*Tokenizer to Filter:* "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"
+
+*Out:* "1", "2", "3", "4", "5", "6", "7", "8", "9", "10"
+
+[[FilterDescriptions-LimitTokenOffsetFilter]]
+== Limit Token Offset Filter
+
+This filter limits tokens to those before a configured maximum start character offset. This can be useful to limit highlighting, for example.
+
+By default, this filter ignores any tokens in the wrapped `TokenStream` once the limit has been reached, which can result in `reset()` being called prior to `incrementToken()` returning `false`. For most `TokenStream` implementations this should be acceptable, and faster than consuming the full stream. If you are wrapping a `TokenStream` which requires that the full stream of tokens be exhausted in order to function properly, use the `consumeAllTokens="true"` option.
+
+*Factory class:* `solr.LimitTokenOffsetFilterFactory`
+
+*Arguments:*
+
+`maxStartOffset`:: (integer, required) Maximum token start character offset. After this limit has been reached, tokens are discarded.
+
+`consumeAllTokens`:: (boolean, defaults to false) Whether to consume (and discard) previous token filters' tokens after the maximum start offset has been reached. See description above.
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.LimitTokenOffsetFilterFactory" maxStartOffset="10"
+          consumeAllTokens="false" />
+</analyzer>
+----
+
+*In:* "0 2 4 6 8 A C E"
+
+*Tokenizer to Filter:* "0", "2", "4", "6", "8", "A", "C", "E"
+
+*Out:* "0", "2", "4", "6", "8", "A"
+
+[[FilterDescriptions-LimitTokenPositionFilter]]
+== Limit Token Position Filter
+
+This filter limits tokens to those before a configured maximum token position.
+
+By default, this filter ignores any tokens in the wrapped `TokenStream` once the limit has been reached, which can result in `reset()` being called prior to `incrementToken()` returning `false`. For most `TokenStream` implementations this should be acceptable, and faster than consuming the full stream. If you are wrapping a `TokenStream` which requires that the full stream of tokens be exhausted in order to function properly, use the `consumeAllTokens="true"` option.
+
+*Factory class:* `solr.LimitTokenPositionFilterFactory`
+
+*Arguments:*
+
+`maxTokenPosition`:: (integer, required) Maximum token position. After this limit has been reached, tokens are discarded.
+
+`consumeAllTokens`:: (boolean, defaults to false) Whether to consume (and discard) previous token filters' tokens after the maximum start offset has been reached. See description above.
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.LimitTokenPositionFilterFactory" maxTokenPosition="3"
+          consumeAllTokens="false" />
+</analyzer>
+----
+
+*In:* "1 2 3 4 5"
+
+*Tokenizer to Filter:* "1", "2", "3", "4", "5"
+
+*Out:* "1", "2", "3"
+
+[[FilterDescriptions-LowerCaseFilter]]
+== Lower Case Filter
+
+Converts any uppercase letters in a token to the equivalent lowercase token. All other characters are left unchanged.
+
+*Factory class:* `solr.LowerCaseFilterFactory`
+
+*Arguments:* None
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.LowerCaseFilterFactory"/>
+</analyzer>
+----
+
+*In:* "Down With CamelCase"
+
+*Tokenizer to Filter:* "Down", "With", "CamelCase"
+
+*Out:* "down", "with", "camelcase"
+
+[[FilterDescriptions-ManagedStopFilter]]
+== Managed Stop Filter
+
+This is a specialized version of the <<FilterDescriptions-StopFilter,Stop Words Filter Factory>> that uses a set of stop words that are <<managed-resources.adoc#managed-resources,managed from a REST API>>.
+
+*Arguments:*
+
+`managed`:: The name that should be used for this set of stop words in the managed REST API.
+
+*Example:*
+//TODO: make this show an actual API call.
+With this configuration the set of words is named "english" and can be managed via `/solr/collection_name/schema/analysis/stopwords/english`.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.ManagedStopFilterFactory" managed="english"/>
+</analyzer>
+----
+
+See <<FilterDescriptions-StopFilter,Stop Filter>> for example input/output.
+
+[[FilterDescriptions-ManagedSynonymFilter]]
+== Managed Synonym Filter
+
+This is a specialized version of the <<FilterDescriptions-SynonymFilter,Synonym Filter Factory>> that uses a set of synonym mappings that is <<managed-resources.adoc#managed-resources,managed from a REST API>>.
+
+*Arguments:*
+
+`managed`:: The name that should be used for this set of synonym mappings in the managed REST API.
+
+*Example:*
+//TODO: make this show an actual API call
+With this configuration the set of mappings is named "english" and can be managed via `/solr/collection_name/schema/analysis/synonyms/english`.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.ManagedSynonymFilterFactory" managed="english"/>
+</analyzer>
+----
+
+See <<FilterDescriptions-SynonymFilter,Synonym Filter>> for example input/output.
+
+[[FilterDescriptions-N-GramFilter]]
+== N-Gram Filter
+
+Generates n-gram tokens of sizes in the given range. Note that tokens are ordered by position and then by gram size.
+
+*Factory class:* `solr.NGramFilterFactory`
+
+*Arguments:*
+
+`minGramSize`:: (integer, default 1) The minimum gram size.
+
+`maxGramSize`:: (integer, default 2) The maximum gram size.
+
+*Example:*
+
+Default behavior.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.NGramFilterFactory"/>
+</analyzer>
+----
+
+*In:* "four score"
+
+*Tokenizer to Filter:* "four", "score"
+
+*Out:* "f", "o", "u", "r", "fo", "ou", "ur", "s", "c", "o", "r", "e", "sc", "co", "or", "re"
+
+*Example:*
+
+A range of 1 to 4.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="4"/>
+</analyzer>
+----
+
+*In:* "four score"
+
+*Tokenizer to Filter:* "four", "score"
+
+*Out:* "f", "fo", "fou", "four", "o", "ou", "our", "u", "ur", "r", "s", "sc", "sco", "scor", "c", "co", "cor", "core", "o", "or", "ore", "r", "re", "e"
+
+*Example:*
+
+A range of 3 to 5.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="5"/>
+</analyzer>
+----
+
+*In:* "four score"
+
+*Tokenizer to Filter:* "four", "score"
+
+*Out:* "fou", "four", "our", "sco", "scor", "score", "cor", "core", "ore"
+
+[[FilterDescriptions-NumericPayloadTokenFilter]]
+== Numeric Payload Token Filter
+
+This filter adds a numeric floating point payload value to tokens that match a given type. Refer to the Javadoc for the `org.apache.lucene.analysis.Token` class for more information about token types and payloads.
+
+*Factory class:* `solr.NumericPayloadTokenFilterFactory`
+
+*Arguments:*
+
+`payload`:: (required) A floating point value that will be added to all matching tokens.
+
+`typeMatch`:: (required) A token type name string. Tokens with a matching type name will have their payload set to the above floating point value.
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.NumericPayloadTokenFilterFactory" payload="0.75" typeMatch="word"/>
+</analyzer>
+----
+
+*In:* "bing bang boom"
+
+*Tokenizer to Filter:* "bing", "bang", "boom"
+
+*Out:* "bing"[0.75], "bang"[0.75], "boom"[0.75]
+
+[[FilterDescriptions-PatternReplaceFilter]]
+== Pattern Replace Filter
+
+This filter applies a regular expression to each token and, for those that match, substitutes the given replacement string in place of the matched pattern. Tokens which do not match are passed through unchanged.
+
+*Factory class:* `solr.PatternReplaceFilterFactory`
+
+*Arguments:*
+
+`pattern`:: (required) The regular expression to test against each token, as per `java.util.regex.Pattern`.
+
+`replacement`:: (required) A string to substitute in place of the matched pattern. This string may contain references to capture groups in the regex pattern. See the Javadoc for `java.util.regex.Matcher`.
+
+`replace`:: ("all" or "first", default "all") Indicates whether all occurrences of the pattern in the token should be replaced, or only the first.
+
+*Example:*
+
+Simple string replace:
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.PatternReplaceFilterFactory" pattern="cat" replacement="dog"/>
+</analyzer>
+----
+
+*In:* "cat concatenate catycat"
+
+*Tokenizer to Filter:* "cat", "concatenate", "catycat"
+
+*Out:* "dog", "condogenate", "dogydog"
+
+*Example:*
+
+String replacement, first occurrence only:
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.PatternReplaceFilterFactory" pattern="cat" replacement="dog" replace="first"/>
+</analyzer>
+----
+
+*In:* "cat concatenate catycat"
+
+*Tokenizer to Filter:* "cat", "concatenate", "catycat"
+
+*Out:* "dog", "condogenate", "dogycat"
+
+*Example:*
+
+More complex pattern with capture group reference in the replacement. Tokens that start with non-numeric characters and end with digits will have an underscore inserted before the numbers. Otherwise the token is passed through.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.PatternReplaceFilterFactory" pattern="(\D+)(\d+)$" replacement="$1_$2"/>
+</analyzer>
+----
+
+*In:* "cat foo1234 9987 blah1234foo"
+
+*Tokenizer to Filter:* "cat", "foo1234", "9987", "blah1234foo"
+
+*Out:* "cat", "foo_1234", "9987", "blah1234foo"
+
+[[FilterDescriptions-PhoneticFilter]]
+== Phonetic Filter
+
+This filter creates tokens using one of the phonetic encoding algorithms in the `org.apache.commons.codec.language` package. For more information, see the section on <<phonetic-matching.adoc#phonetic-matching,Phonetic Matching>>.
+
+*Factory class:* `solr.PhoneticFilterFactory`
+
+*Arguments:*
+
+`encoder`:: (required) The name of the encoder to use. The encoder name must be one of the following (case insensitive): `http://commons.apache.org/codec/apidocs/org/apache/commons/codec/language/DoubleMetaphone.html[DoubleMetaphone]`, `http://commons.apache.org/codec/apidocs/org/apache/commons/codec/language/Metaphone.html[Metaphone]`, `http://commons.apache.org/codec/apidocs/org/apache/commons/codec/language/Soundex.html[Soundex]`, `http://commons.apache.org/codec/apidocs/org/apache/commons/codec/language/RefinedSoundex.html[RefinedSoundex]`, `http://commons.apache.org/codec/apidocs/org/apache/commons/codec/language/Caverphone.html[Caverphone]` (v2.0), `http://commons.apache.org/codec/apidocs/org/apache/commons/codec/language/ColognePhonetic.html[ColognePhonetic]`, or `http://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/language/Nysiis.html[Nysiis]`.
+
+`inject`:: (true/false) If true (the default), then new phonetic tokens are added to the stream. Otherwise, tokens are replaced with the phonetic equivalent. Setting this to false will enable phonetic matching, but the exact spelling of the target word may not match.
+
+`maxCodeLength`:: (integer) The maximum length of the code to be generated by the Metaphone or Double Metaphone encoders.
+
+*Example:*
+
+Default behavior for DoubleMetaphone encoding.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone"/>
+</analyzer>
+----
+
+*In:* "four score and twenty"
+
+*Tokenizer to Filter:* "four"(1), "score"(2), "and"(3), "twenty"(4)
+
+*Out:* "four"(1), "FR"(1), "score"(2), "SKR"(2), "and"(3), "ANT"(3), "twenty"(4), "TNT"(4)
+
+The phonetic tokens have a position increment of 0, which indicates that they are at the same position as the token they were derived from (immediately preceding).
+
+*Example:*
+
+Discard original token.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="false"/>
+</analyzer>
+----
+
+*In:* "four score and twenty"
+
+*Tokenizer to Filter:* "four"(1), "score"(2), "and"(3), "twenty"(4)
+
+*Out:* "FR"(1), "SKR"(2), "ANT"(3), "TWNT"(4)
+
+*Example:*
+
+Default Soundex encoder.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.PhoneticFilterFactory" encoder="Soundex"/>
+</analyzer>
+----
+
+*In:* "four score and twenty"
+
+*Tokenizer to Filter:* "four"(1), "score"(2), "and"(3), "twenty"(4)
+
+*Out:* "four"(1), "F600"(1), "score"(2), "S600"(2), "and"(3), "A530"(3), "twenty"(4), "T530"(4)
+
+[[FilterDescriptions-PorterStemFilter]]
+== Porter Stem Filter
+
+This filter applies the Porter Stemming Algorithm for English. The results are similar to using the Snowball Porter Stemmer with the `language="English"` argument. But this stemmer is coded directly in Java and is not based on Snowball. It does not accept a list of protected words and is only appropriate for English language text. However, it has been benchmarked as http://markmail.org/thread/d2c443z63z37rwf6[four times faster] than the English Snowball stemmer, so it can provide a performance enhancement.
+
+*Factory class:* `solr.PorterStemFilterFactory`
+
+*Arguments:* None
+
+*Example:*
+
+[source,xml]
+----
+<analyzer type="index">
+  <tokenizer class="solr.StandardTokenizerFactory "/>
+  <filter class="solr.PorterStemFilterFactory"/>
+</analyzer>
+----
+
+*In:* "jump jumping jumped"
+
+*Tokenizer to Filter:* "jump", "jumping", "jumped"
+
+*Out:* "jump", "jump", "jump"
+
+[[FilterDescriptions-RemoveDuplicatesTokenFilter]]
+== Remove Duplicates Token Filter
+
+The filter removes duplicate tokens in the stream. Tokens are considered to be duplicates ONLY if they have the same text and position values.
+
+Because positions must be the same, this filter might not do what a user expects it to do based on its name. It is a very specialized filter that is only useful in very specific circumstances. It has been so named for brevity, even though it is potentially misleading.
+
+*Factory class:* `solr.RemoveDuplicatesTokenFilterFactory`
+
+*Arguments:* None
+
+*Example:*
+
+One example of where `RemoveDuplicatesTokenFilterFactory` is useful is in situations where a synonym file is being used in conjunction with a stemmer. In these situations, both the stemmer and the synonym filter can cause completely identical terms with the same positions to end up in the stream, increasing index size with no benefit.
+
+Consider the following entry from a `synonyms.txt` file:
+
+[source,text]
+----
+ Television, Televisions, TV, TVs
+----
+
+When used in the following configuration:
+
+[source,xml]
+----
+<analyzer type="query">
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"/>
+  <filter class="solr.EnglishMinimalStemFilterFactory"/>
+  <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
+</analyzer>
+----
+
+*In:* "Watch TV"
+
+*Tokenizer to Synonym Filter:* "Watch"(1) "TV"(2)
+
+*Synonym Filter to Stem Filter:* "Watch"(1) "Television"(2) "Televisions"(2) "TV"(2) "TVs"(2)
+
+*Stem Filter to Remove Dups Filter:* "Watch"(1) "Television"(2) "Television"(2) "TV"(2) "TV"(2)
+
+*Out:* "Watch"(1) "Television"(2) "TV"(2)
+
+[[FilterDescriptions-ReversedWildcardFilter]]
+== Reversed Wildcard Filter
+
+This filter reverses tokens to provide faster leading wildcard and prefix queries. Tokens without wildcards are not reversed.
+
+*Factory class:* `solr.ReversedWildcardFilterFactory`
+
+*Arguments:*
+
+`withOriginal`:: (boolean) If true, the filter produces both original and reversed tokens at the same positions. If false, produces only reversed tokens.
+
+`maxPosAsterisk`:: (integer, default = 2) The maximum position of the asterisk wildcard ('*') that triggers the reversal of the query term. Terms with asterisks at positions above this value are not reversed.
+
+`maxPosQuestion`:: (integer, default = 1) The maximum position of the question mark wildcard ('?') that triggers the reversal of the query term. To reverse only pure suffix queries (queries with a single leading asterisk), set this to 0 and `maxPosAsterisk` to 1.
+
+`maxFractionAsterisk`:: (float, default = 0.0) An additional parameter that triggers the reversal if asterisk ('*') position is less than this fraction of the query token length.
+
+`minTrailing`:: (integer, default = 2) The minimum number of trailing characters in a query token after the last wildcard character. For good performance this should be set to a value larger than 1.
+
+*Example:*
+
+[source,xml]
+----
+<analyzer type="index">
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
+    maxPosAsterisk="2" maxPosQuestion="1" minTrailing="2" maxFractionAsterisk="0"/>
+</analyzer>
+----
+
+*In:* "*foo *bar"
+
+*Tokenizer to Filter:* "*foo", "*bar"
+
+*Out:* "oof*", "rab*"
+
+[[FilterDescriptions-ShingleFilter]]
+== Shingle Filter
+
+This filter constructs shingles, which are token n-grams, from the token stream. It combines runs of tokens into a single token.
+
+*Factory class:* `solr.ShingleFilterFactory`
+
+*Arguments:*
+
+`minShingleSize`:: (integer, must be >= 2, default 2) The minimum number of tokens per shingle.
+
+`maxShingleSize`:: (integer, must be >= `minShingleSize`, default 2) The maximum number of tokens per shingle.
+
+`outputUnigrams`:: (boolean, default true) If true, then each individual token is also included at its original position.
+
+`outputUnigramsIfNoShingles`:: (boolean, default false) If true, then individual tokens will be output if no shingles are possible.
+
+`tokenSeparator`:: (string, default is " ") The string to use when joining adjacent tokens to form a shingle.
+
+*Example:*
+
+Default behavior.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.ShingleFilterFactory"/>
+</analyzer>
+----
+
+*In:* "To be, or what?"
+
+*Tokenizer to Filter:* "To"(1), "be"(2), "or"(3), "what"(4)
+
+*Out:* "To"(1), "To be"(1), "be"(2), "be or"(2), "or"(3), "or what"(3), "what"(4)
+
+*Example:*
+
+A shingle size of four, do not include original token.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.ShingleFilterFactory" maxShingleSize="4" outputUnigrams="false"/>
+</analyzer>
+----
+
+*In:* "To be, or not to be."
+
+*Tokenizer to Filter:* "To"(1), "be"(2), "or"(3), "not"(4), "to"(5), "be"(6)
+
+*Out:* "To be"(1), "To be or"(1), "To be or not"(1), "be or"(2), "be or not"(2), "be or not to"(2), "or not"(3), "or not to"(3), "or not to be"(3), "not to"(4), "not to be"(4), "to be"(5)
+
+[[FilterDescriptions-SnowballPorterStemmerFilter]]
+== Snowball Porter Stemmer Filter
+
+This filter factory instantiates a language-specific stemmer generated by Snowball. Snowball is a software package that generates pattern-based word stemmers. This type of stemmer is not as accurate as a table-based stemmer, but is faster and less complex. Table-driven stemmers are labor intensive to create and maintain and so are typically commercial products.
+
+Solr contains Snowball stemmers for Armenian, Basque, Catalan, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Portuguese, Romanian, Russian, Spanish, Swedish and Turkish. For more information on Snowball, visit http://snowball.tartarus.org/.
+
+`StopFilterFactory`, `CommonGramsFilterFactory`, and `CommonGramsQueryFilterFactory` can optionally read stopwords in Snowball format (specify `format="snowball"` in the configuration of those FilterFactories).
+
+*Factory class:* `solr.SnowballPorterFilterFactory`
+
+*Arguments:*
+
+`language`:: (default "English") The name of a language, used to select the appropriate Porter stemmer to use. Case is significant. This string is used to select a package name in the `org.tartarus.snowball.ext` class hierarchy.
+
+`protected`:: Path of a text file containing a list of protected words, one per line. Protected words will not be stemmed. Blank lines and lines that begin with "#" are ignored. This may be an absolute path, or a simple file name in the Solr `conf` directory.
+
+*Example:*
+
+Default behavior:
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.SnowballPorterFilterFactory"/>
+</analyzer>
+----
+
+*In:* "flip flipped flipping"
+
+*Tokenizer to Filter:* "flip", "flipped", "flipping"
+
+*Out:* "flip", "flip", "flip"
+
+*Example:*
+
+French stemmer, English words:
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.SnowballPorterFilterFactory" language="French"/>
+</analyzer>
+----
+
+*In:* "flip flipped flipping"
+
+*Tokenizer to Filter:* "flip", "flipped", "flipping"
+
+*Out:* "flip", "flipped", "flipping"
+
+*Example:*
+
+Spanish stemmer, Spanish words:
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
+</analyzer>
+----
+
+*In:* "cante canta"
+
+*Tokenizer to Filter:* "cante", "canta"
+
+*Out:* "cant", "cant"
+
+[[FilterDescriptions-StandardFilter]]
+== Standard Filter
+
+This filter removes dots from acronyms and the substring "'s" from the end of tokens. This filter depends on the tokens being tagged with the appropriate term-type to recognize acronyms and words with apostrophes.
+
+*Factory class:* `solr.StandardFilterFactory`
+
+*Arguments:* None
+
+[WARNING]
+====
+This filter is no longer operational in Solr when the `luceneMatchVersion` (in `solrconfig.xml`) is higher than "3.1".
+====
+
+[[FilterDescriptions-StopFilter]]
+== Stop Filter
+
+This filter discards, or _stops_ analysis of, tokens that are on the given stop words list. A standard stop words list is included in the Solr `conf` directory, named `stopwords.txt`, which is appropriate for typical English language text.
+
+*Factory class:* `solr.StopFilterFactory`
+
+*Arguments:*
+
+`words`:: (optional) The path to a file that contains a list of stop words, one per line. Blank lines and lines that begin with "#" are ignored. This may be an absolute path, or path relative to the Solr `conf` directory.
+
+`format`:: (optional) If the stopwords list has been formatted for Snowball, you can specify `format="snowball"` so Solr can read the stopwords file.
+
+`ignoreCase`:: (true/false, default false) Ignore case when testing for stop words. If true, the stop list should contain lowercase words.
+
+`enablePositionIncrements`:: if `luceneMatchVersion` is `4.4` or earlier and `enablePositionIncrements="false"`, no position holes will be left by this filter when it removes tokens. *This argument is invalid if `luceneMatchVersion` is `5.0` or later.*
+
+*Example:*
+
+Case-sensitive matching, capitalized words not stopped. Token positions skip stopped words.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
+</analyzer>
+----
+
+*In:* "To be or what?"
+
+*Tokenizer to Filter:* "To"(1), "be"(2), "or"(3), "what"(4)
+
+*Out:* "To"(1), "what"(4)
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
+</analyzer>
+----
+
+*In:* "To be or what?"
+
+*Tokenizer to Filter:* "To"(1), "be"(2), "or"(3), "what"(4)
+
+*Out:* "what"(4)
+
+[[FilterDescriptions-SuggestStopFilter]]
+== Suggest Stop Filter
+
+Like <<FilterDescriptions-StopFilter,Stop Filter>>, this filter discards, or _stops_ analysis of, tokens that are on the given stop words list.
+
+Suggest Stop Filter differs from Stop Filter in that it will not remove the last token unless it is followed by a token separator. For example, a query "`find the`" would preserve the '`the`' since it was not followed by a space, punctuation etc., and mark it as a `KEYWORD` so that following filters will not change or remove it.
+
+By contrast, a query like "`find the popsicle`" would remove "```the```" as a stopword, since it's followed by a space. When using one of the analyzing suggesters, you would normally use the ordinary `StopFilterFactory` in your index analyzer and then SuggestStopFilter in your query analyzer.
+
+*Factory class:* `solr.SuggestStopFilterFactory`
+
+*Arguments:*
+
+`words`:: (optional; default: {lucene-javadocs}/analyzers-common/org/apache/lucene/analysis/core/StopAnalyzer.html[`StopAnalyzer#ENGLISH_STOP_WORDS_SET`] ) The name of a stopwords file to parse.
+
+`format`:: (optional; default: `wordset`) Defines how the words file will be parsed. If `words` is not specified, then `format` must not be specified. The valid values for the format option are:
+
+`wordset`:: This is the default format, which supports one word per line (including any intra-word whitespace) and allows whole line comments beginning with the `#` character. Blank lines are ignored.
+
+`snowball`:: This format allows for multiple words specified on each line, and trailing comments may be specified using the vertical line (`|`). Blank lines are ignored.
+
+`ignoreCase`:: (optional; default: *false*) If *true*, matching is case-insensitive.
+
+*Example:*
+
+[source,xml]
+----
+<analyzer type="query">
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.LowerCaseFilterFactory"/>
+  <filter class="solr.SuggestStopFilterFactory" ignoreCase="true"
+          words="stopwords.txt" format="wordset"/>
+</analyzer>
+----
+
+*In:* "The The"
+
+*Tokenizer to Filter:* "the"(1), "the"(2)
+
+*Out:* "the"(2)
+
+[[FilterDescriptions-SynonymFilter]]
+== Synonym Filter
+
+This filter does synonym mapping. Each token is looked up in the list of synonyms and if a match is found, then the synonym is emitted in place of the token. The position values of the new tokens are set such that they all occur at the same position as the original token.
+
+.Synonym Filter has been Deprecated
+[WARNING]
+====
+Synonym Filter has been deprecated in favor of Synonym Graph Filter, which is required for multi-term synonym support.
+====
+
+*Factory class:* `solr.SynonymFilterFactory`
+
+For arguments and examples, see the Synonym Graph Filter below.
+
+[[FilterDescriptions-SynonymGraphFilter]]
+== Synonym Graph Filter
+
+This filter maps single- or multi-token synonyms, producing a fully correct graph output. This filter is a replacement for the Synonym Filter, which produces incorrect graphs for multi-token synonyms.
+
+If you use this filter during indexing, you must follow it with a Flatten Graph Filter to squash tokens on top of one another like the Synonym Filter, because the indexer can't directly consume a graph. To get fully correct positional queries when your synonym replacements are multiple tokens, you should instead apply synonyms using this filter at query time.
+
+*Factory class:* `solr.SynonymGraphFilterFactory`
+
+*Arguments:*
+
+`synonyms`:: (required) The path of a file that contains a list of synonyms, one per line. In the (default) `solr` format - see the `format` argument below for alternatives - blank lines and lines that begin with "`#`" are ignored. This may be a comma-separated list of absolute paths, or paths relative to the Solr config directory.
++
+There are two ways to specify synonym mappings:
++
+* A comma-separated list of words. If the token matches any of the words, then all the words in the list are substituted, which will include the original token.
++
+* Two comma-separated lists of words with the symbol "=>" between them. If the token matches any word on the left, then the list on the right is substituted. The original token will not be included unless it is also in the list on the right.
+
+`ignoreCase`:: (optional; default: `false`) If `true`, synonyms will be matched case-insensitively.
+
+`expand`:: (optional; default: `true`) If `true`, a synonym will be expanded to all equivalent synonyms. If `false`, all equivalent synonyms will be reduced to the first in the list.
+
+`format`:: (optional; default: `solr`) Controls how the synonyms will be parsed. The short names `solr` (for {lucene-javadocs}/analyzers-common/org/apache/lucene/analysis/synonym/SolrSynonymParser.html[`SolrSynonymParser`]) and `wordnet` (for {lucene-javadocs}/analyzers-common/org/apache/lucene/analysis/synonym/WordnetSynonymParser.html[`WordnetSynonymParser`]) are supported, or you may alternatively supply the name of your own {lucene-javadocs}/analyzers-common/org/apache/lucene/analysis/synonym/SynonymMap.Builder.html[`SynonymMap.Builder`] subclass.
+
+`tokenizerFactory`:: (optional; default: `WhitespaceTokenizerFactory`) The name of the tokenizer factory to use when parsing the synonyms file. Arguments with the name prefix `tokenizerFactory.*` will be supplied as init params to the specified tokenizer factory.
++
+Any arguments not consumed by the synonym filter factory, including those without the `tokenizerFactory.*` prefix, will also be supplied as init params to the tokenizer factory.
++
+If `tokenizerFactory` is specified, then `analyzer` may not be, and vice versa.
+
+`analyzer`:: (optional; default: `WhitespaceTokenizerFactory`) The name of the analyzer class to use when parsing the synonyms file. If `analyzer` is specified, then `tokenizerFactory` may not be, and vice versa.
+
+For the following examples, assume a synonyms file named `mysynonyms.txt`:
+
+[source,text]
+----
+couch,sofa,divan
+teh => the
+huge,ginormous,humungous => large
+small => tiny,teeny,weeny
+----
+
+*Example:*
+
+[source,xml]
+----
+<analyzer type="index">
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.SynonymGraphFilterFactory" synonyms="mysynonyms.txt"/>
+  <filter class="solr.FlattenGraphFilterFactory"/> <!-- required on index analyzers after graph filters -->
+</analyzer>
+<analyzer type="query">
+  <tokenizer class="solr.StandardTokenizerFactory"/>
+  <filter class="solr.SynonymGraphFilterFactory" synonyms="mysynonyms.txt"/>
+</analyzer>
+----
+
+*In:* "teh small couch"
+
+*Tokenizer to Filter:* "teh"(1), "small"(2), "couch"(3)
+
+*Out:* "the"(1), "tiny"(2), "teeny"(2), "weeny"(2), "couch"(3), "sofa"(3), "divan"(3)
+
+*Example:*
+
+*In:* "teh ginormous, humungous sofa"
+
+*Tokenizer to Filter:* "teh"(1), "ginormous"(2), "humungous"(3), "sofa"(4)
+
+*Out:* "the"(1), "large"(2), "large"(3), "couch"(4), "sofa"(4), "divan"(4)
+
+[[FilterDescriptions-TokenOffsetPayloadFilter]]
+== Token Offset Payload Filter
+
+This filter adds the numeric character offsets of the token as a payload value for that token.
+
+*Factory class:* `solr.TokenOffsetPayloadTokenFilterFactory`
+
+*Arguments:* None
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.TokenOffsetPayloadTokenFilterFactory"/>
+</analyzer>
+----
+
+*In:* "bing bang boom"
+
+*Tokenizer to Filter:* "bing", "bang", "boom"
+
+*Out:* "bing"[0,4], "bang"[5,9], "boom"[10,14]
+
+[[FilterDescriptions-TrimFilter]]
+== Trim Filter
+
+This filter trims leading and/or trailing whitespace from tokens. Most tokenizers break tokens at whitespace, so this filter is most often used for special situations.
+
+*Factory class:* `solr.TrimFilterFactory`
+
+*Arguments:*
+
+`updateOffsets`:: if `luceneMatchVersion` is `4.3` or earlier and `updateOffsets="true"`, trimmed tokens' start and end offsets will be updated to those of the first and last characters (plus one) remaining in the token. *This argument is invalid if `luceneMatchVersion` is `5.0` or later.*
+
+*Example:*
+
+The PatternTokenizerFactory configuration used here splits the input on simple commas; it does not remove whitespace.
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.PatternTokenizerFactory" pattern=","/>
+  <filter class="solr.TrimFilterFactory"/>
+</analyzer>
+----
+
+*In:* "one, two , three ,four "
+
+*Tokenizer to Filter:* "one", " two ", " three ", "four "
+
+*Out:* "one", "two", "three", "four"
+
+[[FilterDescriptions-TypeAsPayloadFilter]]
+== Type As Payload Filter
+
+This filter adds the token's type, as an encoded byte sequence, as its payload.
+
+*Factory class:* `solr.TypeAsPayloadTokenFilterFactory`
+
+*Arguments:* None
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.TypeAsPayloadTokenFilterFactory"/>
+</analyzer>
+----
+
+*In:* "Pay Bob's I.O.U."
+
+*Tokenizer to Filter:* "Pay", "Bob's", "I.O.U."
+
+*Out:* "Pay"[<ALPHANUM>], "Bob's"[<APOSTROPHE>], "I.O.U."[<ACRONYM>]
+
+[[FilterDescriptions-TypeTokenFilter]]
+== Type Token Filter
+
+This filter blacklists or whitelists a specified list of token types, assuming the tokens have type metadata associated with them. For example, the <<tokenizers.adoc#Tokenizers-UAX29URLEmailTokenizer,UAX29 URL Email Tokenizer>> emits "<URL>" and "<EMAIL>" typed tokens, as well as other types. This filter lets you, for instance, pull out only e-mail addresses from text as tokens.
+
+*Factory class:* `solr.TypeTokenFilterFactory`
+
+*Arguments:*
+
+`types`:: Defines the location of a file of types to filter.
+
+`useWhitelist`:: If *true*, the file defined in `types` is used as a whitelist (include list). If *false*, or undefined, the file defined in `types` is used as a blacklist.
+
+`enablePositionIncrements`:: if `luceneMatchVersion` is `4.3` or earlier and `enablePositionIncrements="false"`, no position holes will be left by this filter when it removes tokens. *This argument is invalid if `luceneMatchVersion` is `5.0` or later.*
+
+*Example:*
+
+[source,xml]
+----
+<analyzer>
+  <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/> <!-- a tokenizer that emits typed tokens, e.g. <URL> and <EMAIL> -->
+  <filter class="solr.TypeTokenFilterFactory" types="stoptypes.txt" useWhitelist="true"/>
+</analyzer>
+----
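+
+The file named by `types` lists one token type per line. As a sketch, a whitelist file for the example above (paired with a tokenizer that emits typed tokens, such as the UAX29 URL Email Tokenizer) might contain:
+
+[source,text]
+----
+<EMAIL>
+<URL>
+----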
+
+[[FilterDescriptions-WordDelimiterFilter]]
+== Word Delimiter Filter
+
+This filter splits tokens at word delimiters.
+
+.Word Delimiter Filter has been Deprecated
+[WARNING]
+====
+Word Delimiter Filter has been deprecated in favor of Word Delimiter Graph Filter, which produces a correct token graph so that, for example, phrase queries can work correctly.
+====
+
+*Factory class:* `solr.WordDelimiterFilterFactory`
+
+For a full description, including arguments and examples, see the Word Delimiter Graph Filter below.
+
+[[FilterDescriptions-WordDelimiterGraphFilter]]
+== Word Delimiter Graph Filter
+
+This filter splits tokens at word delimiters.
+
+If you use this filter during indexing, you must follow it with a Flatten Graph Filter, which squashes tokens on top of one another as the Word Delimiter Filter does, because the indexer can't directly consume a graph. To get fully correct positional queries when tokens are split, you should instead use this filter at query time.
+
+Note: although this filter produces correct token graphs, it cannot consume an input token graph correctly.
+
+Delimiters are determined by the following rules:
+
+* A change in case within a word: "CamelCase" -> "Camel", "Case". This can be disabled by setting `splitOnCaseChange="0"`.
+
+* A transition from alpha to numeric characters or vice versa: "Gonzo5000" -> "Gonzo", "5000"; "4500XL" -> "4500", "XL". This can be disabled by setting `splitOnNumerics="0"`.
+
+* Non-alphanumeric characters (discarded): "hot-spot" -> "hot", "spot"
+
+* A trailing "'s" is removed: "O'Reilly's" -> "O", "Reilly"
+
+* Any leading or trailing delimiters are discarded: "--hot-spot--" -> "hot", "spot"
+
+*Factory class:* `solr.WordDelimiterGraphFilterFactory`
+
+*Arguments:*
+
+`generateWordParts`:: (integer, default 1) If non-zero, splits words at delimiters. For example: "CamelCase", "hot-spot" -> "Camel", "Case", "hot", "spot"
+
+`generateNumberParts`:: (integer, default 1) If non-zero, splits numeric strings at delimiters: "1947-32" -> "1947", "32"
+
+`splitOnCaseChange`:: (integer, default 1) If 0, words are not split on camel-case changes: "BugBlaster-XL" -> "BugBlaster", "XL". The first example below illustrates the default (non-zero) splitting behavior.
+
+`splitOnNumerics`:: (integer, default 1) If 0, don't split words on transitions from alpha to numeric: "FemBot3000" -> "Fem", "Bot3000"
+
+`catenateWords`:: (integer, default 0) If non-zero, maximal runs of word parts will be joined: "hot-spot-sensor's" -> "hotspotsensor"
+
+`catenateNumbers`:: (integer, default 0) If non-zero, maximal runs of number parts will be joined: "1947-32" -> "194732"
+
+`catenateAll`:: (integer, default 0) If non-zero, runs of word and number parts will be joined: "Zap-Master-9000" -> "ZapMaster9000"
+
+`preserveOriginal`:: (integer, default 0) If non-zero, the original token is preserved: "Zap-Master-9000" -> "Zap-Master-9000", "Zap", "Master", "9000"
+
+`protected`:: (optional) The pathname of a file that contains a list of protected words that should be passed through without splitting.
+
+`stemEnglishPossessive`:: (integer, default 1) If 1, strips the possessive `'s` from each subword.
+
+`types`:: (optional) The pathname of a file that contains *character => type* mappings, which enable customization of this filter's splitting behavior. Recognized character types: `LOWER`, `UPPER`, `ALPHA`, `DIGIT`, `ALPHANUM`, and `SUBWORD_DELIM`.
++
+The default for any character without a customized mapping is computed from Unicode character properties. Blank lines and comment lines starting with '#' are ignored. An example file:
++
+[source,text]
+----
+# Don't split numbers at '$', '.' or ','
+$ => DIGIT
+. => DIGIT
+\u002C => DIGIT
+
+# Don't split on ZWJ: http://en.wikipedia.org/wiki/Zero-width_joiner
+\u200D => ALPHANUM
+----
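++
+A filter configured to load such a mapping file might look like this sketch (the filename is illustrative):
++
+[source,xml]
+----
+<filter class="solr.WordDelimiterGraphFilterFactory" types="wdfftypes.txt"/>
+----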
+
+*Example:*
+
+Default behavior. The whitespace tokenizer is used here to preserve non-alphanumeric characters.
+
+[source,xml]
+----
+<analyzer type="index">
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.WordDelimiterGraphFilterFactory"/>
+  <filter class="solr.FlattenGraphFilterFactory"/> <!-- required on index analyzers after graph filters -->
+</analyzer>
+
+<analyzer type="query">
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.WordDelimiterGraphFilterFactory"/>
+</analyzer>
+----
+
+*In:* "hot-spot RoboBlaster/9000 100XL"
+
+*Tokenizer to Filter:* "hot-spot", "RoboBlaster/9000", "100XL"
+
+*Out:* "hot", "spot", "Robo", "Blaster", "9000", "100", "XL"
+
+*Example:*
+
+Do not split on case changes, and do not generate number parts. Note that by not generating number parts, tokens containing only numeric parts are ultimately discarded.
+
+[source,xml]
+----
+<analyzer type="query">
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.WordDelimiterGraphFilterFactory" generateNumberParts="0" splitOnCaseChange="0"/>
+</analyzer>
+----
+
+*In:* "hot-spot RoboBlaster/9000 100-42"
+
+*Tokenizer to Filter:* "hot-spot", "RoboBlaster/9000", "100-42"
+
+*Out:* "hot", "spot", "RoboBlaster", "9000"
+
+*Example:*
+
+Concatenate word parts and number parts, but not word and number parts that occur in the same token.
+
+[source,xml]
+----
+<analyzer type="query">
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.WordDelimiterGraphFilterFactory" catenateWords="1" catenateNumbers="1"/>
+</analyzer>
+----
+
+*In:* "hot-spot 100+42 XL40"
+
+*Tokenizer to Filter:* "hot-spot"(1), "100+42"(2), "XL40"(3)
+
+*Out:* "hot"(1), "spot"(2), "hotspot"(2), "100"(3), "42"(4), "10042"(4), "XL"(5), "40"(6)
+
+*Example:*
+
+Concatenate all. Word and/or number parts are joined together.
+
+[source,xml]
+----
+<analyzer type="query">
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.WordDelimiterGraphFilterFactory" catenateAll="1"/>
+</analyzer>
+----
+
+*In:* "XL-4000/ES"
+
+*Tokenizer to Filter:* "XL-4000/ES"(1)
+
+*Out:* "XL"(1), "4000"(2), "ES"(3), "XL4000ES"(3)
+
+*Example:*
+
+Using a protected words list that contains "FooBar", "AstroBlaster" and "XL-5000" (among others).
+
+[source,xml]
+----
+<analyzer type="query">
+  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
+  <filter class="solr.WordDelimiterGraphFilterFactory" protected="protwords.txt"/>
+</analyzer>
+----
+
+*In:* "FooBar AstroBlaster XL-5000 ==ES-34-"
+
+*Tokenizer to Filter:* "FooBar", "AstroBlaster", "XL-5000", "==ES-34-"
+
+*Out:* "FooBar", "FooBar", "AstroBlaster", "XL-5000", "ES", "34"

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95968c69/solr/solr-ref-guide/src/fonts/Inconsolata/Inconsolata-Bold.ttf
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/fonts/Inconsolata/Inconsolata-Bold.ttf b/solr/solr-ref-guide/src/fonts/Inconsolata/Inconsolata-Bold.ttf
new file mode 100755
index 0000000..809c1f5
Binary files /dev/null and b/solr/solr-ref-guide/src/fonts/Inconsolata/Inconsolata-Bold.ttf differ

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95968c69/solr/solr-ref-guide/src/fonts/Inconsolata/Inconsolata-Regular.ttf
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/fonts/Inconsolata/Inconsolata-Regular.ttf b/solr/solr-ref-guide/src/fonts/Inconsolata/Inconsolata-Regular.ttf
new file mode 100755
index 0000000..fc981ce
Binary files /dev/null and b/solr/solr-ref-guide/src/fonts/Inconsolata/Inconsolata-Regular.ttf differ

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95968c69/solr/solr-ref-guide/src/fonts/Inconsolata/OFL.txt
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/fonts/Inconsolata/OFL.txt b/solr/solr-ref-guide/src/fonts/Inconsolata/OFL.txt
new file mode 100755
index 0000000..6078b5e
--- /dev/null
+++ b/solr/solr-ref-guide/src/fonts/Inconsolata/OFL.txt
@@ -0,0 +1,92 @@
+Copyright 2006 The Inconsolata Project Authors
+This Font Software is licensed under the SIL Open Font License, Version 1.1.
+This license is copied below, and is also available with a FAQ at:
+http://scripts.sil.org/OFL
+
+
+-----------------------------------------------------------
+SIL OPEN FONT LICENSE Version 1.1 - 26 February 2007
+-----------------------------------------------------------
+
+PREAMBLE
+The goals of the Open Font License (OFL) are to stimulate worldwide
+development of collaborative font projects, to support the font creation
+efforts of academic and linguistic communities, and to provide a free and
+open framework in which fonts may be shared and improved in partnership
+with others.
+
+The OFL allows the licensed fonts to be used, studied, modified and
+redistributed freely as long as they are not sold by themselves. The
+fonts, including any derivative works, can be bundled, embedded, 
+redistributed and/or sold with any software provided that any reserved
+names are not used by derivative works. The fonts and derivatives,
+however, cannot be released under any other type of license. The
+requirement for fonts to remain under this license does not apply
+to any document created using the fonts or their derivatives.
+
+DEFINITIONS
+"Font Software" refers to the set of files released by the Copyright
+Holder(s) under this license and clearly marked as such. This may
+include source files, build scripts and documentation.
+
+"Reserved Font Name" refers to any names specified as such after the
+copyright statement(s).
+
+"Original Version" refers to the collection of Font Software components as
+distributed by the Copyright Holder(s).
+
+"Modified Version" refers to any derivative made by adding to, deleting,
+or substituting -- in part or in whole -- any of the components of the
+Original Version, by changing formats or by porting the Font Software to a
+new environment.
+
+"Author" refers to any designer, engineer, programmer, technical
+writer or other person who contributed to the Font Software.
+
+PERMISSION & CONDITIONS
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of the Font Software, to use, study, copy, merge, embed, modify,
+redistribute, and sell modified and unmodified copies of the Font
+Software, subject to the following conditions:
+
+1) Neither the Font Software nor any of its individual components,
+in Original or Modified Versions, may be sold by itself.
+
+2) Original or Modified Versions of the Font Software may be bundled,
+redistributed and/or sold with any software, provided that each copy
+contains the above copyright notice and this license. These can be
+included either as stand-alone text files, human-readable headers or
+in the appropriate machine-readable metadata fields within text or
+binary files as long as those fields can be easily viewed by the user.
+
+3) No Modified Version of the Font Software may use the Reserved Font
+Name(s) unless explicit written permission is granted by the corresponding
+Copyright Holder. This restriction only applies to the primary font name as
+presented to the users.
+
+4) The name(s) of the Copyright Holder(s) or the Author(s) of the Font
+Software shall not be used to promote, endorse or advertise any
+Modified Version, except to acknowledge the contribution(s) of the
+Copyright Holder(s) and the Author(s) or with their explicit written
+permission.
+
+5) The Font Software, modified or unmodified, in part or in whole,
+must be distributed entirely under this license, and must not be
+distributed under any other license. The requirement for fonts to
+remain under this license does not apply to any document created
+using the Font Software.
+
+TERMINATION
+This license becomes null and void if any of the above conditions are
+not met.
+
+DISCLAIMER
+THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT
+OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL THE
+COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL
+DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM
+OTHER DEALINGS IN THE FONT SOFTWARE.

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95968c69/solr/solr-ref-guide/src/fonts/Noto_Sans/LICENSE.txt
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/fonts/Noto_Sans/LICENSE.txt b/solr/solr-ref-guide/src/fonts/Noto_Sans/LICENSE.txt
new file mode 100755
index 0000000..75b5248
--- /dev/null
+++ b/solr/solr-ref-guide/src/fonts/Noto_Sans/LICENSE.txt
@@ -0,0 +1,202 @@
+
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+
+   END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright [yyyy] [name of copyright owner]
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95968c69/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-Bold.ttf
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-Bold.ttf b/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-Bold.ttf
new file mode 100755
index 0000000..6e00cdc
Binary files /dev/null and b/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-Bold.ttf differ

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95968c69/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-BoldItalic.ttf
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-BoldItalic.ttf b/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-BoldItalic.ttf
new file mode 100755
index 0000000..51b7b29
Binary files /dev/null and b/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-BoldItalic.ttf differ

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95968c69/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-Italic.ttf
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-Italic.ttf b/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-Italic.ttf
new file mode 100755
index 0000000..dc93fea
Binary files /dev/null and b/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-Italic.ttf differ

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95968c69/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-Regular.ttf
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-Regular.ttf b/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-Regular.ttf
new file mode 100755
index 0000000..9dd1019
Binary files /dev/null and b/solr/solr-ref-guide/src/fonts/Noto_Sans/NotoSans-Regular.ttf differ

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95968c69/solr/solr-ref-guide/src/fonts/fontawesome/FontAwesome.otf
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/fonts/fontawesome/FontAwesome.otf b/solr/solr-ref-guide/src/fonts/fontawesome/FontAwesome.otf
new file mode 100755
index 0000000..81c9ad9
Binary files /dev/null and b/solr/solr-ref-guide/src/fonts/fontawesome/FontAwesome.otf differ

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95968c69/solr/solr-ref-guide/src/fonts/fontawesome/fontawesome-webfont.eot
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/fonts/fontawesome/fontawesome-webfont.eot b/solr/solr-ref-guide/src/fonts/fontawesome/fontawesome-webfont.eot
new file mode 100755
index 0000000..84677bc
Binary files /dev/null and b/solr/solr-ref-guide/src/fonts/fontawesome/fontawesome-webfont.eot differ