Posted to commits@lucene.apache.org by sa...@apache.org on 2017/05/26 20:58:27 UTC
[1/2] lucene-solr:master: SOLR-10758: fix broken internal link to new HMM Chinese Tokenizer section
Repository: lucene-solr
Updated Branches:
refs/heads/branch_6x 1e6ef4175 -> 4e32ab35f
refs/heads/master 6bbdfbc7c -> 9fbc9db1c
SOLR-10758: fix broken internal link to new HMM Chinese Tokenizer section
Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/9fbc9db1
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/9fbc9db1
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/9fbc9db1
Branch: refs/heads/master
Commit: 9fbc9db1c17d9fe5a7281f89a4bb18e18f38fceb
Parents: 6bbdfbc
Author: Steve Rowe <sa...@gmail.com>
Authored: Fri May 26 16:57:53 2017 -0400
Committer: Steve Rowe <sa...@gmail.com>
Committed: Fri May 26 16:57:53 2017 -0400
----------------------------------------------------------------------
solr/solr-ref-guide/src/language-analysis.adoc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/9fbc9db1/solr/solr-ref-guide/src/language-analysis.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/language-analysis.adoc b/solr/solr-ref-guide/src/language-analysis.adoc
index c82cd61..11b0b78 100644
--- a/solr/solr-ref-guide/src/language-analysis.adoc
+++ b/solr/solr-ref-guide/src/language-analysis.adoc
@@ -565,7 +565,7 @@ See the example under <<LanguageAnalysis-TraditionalChinese,Traditional Chinese>
[[LanguageAnalysis-SimplifiedChinese]]
=== Simplified Chinese
-For Simplified Chinese, Solr provides support for Chinese sentence and word segmentation with the <<LanguageAnalysis-HMMChineseTokenizerFactory,HMM Chinese Tokenizer`>>. This component includes a large dictionary and segments Chinese text into words with the Hidden Markov Model. To use this tokenizer, you must add additional .jars to Solr's classpath (as described in the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in SolrConfig>>). See the `solr/contrib/analysis-extras/README.txt` for information on which jars you need to add to your `SOLR_HOME/lib`.
+For Simplified Chinese, Solr provides support for Chinese sentence and word segmentation with the <<LanguageAnalysis-HMMChineseTokenizer,HMM Chinese Tokenizer`>>. This component includes a large dictionary and segments Chinese text into words with the Hidden Markov Model. To use this tokenizer, you must add additional .jars to Solr's classpath (as described in the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in SolrConfig>>). See the `solr/contrib/analysis-extras/README.txt` for information on which jars you need to add to your `SOLR_HOME/lib`.
The default configuration of the <<tokenizers.adoc#Tokenizers-ICUTokenizer,ICU Tokenizer>> is also suitable for Simplified Chinese text. It follows the Word Break rules from the Unicode Text Segmentation algorithm for non-Chinese text, and uses a dictionary to segment Chinese words. To use this tokenizer, you must add additional .jars to Solr's classpath (as described in the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in SolrConfig>>). See the `solr/contrib/analysis-extras/README.txt` for information on which jars you need to add to your `SOLR_HOME/lib`.
@@ -598,6 +598,7 @@ Also useful for Chinese analysis:
</analyzer>
----
+[[LanguageAnalysis-HMMChineseTokenizer]]
=== HMM Chinese Tokenizer
For Simplified Chinese, Solr provides support for Chinese sentence and word segmentation with the `solr.HMMChineseTokenizerFactory` in the `analysis-extras` contrib module. This component includes a large dictionary and segments Chinese text into words with the Hidden Markov Model. To use this tokenizer, see `solr/contrib/analysis-extras/README.txt` for instructions on which jars you need to add to your `solr_home/lib`.
[2/2] lucene-solr:branch_6x: SOLR-10758: fix broken internal link to new HMM Chinese Tokenizer section
Posted by sa...@apache.org.
SOLR-10758: fix broken internal link to new HMM Chinese Tokenizer section
Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/4e32ab35
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/4e32ab35
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/4e32ab35
Branch: refs/heads/branch_6x
Commit: 4e32ab35f98d2933aec708af387a7eed02e02792
Parents: 1e6ef41
Author: Steve Rowe <sa...@gmail.com>
Authored: Fri May 26 16:57:53 2017 -0400
Committer: Steve Rowe <sa...@gmail.com>
Committed: Fri May 26 16:58:17 2017 -0400
----------------------------------------------------------------------
solr/solr-ref-guide/src/language-analysis.adoc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/4e32ab35/solr/solr-ref-guide/src/language-analysis.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/language-analysis.adoc b/solr/solr-ref-guide/src/language-analysis.adoc
index c82cd61..11b0b78 100644
--- a/solr/solr-ref-guide/src/language-analysis.adoc
+++ b/solr/solr-ref-guide/src/language-analysis.adoc
@@ -565,7 +565,7 @@ See the example under <<LanguageAnalysis-TraditionalChinese,Traditional Chinese>
[[LanguageAnalysis-SimplifiedChinese]]
=== Simplified Chinese
-For Simplified Chinese, Solr provides support for Chinese sentence and word segmentation with the <<LanguageAnalysis-HMMChineseTokenizerFactory,HMM Chinese Tokenizer`>>. This component includes a large dictionary and segments Chinese text into words with the Hidden Markov Model. To use this tokenizer, you must add additional .jars to Solr's classpath (as described in the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in SolrConfig>>). See the `solr/contrib/analysis-extras/README.txt` for information on which jars you need to add to your `SOLR_HOME/lib`.
+For Simplified Chinese, Solr provides support for Chinese sentence and word segmentation with the <<LanguageAnalysis-HMMChineseTokenizer,HMM Chinese Tokenizer`>>. This component includes a large dictionary and segments Chinese text into words with the Hidden Markov Model. To use this tokenizer, you must add additional .jars to Solr's classpath (as described in the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in SolrConfig>>). See the `solr/contrib/analysis-extras/README.txt` for information on which jars you need to add to your `SOLR_HOME/lib`.
The default configuration of the <<tokenizers.adoc#Tokenizers-ICUTokenizer,ICU Tokenizer>> is also suitable for Simplified Chinese text. It follows the Word Break rules from the Unicode Text Segmentation algorithm for non-Chinese text, and uses a dictionary to segment Chinese words. To use this tokenizer, you must add additional .jars to Solr's classpath (as described in the section <<lib-directives-in-solrconfig.adoc#lib-directives-in-solrconfig,Lib Directives in SolrConfig>>). See the `solr/contrib/analysis-extras/README.txt` for information on which jars you need to add to your `SOLR_HOME/lib`.
@@ -598,6 +598,7 @@ Also useful for Chinese analysis:
</analyzer>
----
+[[LanguageAnalysis-HMMChineseTokenizer]]
=== HMM Chinese Tokenizer
For Simplified Chinese, Solr provides support for Chinese sentence and word segmentation with the `solr.HMMChineseTokenizerFactory` in the `analysis-extras` contrib module. This component includes a large dictionary and segments Chinese text into words with the Hidden Markov Model. To use this tokenizer, see `solr/contrib/analysis-extras/README.txt` for instructions on which jars you need to add to your `solr_home/lib`.