Posted to commits@opennlp.apache.org by co...@apache.org on 2011/09/01 22:14:22 UTC
svn commit: r1164243 - /incubator/opennlp/trunk/opennlp-docs/src/docbkx/sentdetect.xml
Author: colen
Date: Thu Sep 1 20:14:22 2011
New Revision: 1164243
URL: http://svn.apache.org/viewvc?rev=1164243&view=rev
Log:
OPENNLP-292 Updated usage of sentence detector training tool
Modified:
incubator/opennlp/trunk/opennlp-docs/src/docbkx/sentdetect.xml
Modified: incubator/opennlp/trunk/opennlp-docs/src/docbkx/sentdetect.xml
URL: http://svn.apache.org/viewvc/incubator/opennlp/trunk/opennlp-docs/src/docbkx/sentdetect.xml?rev=1164243&r1=1164242&r2=1164243&view=diff
==============================================================================
--- incubator/opennlp/trunk/opennlp-docs/src/docbkx/sentdetect.xml (original)
+++ incubator/opennlp/trunk/opennlp-docs/src/docbkx/sentdetect.xml Thu Sep 1 20:14:22 2011
@@ -133,15 +133,28 @@ Span sentences[] = sentenceDetector.sent
Usage of the tool:
<programlisting>
<![CDATA[
-$bin/opennlp SentenceDetectorTrainer
-Usage: opennlp SentenceDetectorTrainer -lang language -encoding charset [-iterations num] \
-[-cutoff num] -data trainingData -model model
--lang language specifies the language which is being processed.
--encoding charset specifies the encoding which should be used for reading and writing text.
--iterations num specified the number of training iterations
--cutoff num specifies the min number of times a feature must be seen]]>
+$ bin/opennlp SentenceDetectorTrainer
+Usage: opennlp SentenceDetectorTrainer [-abbDict path] [-params paramsFile] -lang language \
+[-cutoff num] [-iterations num] [-encoding charsetName] -data trainData -model modelFile
+Arguments description:
+ -abbDict path
+ The abbreviation dictionary in XML format.
+ -params paramsFile
+ Training parameters file.
+ -lang language
+ specifies the language which is being processed.
+ -cutoff num
+ specifies the min number of times a feature must be seen. It is ignored if a parameters file is passed.
+ -iterations num
+ specifies the number of training iterations. It is ignored if a parameters file is passed.
+ -encoding charsetName
+ specifies the encoding which should be used for reading and writing text. If not specified the system default will be used.
+ -data trainData
+ the data to be used during training
+ -model modelFile
+ the output model file]]>
</programlisting>
- To train an english sentence detector use the following command:
+ To train an English sentence detector use the following command:
<programlisting>
<![CDATA[
$bin/opennlp SentenceDetectorTrainer -encoding UTF-8 -lang en -data en-sent.train -model en-sent.bin