Posted to commits@opennlp.apache.org by ra...@apache.org on 2014/09/11 17:33:29 UTC
svn commit: r1624316 - in /opennlp/trunk/opennlp-docs/src/docbkx:
introduction.xml parser.xml
Author: ragerri
Date: Thu Sep 11 15:33:29 2014
New Revision: 1624316
URL: http://svn.apache.org/r1624316
Log:
OPENNLP-690 adding documentation for parser evaluator tool
Modified:
opennlp/trunk/opennlp-docs/src/docbkx/introduction.xml
opennlp/trunk/opennlp-docs/src/docbkx/parser.xml
Modified: opennlp/trunk/opennlp-docs/src/docbkx/introduction.xml
URL: http://svn.apache.org/viewvc/opennlp/trunk/opennlp-docs/src/docbkx/introduction.xml?rev=1624316&r1=1624315&r2=1624316&view=diff
==============================================================================
--- opennlp/trunk/opennlp-docs/src/docbkx/introduction.xml (original)
+++ opennlp/trunk/opennlp-docs/src/docbkx/introduction.xml Thu Sep 11 15:33:29 2014
@@ -178,6 +178,7 @@ where TOOL is one of:
ChunkerConverter converts ad data format to native OpenNLP format
Parser performs full syntactic parsing
ParserTrainer trains the learnable parser
+ ParserEvaluator measures the performance of the Parser model against reference data
BuildModelUpdater trains and updates the build model in a parser model
CheckModelUpdater trains and updates the check model in a parser model
TaggerModelReplacer replaces the tagger model in a parser model
Modified: opennlp/trunk/opennlp-docs/src/docbkx/parser.xml
URL: http://svn.apache.org/viewvc/opennlp/trunk/opennlp-docs/src/docbkx/parser.xml?rev=1624316&r1=1624315&r2=1624316&view=diff
==============================================================================
--- opennlp/trunk/opennlp-docs/src/docbkx/parser.xml (original)
+++ opennlp/trunk/opennlp-docs/src/docbkx/parser.xml Thu Sep 11 15:33:29 2014
@@ -35,7 +35,7 @@ under the License.
<para>
The easiest way to try out the Parser is the command line tool.
The tool is only intended for demonstration and testing.
- Download the english chunking parser model from the our website and start the Parse
+ Download the English chunking parser model from our website and start the Parse
Tool with the following command.
<screen>
<![CDATA[
@@ -59,7 +59,7 @@ The quick brown fox jumps over the lazy
$ opennlp Parser en-parser.bin en-parser-chunking.bin < article-tokenized.txt > article-parsed.txt.]]>
</screen>
The article-tokenized.txt file must contain one sentence per line which is
- tokenized with the english tokenizer model from our website.
+ tokenized with the English tokenizer model from our website.
See the Tokenizer documentation for further details.
</para>
</section>
@@ -209,4 +209,52 @@ $ opennlp TaggerModelReplacer en-parser-
</para>
</section>
</section>
+ <section id="tools.parser.evaluation">
+ <title>Parser Evaluation</title>
+ <para>
+ The built-in evaluation tool measures parser performance
+ on a test dataset.
+ </para>
+ <section id="tools.parser.evaluation.tool">
+ <title>Parser Evaluation Tool</title>
+ <para>
+ The following command shows how the tool can be run:
+ <screen>
+ <![CDATA[
+$ opennlp ParserEvaluator
+Usage: opennlp ParserEvaluator[.ontonotes|frenchtreebank] [-misclassified true|false] -model model \
+ -data sampleData [-encoding charsetName]]]>
+ </screen>
+ For example, assuming you have evaluation data named
+ en-parser-chunking.eval
+ and a trained model called en-parser-chunking.bin:
+ <screen>
+ <![CDATA[
+$ opennlp ParserEvaluator -model en-parser-chunking.bin -lang en -data en-parser-chunking.eval -encoding UTF-8]]>
+ </screen>
+ and here is a sample output:
+ <screen>
+ <![CDATA[
+Precision: 0.9009744742967609
+Recall: 0.8962012400910446
+F-Measure: 0.8985815184245214]]>
+ </screen>
+ </para>
+ <para>
+ The Parser Evaluation tool reimplements the PARSEVAL scoring method
+ as implemented by the
+ <ulink url="http://nlp.cs.nyu.edu/evalb/">EVALB</ulink>
+ script, which is the most widely used evaluation
+ tool for constituent parsing. Note, however, that the Parser
+ Evaluation tool does not currently allow
+ excluding particular constituents from the evaluation, in the way
+ Collins or Bikel usually do. Any
+ contributions are very welcome. If you want to contribute, please contact us on
+ the mailing list or comment
+ on the Jira issue
+ <ulink url="https://issues.apache.org/jira/browse/OPENNLP-688">OPENNLP-688</ulink>.
+ </para>
+ </section>
+ </section>
</chapter>
\ No newline at end of file
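
Note on the scores reported in the parser.xml change above: the following is a minimal sketch of PARSEVAL-style labeled bracket scoring, not the ParserEvaluator implementation. It assumes a toy tree representation (nested tuples of the form (label, child, ...), with plain strings as tokens and no part-of-speech layer), and the names brackets, f_measure and parseval are hypothetical helpers introduced only for illustration. It counts every labeled constituent and makes no exceptions for particular labels or punctuation.

```python
from collections import Counter


def brackets(tree):
    """Return a multiset of labeled spans (label, start, end) for one tree.

    Assumed toy format: a tree is a tuple (label, child, child, ...),
    and a leaf child is a token given as a plain string.
    """
    spans = Counter()

    def walk(node, start):
        if isinstance(node, str):      # a token covers one position
            return start + 1
        label, *children = node
        end = start
        for child in children:
            end = walk(child, end)
        spans[(label, start, end)] += 1
        return end

    walk(tree, 0)
    return spans


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def parseval(gold_trees, predicted_trees):
    """Labeled bracket precision, recall and F-measure over a corpus."""
    matched = gold_total = pred_total = 0
    for gold, pred in zip(gold_trees, predicted_trees):
        g, p = brackets(gold), brackets(pred)
        matched += sum((g & p).values())   # brackets present in both trees
        gold_total += sum(g.values())
        pred_total += sum(p.values())
    precision = matched / pred_total if pred_total else 0.0
    recall = matched / gold_total if gold_total else 0.0
    return precision, recall, f_measure(precision, recall)


# Hypothetical one-sentence example: only the S bracket is shared.
gold = ("S", ("NP", "the", "quick", "fox"), ("VP", "jumps"))
pred = ("S", ("NP", "the", "quick"), ("VP", "fox", "jumps"))
p, r, f = parseval([gold], [pred])
print(p, r, f)
```

Applying f_measure to the Precision and Recall values in the sample output above reproduces the reported F-Measure up to floating-point rounding, which is consistent with the F-Measure being the harmonic mean of the two.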