Posted to commits@opennlp.apache.org by aa...@apache.org on 2023/04/25 08:00:22 UTC
[opennlp] branch main updated: OPENNLP-1482 : Documentation for OpenNLP Eval Test Data (#531)
This is an automated email from the ASF dual-hosted git repository.
aarora pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/opennlp.git
The following commit(s) were added to refs/heads/main by this push:
new 265e643c OPENNLP-1482 : Documentation for OpenNLP Eval Test Data (#531)
265e643c is described below
commit 265e643cd85fdeaea9a13b1e4e861bb1aa8226e3
Author: Atita Arora <at...@users.noreply.github.com>
AuthorDate: Tue Apr 25 10:00:15 2023 +0200
OPENNLP-1482 : Documentation for OpenNLP Eval Test Data (#531)
* OPENNLP-1482 : Documentation for OpenNLP Eval Test Data
* OPENNLP-1482 : PR Feedback accommodated
---
opennlp-docs/src/docbkx/evaltest.xml | 80 ++++++++++++++++++++++++++++++++++++
opennlp-docs/src/docbkx/opennlp.xml | 1 +
2 files changed, 81 insertions(+)
diff --git a/opennlp-docs/src/docbkx/evaltest.xml b/opennlp-docs/src/docbkx/evaltest.xml
new file mode 100644
index 00000000..2641fe84
--- /dev/null
+++ b/opennlp-docs/src/docbkx/evaltest.xml
@@ -0,0 +1,80 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
+"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"[
+]>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+<chapter id="opennlp.evaltest">
+<title>Evaluation Test Data</title>
+ <section id="opennlp.evaltest.whatisit">
+ <title>What is it?</title>
+ <para>
+ The evaluation test data is the data used in the tests that evaluate functionality and performance of
+ OpenNLP.
+ These tests ensure reliability and can help identify potential bugs, errors, or performance issues.
+ </para>
+ <para>
+ The evaluation tests leverage the k-fold cross-validation procedure.
+ This technique works by dividing the evaluation data into <code>k</code> equally sized parts or folds.
+ The algorithm is then trained on <code>k-1</code> of the folds and tested on the remaining fold.
+ This process is repeated <code>k</code> times, so that each of the <code>k</code> folds is used exactly once as the test data,
+ and the results of each fold are combined to produce an overall estimate of the algorithm's performance.
+ </para>
+ </section>
+ <section id="opennlp.evaltest.whereisit">
+ <title>Where is it?</title>
+ <para>
+ The OpenNLP evaluation test data is available at <ulink url="https://nightlies.apache.org/opennlp/">
+ https://nightlies.apache.org/opennlp/</ulink> (file name: <code>opennlp-data.zip</code>).
+ </para>
+ <para>
+ The evaluation tests build on Jenkins is available at <ulink url="https://builds.apache.org/job/OpenNLP/">
+ https://builds.apache.org/job/OpenNLP/</ulink>
+ </para>
+ </section>
+ <section id="opennlp.evaltest.howtouseit">
+ <title>How to use the evaluation test data to run the tests?</title>
+ <para>
+ Download the evaluation test data, save it in a directory of your choice, and run the
+ OpenNLP evaluation tests as shown below:
+ <screen>
+ <![CDATA[
+mvn test -DOPENNLP_DATA_DIR=/path/to/opennlp-eval-test-data/ -Peval-tests
+ ]]>
+ </screen>
+ </para>
+ </section>
+ <section id="opennlp.evaltest.howtochangeit">
+ <title>How to change the evaluation data?</title>
+ <para>
+ The OpenNLP evaluation tests use <code><ulink url="https://nightlies.apache.org/">nightlies.apache.org</ulink></code> to
+ share data for testing and for release candidate builds.
+ You can upload a new <code>opennlp-data.zip</code> to <code>nightlies.apache.org</code> as shown below:
+ <screen>
+ <![CDATA[
+curl -u your_asf_username -T ./opennlp-data.zip "https://nightlies.apache.org/opennlp/"
+ ]]>
+ </screen>
+ More information about changing the evaluation test data on <code>nightlies.apache.org</code> can be found
+ at <ulink url="https://nightlies.apache.org/authoring.html">https://nightlies.apache.org/authoring.html
+ </ulink>.
+ </para>
+ </section>
+</chapter>
diff --git a/opennlp-docs/src/docbkx/opennlp.xml b/opennlp-docs/src/docbkx/opennlp.xml
index 75184b91..00c55d53 100644
--- a/opennlp-docs/src/docbkx/opennlp.xml
+++ b/opennlp-docs/src/docbkx/opennlp.xml
@@ -92,4 +92,5 @@ under the License.
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="./uima-integration.xml" />
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="./morfologik-addon.xml" />
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="./cli.xml" />
+ <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="evaltest.xml" />
</book>
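
Note on the k-fold procedure described in evaltest.xml above: the splitting step can be illustrated with a minimal sketch. This is not code from the commit and not necessarily how the OpenNLP evaluation tests partition their data. The class name KFoldSketch, the method names, and the round-robin assignment of samples to folds are all assumptions made for illustration. With round-robin assignment the folds differ in size by at most one sample when the sample count is not a multiple of k.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of k-fold splitting; not part of OpenNLP.
public class KFoldSketch {

    /** Indices of the samples held out for testing in the given fold (round-robin assignment). */
    public static List<Integer> testIndices(int sampleCount, int k, int fold) {
        List<Integer> indices = new ArrayList<>();
        for (int i = fold; i < sampleCount; i += k) {
            indices.add(i);
        }
        return indices;
    }

    /** Indices of the remaining samples, i.e. the k-1 folds used for training. */
    public static List<Integer> trainIndices(int sampleCount, int k, int fold) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < sampleCount; i++) {
            if (i % k != fold) {
                indices.add(i);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        int sampleCount = 10; // assumed number of evaluation samples
        int k = 5;            // assumed number of folds
        // Each fold serves exactly once as test data; the other k-1 folds are the training data.
        for (int fold = 0; fold < k; fold++) {
            System.out.println("fold " + fold
                + " train=" + trainIndices(sampleCount, k, fold)
                + " test=" + testIndices(sampleCount, k, fold));
        }
    }
}
```

In a real evaluation, a model would be trained on the training indices of each fold and scored on the test indices, and the k scores would then be combined (for example, averaged) into the overall estimate.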