Posted to commits@uima.apache.org by to...@apache.org on 2010/06/04 12:27:31 UTC
svn commit: r951352 - in /uima/sandbox/trunk:
AlchemyAPIAnnotator/src/docbook/AlchemyAPIAnnotatorUserGuide.xml
ConfigurableFeatureExtractor/src/docbook/CFE_UG.xml
Tagger/src/docbook/hmmTaggerUsersGuide.xml
Author: tommaso
Date: Fri Jun 4 10:27:30 2010
New Revision: 951352
URL: http://svn.apache.org/viewvc?rev=951352&view=rev
Log:
[UIMA-1051] - fixed code snippets parts with longer lines going out of the margin of the page, fixed larger images going out of the margin of the page
Modified:
uima/sandbox/trunk/AlchemyAPIAnnotator/src/docbook/AlchemyAPIAnnotatorUserGuide.xml
uima/sandbox/trunk/ConfigurableFeatureExtractor/src/docbook/CFE_UG.xml
uima/sandbox/trunk/Tagger/src/docbook/hmmTaggerUsersGuide.xml
Modified: uima/sandbox/trunk/AlchemyAPIAnnotator/src/docbook/AlchemyAPIAnnotatorUserGuide.xml
URL: http://svn.apache.org/viewvc/uima/sandbox/trunk/AlchemyAPIAnnotator/src/docbook/AlchemyAPIAnnotatorUserGuide.xml?rev=951352&r1=951351&r2=951352&view=diff
==============================================================================
--- uima/sandbox/trunk/AlchemyAPIAnnotator/src/docbook/AlchemyAPIAnnotatorUserGuide.xml (original)
+++ uima/sandbox/trunk/AlchemyAPIAnnotator/src/docbook/AlchemyAPIAnnotatorUserGuide.xml Fri Jun 4 10:27:30 2010
@@ -34,7 +34,7 @@
AlchemyAPI
( <ulink url="http://www.alchemyapi.com">http://www.alchemyapi.com</ulink>)
services provided by
- Orchestr8 ( http://www.orchestr8.net ).
+ Orchestr8 ( <ulink url="http://www.orchestr8.net">http://www.orchestr8.net</ulink> ).
</para>
<para>
To use AlchemyAPI Annotator, choose which service you want to
@@ -168,102 +168,296 @@
<itemizedlist>
<listitem>
<para>Afrikaans ISO-639-3: afr</para>
+ </listitem>
+ <listitem>
<para>Albanian ISO-639-3: sqi</para>
+ </listitem>
+ <listitem>
<para>Amharic ISO-639-3: amh</para>
+ </listitem>
+ <listitem>
<para>Amuzgo Guerrero ISO-639-3: amu</para>
+ </listitem>
+ <listitem>
<para>Arabic ISO-639-3: ara</para>
+ </listitem>
+ <listitem>
<para>Armenian ISO-639-3: hye</para>
+ </listitem>
+ <listitem>
<para>Azerbaijani ISO-639-3: aze</para>
+ </listitem>
+ <listitem>
<para>Basque ISO-639-3: eus</para>
+ </listitem>
+ <listitem>
<para>Breton ISO-639-3: bre</para>
+ </listitem>
+ <listitem>
<para>Bulgarian ISO-639-3: bul</para>
+ </listitem>
+ <listitem>
<para>Catalan ISO-639-3: cat</para>
+ </listitem>
+ <listitem>
<para>Cebuano ISO-639-3: ceb</para>
+ </listitem>
+ <listitem>
<para>Central K'iche' ISO-639-3: qut</para>
+ </listitem>
+ <listitem>
<para>Central Mam ISO-639-3: mvc</para>
+ </listitem>
+ <listitem>
<para>Chamorro ISO-639-3: cha</para>
+ </listitem>
+ <listitem>
<para>Cherokee ISO-639-3: chr</para>
+ </listitem>
+ <listitem>
<para>Chinese ISO-639-3: zho</para>
+ </listitem>
+ <listitem>
<para>Comaltepec Chinantec ISO-639-3: cco</para>
+ </listitem>
+ <listitem>
<para>Comaltepec Chinantec ISO-639-3: cco</para>
+ </listitem>
+ <listitem>
<para>Croatian ISO-639-3: hrv</para>
+ </listitem>
+ <listitem>
<para>Cubulco Achi' ISO-639-3: acc</para>
+ </listitem>
+ <listitem>
<para>Czech ISO-639-3: ces</para>
+ </listitem>
+ <listitem>
<para>Dakota ISO-639-3: dak</para>
+ </listitem>
+ <listitem>
<para>Danish ISO-639-3: dan</para>
+ </listitem>
+ <listitem>
<para>Dutch ISO-639-3: nld</para>
+ </listitem>
+ <listitem>
<para>English ISO-639-3: eng</para>
+ </listitem>
+ <listitem>
<para>Esperanto ISO-639-3: epo</para>
+ </listitem>
+ <listitem>
<para>Estonian ISO-639-3: est</para>
+ </listitem>
+ <listitem>
<para>Faroese ISO-639-3: fao</para>
+ </listitem>
+ <listitem>
<para>Fijian ISO-639-3: fij</para>
+ </listitem>
+ <listitem>
<para>Finnish ISO-639-3: fin</para>
+ </listitem>
+ <listitem>
<para>French ISO-639-3: fra</para>
+ </listitem>
+ <listitem>
<para>Fulfulde Adamawa ISO-639-3: fub</para>
+ </listitem>
+ <listitem>
<para>Georgian ISO-639-3: kat</para>
+ </listitem>
+ <listitem>
<para>German ISO-639-3: deu</para>
+ </listitem>
+ <listitem>
<para>Greek ISO-639-3: ell</para>
+ </listitem>
+ <listitem>
<para>Guerrero Nahuatl ISO-639-3: ngu</para>
+ </listitem>
+ <listitem>
<para>Gujarti ISO-639-3: guj</para>
+ </listitem>
+ <listitem>
<para>Haitian Creole ISO-639-3: hat</para>
+ </listitem>
+ <listitem>
<para>Hausa ISO-639-3: hau</para>
+ </listitem>
+ <listitem>
<para>Hawaiian ISO-639-3: haw</para>
+ </listitem>
+ <listitem>
<para>Hebrew ISO-639-3: heb</para>
+ </listitem>
+ <listitem>
<para>Hiligaynon ISO-639-3: hil</para>
+ </listitem>
+ <listitem>
<para>Hindi ISO-639-3: hin</para>
+ </listitem>
+ <listitem>
<para>Hungarian ISO-639-3: hun</para>
+ </listitem>
+ <listitem>
<para>Icelandic ISO-639-3: isl</para>
+ </listitem>
+ <listitem>
<para>Indonesian ISO-639-3: ind</para>
+ </listitem>
+ <listitem>
<para>Irish ISO-639-3: gle</para>
+ </listitem>
+ <listitem>
<para>Italian ISO-639-3: ita</para>
+ </listitem>
+ <listitem>
<para>Jacalteco ISO-639-3: jac</para>
+ </listitem>
+ <listitem>
<para>Japanese ISO-639-3: jpn</para>
+ </listitem>
+ <listitem>
<para>Kabyle ISO-639-3: kab</para>
+ </listitem>
+ <listitem>
<para>Kaqchikel ISO-639-3: cak</para>
+ </listitem>
+ <listitem>
<para>Kirghiz ISO-639-3: kir</para>
+ </listitem>
+ <listitem>
<para>Kisongye ISO-639-3: sop</para>
+ </listitem>
+ <listitem>
<para>Korean ISO-639-3: kor</para>
+ </listitem>
+ <listitem>
<para>Latin ISO-639-3: lat</para>
+ </listitem>
+ <listitem>
<para>Latvian ISO-639-3: lav</para>
+ </listitem>
+ <listitem>
<para>Lithuanian ISO-639-3: lit</para>
+ </listitem>
+ <listitem>
<para>Low Saxon ISO-639-3: nds</para>
+ </listitem>
+ <listitem>
<para>Macedonian ISO-639-3: mkd</para>
+ </listitem>
+ <listitem>
<para>Malay ISO-639-3: msa</para>
+ </listitem>
+ <listitem>
<para>Maltese ISO-639-3: mlt</para>
+ </listitem>
+ <listitem>
<para>Maori ISO-639-3: mri</para>
+ </listitem>
+ <listitem>
<para>Micmac ISO-639-3: mic</para>
+ </listitem>
+ <listitem>
<para>Mòoré ISO-639-3: mos</para>
+ </listitem>
+ <listitem>
<para>Ndebele ISO-639-3: nde</para>
+ </listitem>
+ <listitem>
<para>Nepali ISO-639-3: nep</para>
+ </listitem>
+ <listitem>
<para>Norwegian ISO-639-3: nor</para>
+ </listitem>
+ <listitem>
<para>Ojibwa ISO-639-3: oji</para>
+ </listitem>
+ <listitem>
<para>Pashto ISO-639-3: pus</para>
+ </listitem>
+ <listitem>
<para>Persian ISO-639-3: fas</para>
+ </listitem>
+ <listitem>
<para>Polish ISO-639-3: pol</para>
+ </listitem>
+ <listitem>
<para>Portuguese ISO-639-3: por</para>
+ </listitem>
+ <listitem>
<para>Q'eqchi' ISO-639-3: kek</para>
+ </listitem>
+ <listitem>
<para>Romanian ISO-639-3: ron</para>
+ </listitem>
+ <listitem>
<para>Romani ISO-639-3: rom</para>
+ </listitem>
+ <listitem>
<para>Russian ISO-639-3: rus</para>
+ </listitem>
+ <listitem>
<para>Serbian ISO-639-3: srp</para>
+ </listitem>
+ <listitem>
<para>Shona ISO-639-3: sna</para>
+ </listitem>
+ <listitem>
<para>Shuar ISO-639-3: jiv</para>
+ </listitem>
+ <listitem>
<para>Slovak ISO-639-3: slk</para>
+ </listitem>
+ <listitem>
<para>Slovenian ISO-639-3: slv</para>
+ </listitem>
+ <listitem>
<para>Spanish ISO-639-3: spa</para>
+ </listitem>
+ <listitem>
<para>Swahili ISO-639-3: swa</para>
+ </listitem>
+ <listitem>
<para>Swedish ISO-639-3: swe</para>
+ </listitem>
+ <listitem>
<para>Tagalog ISO-639-3: tgl</para>
+ </listitem>
+ <listitem>
<para>Thai ISO-639-3: tha</para>
+ </listitem>
+ <listitem>
<para>Todos Santos Cuchumatan Mám ISO-639-3: mvj</para>
+ </listitem>
+ <listitem>
<para>Turkish ISO-639-3: tur</para>
+ </listitem>
+ <listitem>
<para>Ukrainian ISO-639-3: ukr</para>
+ </listitem>
+ <listitem>
<para>Urdu ISO-639-3: urd</para>
+ </listitem>
+ <listitem>
<para>Uspanteco ISO-639-3: usp</para>
+ </listitem>
+ <listitem>
<para>Vietnamese ISO-639-3: vie</para>
+ </listitem>
+ <listitem>
<para>Welsh ISO-639-3: cym</para>
+ </listitem>
+ <listitem>
<para>Wolof ISO-639-3: wol</para>
+ </listitem>
+ <listitem>
<para>Xhosa ISO-639-3: xho</para>
+ </listitem>
+ <listitem>
<para>Zarma ISO-639-3: ssa</para>
</listitem>
</itemizedlist>
Modified: uima/sandbox/trunk/ConfigurableFeatureExtractor/src/docbook/CFE_UG.xml
URL: http://svn.apache.org/viewvc/uima/sandbox/trunk/ConfigurableFeatureExtractor/src/docbook/CFE_UG.xml?rev=951352&r1=951351&r2=951352&view=diff
==============================================================================
--- uima/sandbox/trunk/ConfigurableFeatureExtractor/src/docbook/CFE_UG.xml (original)
+++ uima/sandbox/trunk/ConfigurableFeatureExtractor/src/docbook/CFE_UG.xml Fri Jun 4 10:27:30 2010
@@ -90,7 +90,7 @@
<para>
<inlinemediaobject>
<imageobject>
- <imagedata fileref="&imgroot;/CFE_UG-1.jpg" />
+ <imagedata fileref="&imgroot;/CFE_UG-1.jpg" width="93%"/>
</imageobject>
</inlinemediaobject>
</para>
@@ -1024,7 +1024,7 @@
<para>
<inlinemediaobject>
<imageobject>
- <imagedata fileref="&imgroot;/CFE_UG-14.jpg" align="center"/>
+ <imagedata fileref="&imgroot;/CFE_UG-14.jpg" align="center" width="93%"/>
</imageobject>
</inlinemediaobject>
</para>
Modified: uima/sandbox/trunk/Tagger/src/docbook/hmmTaggerUsersGuide.xml
URL: http://svn.apache.org/viewvc/uima/sandbox/trunk/Tagger/src/docbook/hmmTaggerUsersGuide.xml?rev=951352&r1=951351&r2=951352&view=diff
==============================================================================
--- uima/sandbox/trunk/Tagger/src/docbook/hmmTaggerUsersGuide.xml (original)
+++ uima/sandbox/trunk/Tagger/src/docbook/hmmTaggerUsersGuide.xml Fri Jun 4 10:27:30 2010
@@ -161,7 +161,9 @@
<fileResourceSpecifier>
<fileUrl>file:german/TuebaModel.dat</fileUrl>
</fileResourceSpecifier>
- <implementationName>org.apache.uima.examples.tagger.ModelResource</implementationName>
+ <implementationName>
+ org.apache.uima.examples.tagger.ModelResource
+ </implementationName>
</externalResource>
</externalResources>]]></emphasis></programlisting>
@@ -190,7 +192,9 @@ After these two parameters have been set
<capability>
<inputs>
<type>org.apache.uima.TokenAnnotation</type>
- <type allAnnotatorFeatures="true">org.apache.uima.SentenceAnnotation</type>
+ <type allAnnotatorFeatures="true">
+ org.apache.uima.SentenceAnnotation
+ </type>
<feature>org.apache.uima.TokenAnnotation:end</feature>
<feature>org.apache.uima.TokenAnnotation:begin</feature>
</inputs>
@@ -218,7 +222,8 @@ After these two parameters have been set
The used model is:resources/german/TuebaModel.dat
61646 distinct words in the model
Number of part-of-speech tags used: 54
-These are: [$(, $,, $., ADJA, ADJD, ADV, APPO, APPR, APPRART, APZR, ART, CARD, ... ]
+These are: [$(, $,, $., ADJA, ADJD, ADV, APPO,
+ APPR, APPRART, APZR, ART, CARD, ... ]
Testing German trigram tagger..
[Jerry, liebt, Wansley, .]
expected: [NE, VVFIN, NE, $.]
@@ -229,7 +234,8 @@ Tesing English Model...
The used model is:resources/english/BrownModel.dat
56012 distinct words in the model
Number of part-of-speech tags used: 473
-These are: [', '', (, ), *, ,, --, ., :, ``, abl, abn, abx, ap, ap$, at, be, bed, ...]
+These are: [', '', (, ), *, ,, --, ., :, ``, abl,
+ abn, abx, ap, ap$, at, be, bed, ...]
Testing English trigram tagger...
[Jerry, loves, Wansley, .]
expected: [np, vbz, np, .]
@@ -371,13 +377,15 @@ Very Good!]]></programlisting>
<code>ModelGeneration.java</code>
<programlisting id="properties.file" xreflabel="tagger.properties file"><emphasis><![CDATA[######## This is the default tagger.properties file
######## This file is used for training and testing only,
-######## The configuration for tagging is directly tuned in the descriptor "HmmTagger.xml"
+######## The configuration for tagging is directly
+######## tuned in the descriptor "HmmTagger.xml"
-########################## BOTH FOR TRAINING AND EVALUATION ################################
+########################## BOTH FOR TRAINING AND EVALUATION ########
######## THESE ARE THE DEFAULT MODEL FILES FOR GERMAN AND ENGLISH
-######## You can either uncomment one of them, if you want to replace given models with your own one,
+######## You can either uncomment one of them, if you want to replace
+######## given models with your own one,
#MODEL_FILE = resources/german/TuebaModel.dat
#MODEL_FILE = resources/english/BrownModel.dat
@@ -391,12 +399,15 @@ MODEL_FILE =
####### EXAMPLES OF MAPPING CLASSES
-## Basic mapping for the Brown corpus (nltk distribution) tagset: to get 93 tags out of 473
+## Basic mapping for the Brown corpus (nltk distribution) tagset:
+## to get 93 tags out of 473
#MAPPING = org.apache.uima.examples.tagger.TagMappingBrown
-## Basic mapping for STTS tagset: from 54 tags onto the basic ca. 15 classes plus punctuation
+## Basic mapping for STTS tagset: from 54 tags onto the basic
+## ca. 15 classes plus punctuation
#MAPPING = org.apache.uima.examples.tagger.GrobMappingTueba
-## If you implement your own mapping, you should specify here in the same manner as above a java-path to the class
+## If you implement your own mapping, you should specify here in
+## the same manner as above a java-path to the class
MAPPING =
####### FILE CONTAINING TRAINING CORPUS:
@@ -404,21 +415,25 @@ MAPPING =
####### e.g. FILE = ../../tueba_tigerFormat.txt or FILE = C:/Data/tueba.txt
FILE =
-######## If corpus is in a different format and cannot be read with the provided READERS,
-######## you should specify here a java-path to the class (s. examples below)
+######## If corpus is in a different format and
+######## cannot be read with the provided READERS,
+######## you should specify here a java-path to the
+######## class (s. examples below)
-#CORPUS_READER = org.apache.uima.examples.tagger.trainAndTest.TT_FormatReader
-#CORPUS_READER = org.apache.uima.examples.tagger.trainAndTest.BrownReader
+#CORPUS_READER=org.apache.uima.examples.tagger.trainAndTest.TT_FormatReader
+#CORPUS_READER=org.apache.uima.examples.tagger.trainAndTest.BrownReader
CORPUS_READER =
-################# ONLY FOR EVALUATION ###############################
+################# ONLY FOR EVALUATION ######################
######### GOLD STANDARD CORPUS FILE:
######### can be specified as an absolute or as a relative path
-## e.g. GOLD_STANDARD = ../../tueba_tigerFormat.txt or GOLD_STANDARD = C:/Data/tueba.txt
+## e.g. GOLD_STANDARD = ../../tueba_tigerFormat.txt or
+## GOLD_STANDARD = C:/Data/tueba.txt
GOLD_STANDARD =
-######### Here we specify whether one intends to test a bi- or a trigram model (default is a trigram model)
+######### Here we specify whether one intends to test a bi- or a
+######### trigram model (default is a trigram model)
N=3
]]></emphasis>
</programlisting>