Posted to issues@opennlp.apache.org by "Łukasz Dróżdż (JIRA)" <ji...@apache.org> on 2015/04/16 17:42:59 UTC
[jira] [Commented] (OPENNLP-287) Extend POS Tagger documentation with more information about the tag dictionary
[ https://issues.apache.org/jira/browse/OPENNLP-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498191#comment-14498191 ]
Łukasz Dróżdż commented on OPENNLP-287:
---------------------------------------
Hi,
Here's my attempt at providing a sample POS dictionary file, along with test code for programmatic usage: reading the dictionary in, writing it back out, and using it to train a POS tagger. See the attached files for details.
The XML structure of a POS dictionary is:
<?xml version="1.0" encoding="UTF-8"?>
<dictionary>
  <entry tags="tag1 tag2">
    <token>token1</token>
  </entry>
  <entry tags="tag1">
    <token>token2</token>
  </entry>
</dictionary>
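To make the structure concrete, here is a minimal sketch that reads a file of this shape into a token-to-tags map using only the JDK's built-in XML parser. It is not OpenNLP's own loading code and is not taken from the attached TaggerDictionaryTest.java; the class name PosDictionarySketch, the SAMPLE constant, and the parse method are hypothetical names chosen for illustration. It assumes each entry holds exactly one token element and that tags are separated by whitespace, as in the sample.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of OpenNLP.
class PosDictionarySketch {

    // Sample input mirroring the dictionary structure shown above.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<dictionary>\n"
        + "  <entry tags=\"tag1 tag2\"><token>token1</token></entry>\n"
        + "  <entry tags=\"tag1\"><token>token2</token></entry>\n"
        + "</dictionary>\n";

    // Maps each token to the tags listed for it, keeping file order.
    // Assumption: one <token> per <entry>, tags separated by whitespace.
    static Map<String, List<String>> parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        Map<String, List<String>> tagsByToken = new LinkedHashMap<>();
        NodeList entries = doc.getElementsByTagName("entry");
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            String token = entry.getElementsByTagName("token")
                .item(0).getTextContent().trim();
            String[] tags = entry.getAttribute("tags").trim().split("\\s+");
            tagsByToken.put(token, Arrays.asList(tags));
        }
        return tagsByToken;
    }

    public static void main(String[] args) throws Exception {
        // Print every token with the tags the dictionary allows for it.
        parse(SAMPLE).forEach((token, tags) ->
            System.out.println(token + " -> " + tags));
    }
}
```

For real use, the dictionary should be loaded through OpenNLP's own classes as in the attached test code; the sketch only shows how the tags attribute and the token element relate.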
Hope that helps.
> Extend POS Tagger documentation with more information about the tag dictionary
> ------------------------------------------------------------------------------
>
> Key: OPENNLP-287
> URL: https://issues.apache.org/jira/browse/OPENNLP-287
> Project: OpenNLP
> Issue Type: Improvement
> Components: Documentation, POS Tagger
> Reporter: Joern Kottmann
> Priority: Minor
> Attachments: TaggerDictionaryTest.java, dictionary.xml, en-pos.train
>
>
> Extend the POS Tagger tag dictionary section as described in the documentation.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)