Posted to issues@opennlp.apache.org by "Joern Kottmann (JIRA)" <ji...@apache.org> on 2014/03/06 11:10:43 UTC

[jira] [Commented] (OPENNLP-193) Update POS Tagger cmd line trainer tool to use new xml tag dict format

    [ https://issues.apache.org/jira/browse/OPENNLP-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922252#comment-13922252 ] 

Joern Kottmann commented on OPENNLP-193:
----------------------------------------

William, what is your opinion? Should we make the XML dictionary mandatory, or should we stick to the old format?

The XML dictionary has the advantage that the file declares its own encoding, so there can't be any encoding issues.
If we stick to the old format, we could specify that the dictionary has to be UTF-8.
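
To illustrate the encoding point, here is a minimal Python sketch (not OpenNLP code). The sample word, the one-entry-per-line "word tag1 tag2" layout for the old format, and the XML element names are all assumptions made for the example. It shows that plain-text bytes are decoded with whatever charset the reader assumes, while an XML parser follows the encoding declared in the file itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical dictionary entry containing a non-ASCII character.
word, tags = "Mädchen", ["NN"]

# Plain text: the bytes themselves carry no charset information.
plain_bytes = f"{word} {' '.join(tags)}\n".encode("utf-8")


def read_plain(data: bytes, encoding: str) -> dict:
    """Parse 'word tag1 tag2 ...' lines, decoding with the given charset."""
    entries = {}
    for line in data.decode(encoding).splitlines():
        parts = line.split()
        if parts:
            entries[parts[0]] = parts[1:]
    return entries


# Correct only because the reader happens to agree on UTF-8.
utf8_entries = read_plain(plain_bytes, "utf-8")

# A reader that assumes another charset silently produces a garbled key.
latin1_entries = read_plain(plain_bytes, "latin-1")

# XML: the encoding is declared in the prolog and the parser honours it,
# so the reader does not need to be told the charset separately.
xml_bytes = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    f'<dictionary><entry tags="NN"><token>{word}</token></entry></dictionary>'
).encode("iso-8859-1")
xml_word = ET.fromstring(xml_bytes).find("entry/token").text
```

Mandating UTF-8 for the old format would amount to fixing the `encoding` argument of `read_plain` by specification rather than leaving it to the platform default.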

> Update POS Tagger cmd line trainer tool to use new xml tag dict format
> ----------------------------------------------------------------------
>
>                 Key: OPENNLP-193
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-193
>             Project: OpenNLP
>          Issue Type: Task
>          Components: Command Line Interface, POS Tagger
>            Reporter: Joern Kottmann
>            Priority: Minor
>             Fix For: 1.6.0
>
>
> The POS Tagger trainer cmd line tool still uses the old tag dict format for backward compatibility reasons. That format has been replaced by a new XML-based dictionary.
> Update the POS Tagger trainer tool to use only the new XML-based dictionary format.



--
This message was sent by Atlassian JIRA
(v6.2#6252)