Posted to issues@opennlp.apache.org by "Koji Sekiguchi (JIRA)" <ji...@apache.org> on 2017/12/01 03:32:00 UTC

[jira] [Commented] (OPENNLP-1161) avoid using concrete tag names of XML config in GeneratorFactory.extractArtifactSerializerMappings()

    [ https://issues.apache.org/jira/browse/OPENNLP-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273875#comment-16273875 ] 

Koji Sekiguchi commented on OPENNLP-1161:
-----------------------------------------

This issue blocks OPENNLP-1154, in which I am trying to change the XML format from the classic one to a new one. The current implementation of GeneratorFactory.extractArtifactSerializerMappings() depends on the classic format.

> avoid using concrete tag names of XML config in GeneratorFactory.extractArtifactSerializerMappings()
> ----------------------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-1161
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1161
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Formats, Name Finder
>    Affects Versions: 1.8.3
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Blocker
>
> I noticed this while working on OPENNLP-1154.
> GeneratorFactory.extractArtifactSerializerMappings() hard-codes concrete XML tag names:
> {code:java}
>     for (int i = 0; i < allElements.getLength(); i++) {
>       if (allElements.item(i) instanceof Element) {
>         Element xmlElement = (Element) allElements.item(i);
>         String dictName = xmlElement.getAttribute("dict");
>         if (dictName != null) {
>           switch (xmlElement.getTagName()) {
>             case "wordcluster":
>               mapping.put(dictName, new WordClusterDictionary.WordClusterDictionarySerializer());
>               break;
>             case "brownclustertoken":
>               mapping.put(dictName, new BrownCluster.BrownClusterSerializer());
>               break;
>             case "brownclustertokenclass":
>               mapping.put(dictName, new BrownCluster.BrownClusterSerializer());
>               break;
>             case "brownclusterbigram":
>               mapping.put(dictName, new BrownCluster.BrownClusterSerializer());
>               break;
>             case "dictionary":
>               mapping.put(dictName, new DictionarySerializer());
>               break;
>           }
>         }
>         String modelName = xmlElement.getAttribute("model");
>         if (modelName != null) {
>           switch (xmlElement.getTagName()) {
>             case "tokenpos":
>               mapping.put(modelName, new POSModelSerializer());
>               break;
>           }
>         }
>       }
>     }
> {code}
> Instead, FeatureGeneratorFactories should implement a method that returns their own mapping (Map<String, ArtifactSerializer<?>>), and GeneratorFactory.extractArtifactSerializerMappings() should simply call that method on each of the FeatureGeneratorFactories found in the XML config.
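
A minimal sketch of the delegation proposed in the issue description, not the actual OpenNLP code. Everything in it is an assumption made for illustration: the FeatureGeneratorFactory interface, its getArtifactSerializerMapping(Element) method, the FACTORIES registry, and the empty stand-in serializer types are hypothetical, and the method takes the XML as a String only to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class SerializerMappingSketch {

  // Simplified stand-ins for the OpenNLP serializer types (assumption, not the real classes).
  interface ArtifactSerializer<T> { }
  static final class DictionarySerializer implements ArtifactSerializer<Object> { }
  static final class POSModelSerializer implements ArtifactSerializer<Object> { }

  // Hypothetical contract: each factory reports the serializers its own XML element needs.
  interface FeatureGeneratorFactory {
    Map<String, ArtifactSerializer<?>> getArtifactSerializerMapping(Element element);
  }

  // Maps the value of one attribute to a serializer. Element.getAttribute
  // returns "" (not null) when the attribute is absent, hence the isEmpty check.
  static Map<String, ArtifactSerializer<?>> forAttribute(
      Element element, String attribute, ArtifactSerializer<?> serializer) {
    Map<String, ArtifactSerializer<?>> mapping = new HashMap<>();
    String name = element.getAttribute(attribute);
    if (!name.isEmpty()) {
      mapping.put(name, serializer);
    }
    return mapping;
  }

  // Hypothetical registry from tag name to factory; only two factories are registered here.
  static final Map<String, FeatureGeneratorFactory> FACTORIES = new HashMap<>();
  static {
    FACTORIES.put("dictionary", e -> forAttribute(e, "dict", new DictionarySerializer()));
    FACTORIES.put("tokenpos", e -> forAttribute(e, "model", new POSModelSerializer()));
  }

  // The loop no longer switches on tag names; it asks whichever factory
  // is registered for the element and merges the mapping it returns.
  static Map<String, ArtifactSerializer<?>> extractArtifactSerializerMappings(String xml) {
    Map<String, ArtifactSerializer<?>> mapping = new HashMap<>();
    try {
      Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
          .getDocumentElement();
      NodeList allElements = root.getElementsByTagName("*"); // all descendant elements
      for (int i = 0; i < allElements.getLength(); i++) {
        Element element = (Element) allElements.item(i);
        FeatureGeneratorFactory factory = FACTORIES.get(element.getTagName());
        if (factory != null) {
          mapping.putAll(factory.getArtifactSerializerMapping(element));
        }
      }
    } catch (Exception e) {
      throw new IllegalStateException("Could not parse feature generator config", e);
    }
    return mapping;
  }

  public static void main(String[] args) {
    String xml = "<generators><window/><dictionary dict=\"d1\"/>"
        + "<tokenpos model=\"pos.bin\"/></generators>";
    // Prints the artifact names for which a serializer was registered.
    System.out.println(new TreeSet<>(extractArtifactSerializerMappings(xml).keySet()));
  }
}
```

In this sketch the tag names live only in the registry, so a change of XML format would touch the registration (or the factories themselves) and leave the extraction loop alone. How the real framework locates the factory for an element may well differ.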
