Posted to issues@opennlp.apache.org by "Joern Kottmann (JIRA)" <ji...@apache.org> on 2015/01/20 21:41:34 UTC

[jira] [Updated] (OPENNLP-715) Clark clusters NameFinder features

     [ https://issues.apache.org/jira/browse/OPENNLP-715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joern Kottmann updated OPENNLP-715:
-----------------------------------
    Fix Version/s:     (was: 1.6.0)
                   1.6.1

> Clark clusters NameFinder features
> ----------------------------------
>
>                 Key: OPENNLP-715
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-715
>             Project: OpenNLP
>          Issue Type: New Feature
>          Components: Name Finder
>    Affects Versions: 1.6.0
>            Reporter: Rodrigo Agerri
>            Assignee: Rodrigo Agerri
>            Priority: Minor
>             Fix For: 1.6.1
>
>
> Add token-based features from Clark clusters (Clark 2003). This feature is essentially the same as the one implemented in the WordClusterFeatureGenerator, but the two should be kept separate (perhaps by giving each one a dynamic prefix id, as in the dictionary features), since combining these clustering-based features has been shown to improve results.
> Clark clusters can be generated using this tool: 
> https://github.com/ninjin/clark_pos_induction
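
The dynamic-prefix idea in the description can be sketched as follows. This is only an illustration, not OpenNLP's actual API: the class name PrefixedClusterFeatures, the "prefix=clusterId" feature string, and the assumed lexicon layout (one "token clusterId [probability]" entry per whitespace-separated line) are all hypothetical. The point is that two cluster lexicons loaded under different prefixes emit distinct features for the same token, so a model could weight them independently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch of a token-level cluster feature generator whose
 * feature names carry a per-lexicon prefix. Not part of OpenNLP.
 */
class PrefixedClusterFeatures {

    private final String prefix;
    private final Map<String, String> tokenToCluster = new HashMap<>();

    /**
     * @param prefix       id that distinguishes this lexicon's features
     * @param lexiconLines lines assumed to look like "token clusterId [probability]"
     */
    PrefixedClusterFeatures(String prefix, List<String> lexiconLines) {
        this.prefix = prefix;
        for (String line : lexiconLines) {
            String[] cols = line.trim().split("\\s+");
            // Skip blank or malformed lines; any column after the second is ignored.
            if (cols.length >= 2) {
                tokenToCluster.put(cols[0], cols[1]);
            }
        }
    }

    /** Appends "prefix=clusterId" if the token is in the lexicon; otherwise adds nothing. */
    void addFeatures(List<String> features, String token) {
        String cluster = tokenToCluster.get(token);
        if (cluster != null) {
            features.add(prefix + "=" + cluster);
        }
    }
}
```

With one instance built under the prefix "clark" and another under "wordcluster", a token found in both lexicons would yield two separate features (for example "clark=12" and "wordcluster=0110") rather than two indistinguishable ones. Lookup here is case-sensitive; whether tokens should be normalized first is left open.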


