Posted to issues@opennlp.apache.org by "Jeff Zemerick (Jira)" <ji...@apache.org> on 2022/06/07 11:58:00 UTC

[jira] [Updated] (OPENNLP-1306) NameSample overlap exception not helpful

     [ https://issues.apache.org/jira/browse/OPENNLP-1306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zemerick updated OPENNLP-1306:
-----------------------------------
    Fix Version/s: 2.0.0
                       (was: 1.9.5)

> NameSample overlap exception not helpful
> ----------------------------------------
>
>                 Key: OPENNLP-1306
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1306
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Name Finder
>    Affects Versions: 1.9.2
>            Reporter: Markus Jelsma
>            Priority: Major
>             Fix For: 2.0.0
>
>         Attachments: OPENNLP-1306.patch
>
>
> I got this exception with a very large training file.
> {code:java}
>          Computing event counts...  Exception in thread "main" java.lang.RuntimeException: name spans [27..29) person and [27..27) person are overlapped in file: null
>         at opennlp.tools.namefind.NameSample.<init>(NameSample.java:79)
>         at opennlp.tools.namefind.NameSample.<init>(NameSample.java:97)
>         at opennlp.tools.namefind.NameSample.<init>(NameSample.java:101)
> {code}
> With this exception it is impossible to track down the error if you have a large training file.
>  
> Exceptions about mismatching <START:> and <END> tags at least give a little bit of context. This patch adds the sentence parts to the exception, making it simple to grep the training file for the bad sentence.
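
The attached patch is not reproduced in this message. As a rough illustration of the idea only, the sketch below appends the sentence tokens to the overlap message so the offending line can be grepped. The names `OverlapMessageSketch`, `NameSpan`, and `checkNoOverlap` are hypothetical, the overlap test is a simplified assumption, and none of this is the actual `NameSample` or `Span` code from OpenNLP.

```java
import java.util.Arrays;
import java.util.List;

public class OverlapMessageSketch {

    /** Hypothetical stand-in for a name span: token range [start, end) plus a type. */
    static final class NameSpan {
        final int start;
        final int end;
        final String type;

        NameSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        /** Simplified overlap test (an assumption): one span starts at or inside the other. */
        boolean overlaps(NameSpan other) {
            return (start <= other.start && other.start < end)
                || (other.start <= start && start < other.end);
        }

        @Override
        public String toString() {
            return "[" + start + ".." + end + ") " + type;
        }
    }

    /** Throws if any two spans overlap, quoting the sentence tokens in the message. */
    static void checkNoOverlap(String[] sentence, List<NameSpan> names) {
        for (int i = 0; i < names.size(); i++) {
            for (int j = i + 1; j < names.size(); j++) {
                NameSpan a = names.get(i);
                NameSpan b = names.get(j);
                if (a.overlaps(b)) {
                    throw new IllegalArgumentException("name spans " + a + " and " + b
                        + " are overlapped in sentence: " + String.join(" ", sentence));
                }
            }
        }
    }

    public static void main(String[] args) {
        // Made-up sentence and spans, purely for illustration.
        String[] sentence = {"Alice", "Smith", "met", "Bob", "."};
        List<NameSpan> names = Arrays.asList(
            new NameSpan(0, 2, "person"),
            new NameSpan(0, 1, "person"));
        try {
            checkNoOverlap(sentence, names);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a message of this shape, searching the training file for the quoted tokens (for example with `grep -n`) should locate the bad sentence, provided the file stores sentences as space-separated tokens.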


