Posted to issues@opennlp.apache.org by "William Colen (Commented) (JIRA)" <ji...@apache.org> on 2012/03/18 16:30:41 UTC

[jira] [Commented] (OPENNLP-479) Features related to abbreviation dictionary are not properly collected by DefaultSDContextGenerator

    [ https://issues.apache.org/jira/browse/OPENNLP-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232286#comment-13232286 ] 

William Colen commented on OPENNLP-479:
---------------------------------------

I changed DefaultSDContextGenerator on the assumption that the correct form for abbreviation dictionary entries includes the trailing period, e.g. "mr.". Please review.
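
To make the assumed convention concrete, here is a minimal sketch of a lookup that treats dictionary entries as including the EOS character. It is not the actual DefaultSDContextGenerator code: the class name, the method endsWithAbbreviation, and the dictionary contents are all hypothetical, and lowercase matching is an assumption made for the illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class AbbreviationLookupSketch {

    // Hypothetical dictionary: entries are lowercase and INCLUDE the trailing period.
    public static final Set<String> ABBREVIATIONS =
            new HashSet<>(Arrays.asList("mr.", "dr.", "etc."));

    /**
     * Returns true if the whitespace-delimited token that ends at eosIndex
     * (inclusive, so the EOS character is part of the token) is in the dictionary.
     */
    public static boolean endsWithAbbreviation(CharSequence text, int eosIndex) {
        // Walk back to the previous whitespace to find where the token starts.
        int start = eosIndex;
        while (start > 0 && !Character.isWhitespace(text.charAt(start - 1))) {
            start--;
        }
        // Keep the EOS character in the token so it matches entries such as "mr.".
        String token = text.subSequence(start, eosIndex + 1)
                .toString()
                .toLowerCase(Locale.ROOT);
        return ABBREVIATIONS.contains(token);
    }

    public static void main(String[] args) {
        String text = "Mr. Smith arrived.";
        System.out.println(endsWithAbbreviation(text, 2));                 // true
        System.out.println(endsWithAbbreviation(text, text.length() - 1)); // false
    }
}
```

With this convention, an entry stored as "mr" (no period) would never match, which is why every code path that consults the dictionary needs to agree on one form.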
                
> Features related to abbreviation dictionary are not properly collected by DefaultSDContextGenerator
> ---------------------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-479
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-479
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Sentence Detector
>    Affects Versions: tools-1.5.3
>            Reporter: William Colen
>            Assignee: William Colen
>             Fix For: tools-1.5.3
>
>
> The documentation is not clear about whether entries in the abbreviation dictionary should include the EOS character, for example "mr" or "mr.". Also, part of the collector code expects the dictionary entries to include the EOS character, while other parts do not.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira