Posted to issues@opennlp.apache.org by "Jeff Zemerick (JIRA)" <ji...@apache.org> on 2018/04/03 22:15:00 UTC

[jira] [Updated] (OPENNLP-1185) Tokenizers should be able to output a new line token

     [ https://issues.apache.org/jira/browse/OPENNLP-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zemerick updated OPENNLP-1185:
-----------------------------------
    Labels: ctakes  (was: )

> Tokenizers should be able to output a new line token
> ----------------------------------------------------
>
>                 Key: OPENNLP-1185
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1185
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Tokenizer
>            Reporter: Joern Kottmann
>            Assignee: Peter Thygesen
>            Priority: Major
>              Labels: ctakes
>
> Some use cases need the tokenizers to also output new line tokens. For example, cTAKES needs this to process clinical notes, and the name finder needs it to process lists of names where each name is written on its own line. It also helps the name finder process news articles.
> To fix this issue, add an option to all three tokenizers to emit new line tokens.
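
The requested behavior can be illustrated with a minimal sketch. It is not OpenNLP code and does not reflect how the change would actually be implemented: the class name NewlineTokenSketch, the method tokenize, and the flag emitNewlineTokens are all hypothetical, and emitting the literal "\n" string as the token is an assumption made here for illustration. The sketch splits on whitespace and, when the option is on, keeps each line break as its own token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT an OpenNLP class or API.
class NewlineTokenSketch {

    // Splits text on whitespace. When emitNewlineTokens is true, every '\n'
    // is additionally emitted as a separate "\n" token (an assumed convention).
    static List<String> tokenize(String text, boolean emitNewlineTokens) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace ends the token currently being built, if any.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
                // '\r' is treated as plain whitespace, so "\r\n" yields one token.
                if (c == '\n' && emitNewlineTokens) {
                    tokens.add("\n");
                }
            } else {
                current.append(c);
            }
        }
        // Flush the last token when the text does not end in whitespace.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // A list of names, one per line, as in the name finder use case.
        List<String> tokens = tokenize("John Smith\nJane Doe", true);
        for (String token : tokens) {
            System.out.println(token.equals("\n") ? "<newline>" : token);
        }
    }
}
```

With the flag off, the line break is dropped like any other whitespace, so a downstream component can no longer tell where one name ends and the next begins. With it on, the break survives as a token that a consumer such as the name finder could use as a boundary.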


