Posted to dev@uima.apache.org by "Marshall Schor (Jira)" <de...@uima.apache.org> on 2019/11/22 21:40:00 UTC

[jira] [Commented] (UIMA-6152) "trim" method for AnnotationFS

    [ https://issues.apache.org/jira/browse/UIMA-6152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980528#comment-16980528 ] 

Marshall Schor commented on UIMA-6152:
--------------------------------------

I'm wondering which definition of "whitespace" makes the most sense here. Some possibilities:
 * Character.isWhitespace(some_string.charAt(some_offset)) - handles lots of variants, but looks at one UTF-16 char at a time, so it doesn't handle supplementary characters.
 * Character.isWhitespace(some_string.codePointAt(some_offset)) - also handles the supplementary characters.

I'm thinking the 2nd would be best?
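
To make the 2nd option concrete, here is a minimal sketch of how a code-point-based trim over begin/end offsets might look. It is only an illustration: the class name TrimSketch and the method trim(CharSequence, int, int) are made up for this example and are not part of the UIMA API. It uses only java.lang.Character calls (codePointAt, codePointBefore, charCount, isWhitespace).

```java
// Hypothetical helper, not UIMA code: narrows a [begin, end) span over some
// text so that it neither starts nor ends with whitespace.
final class TrimSketch {

    private TrimSketch() {}

    /**
     * Returns a two-element array {begin, end} with leading and trailing
     * whitespace excluded. If the span is all whitespace, begin == end.
     * Assumes 0 <= begin <= end <= text.length().
     */
    static int[] trim(CharSequence text, int begin, int end) {
        // Advance begin past leading whitespace, one code point at a time.
        while (begin < end) {
            int cp = Character.codePointAt(text, begin);
            if (!Character.isWhitespace(cp)) {
                break;
            }
            begin += Character.charCount(cp); // 1 for BMP, 2 for supplementary
        }
        // Move end back over trailing whitespace, one code point at a time.
        while (end > begin) {
            int cp = Character.codePointBefore(text, end);
            if (!Character.isWhitespace(cp)) {
                break;
            }
            end -= Character.charCount(cp);
        }
        return new int[] { begin, end };
    }
}
```

An annotation-level trim() could then presumably call something like this with the document text and the annotation's current begin/end, and write the two results back. As far as I know, every code point that Character.isWhitespace currently accepts lies in the BMP, so both options should give the same answer on today's data; the code point form mainly avoids ever testing half of a surrogate pair.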

 

> "trim" method for AnnotationFS
> ------------------------------
>
>                 Key: UIMA-6152
>                 URL: https://issues.apache.org/jira/browse/UIMA-6152
>             Project: UIMA
>          Issue Type: New Feature
>          Components: UIMA
>            Reporter: Richard Eckart de Castilho
>            Priority: Minor
>
> Usually leading or trailing whitespace is not helpful in text annotations. Thus, it would be nice to have a `trim()` convenience method on AnnotationFS that increases begin and decreases end so that leading and trailing whitespace is excluded from the annotation.


