Posted to dev@uima.apache.org by "Peter Klügl (JIRA)" <de...@uima.apache.org> on 2013/02/22 11:18:12 UTC

[jira] [Commented] (UIMA-2536) TextMarker annotator for html to plain text conversion

    [ https://issues.apache.org/jira/browse/UIMA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584140#comment-13584140 ] 

Peter Klügl commented on UIMA-2536:
-----------------------------------

Applied the patch with some small modifications: removed the author tag and replaced String.isEmpty(), which is not available in Java 1.5.
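
For illustration only, here is a minimal sketch of one common way to do an empty-string check without String.isEmpty(), which was only added in Java 6. The class name Java5Strings and the helper isEmpty are hypothetical; the comment does not say which replacement was actually committed.

```java
/** Hypothetical helper, not taken from UIMA-2536.patch. */
public final class Java5Strings {

    private Java5Strings() {
        // static helper only, no instances
    }

    /**
     * Returns true when the string has no characters.
     * Uses String.length(), which exists in Java 1.5.
     * Like String.isEmpty(), this throws NullPointerException for null input.
     */
    public static boolean isEmpty(String s) {
        return s.length() == 0;
    }
}
```

A call such as text.isEmpty() would then become Java5Strings.isEmpty(text), or simply text.length() == 0 inline.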
                
> TextMarker annotator for html to plain text conversion
> ------------------------------------------------------
>
>                 Key: UIMA-2536
>                 URL: https://issues.apache.org/jira/browse/UIMA-2536
>             Project: UIMA
>          Issue Type: New Feature
>          Components: TextMarker
>    Affects Versions: 2.0.0TextMarker
>            Reporter: Peter Klügl
>            Assignee: Peter Klügl
>         Attachments: UIMA-2536.patch
>
>
> The broken conversion functionality was removed in UIMA-2524. Add an additional analysis engine that strips HTML tags but retains every annotation that still satisfies begin < end after its offsets have been adapted.
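
The behaviour described in the issue can be sketched roughly as follows. This is an assumption-laden illustration, not the code from UIMA-2536.patch: the class HtmlStripSketch, its Result type, and the strip and remap methods are all hypothetical, annotations are modelled as plain {begin, end} int pairs rather than UIMA annotations, and every '<' is naively treated as the start of a tag.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of tag stripping with offset adaptation. */
public class HtmlStripSketch {

    /** Plain text plus a map from each original offset to its new offset. */
    public static final class Result {
        public final String text;
        public final int[] offsetMap; // length is original length + 1

        Result(String text, int[] offsetMap) {
            this.text = text;
            this.offsetMap = offsetMap;
        }
    }

    /** Drops everything between '<' and '>' and records where each offset lands. */
    public static Result strip(String html) {
        StringBuilder out = new StringBuilder();
        int[] map = new int[html.length() + 1];
        boolean inTag = false;
        for (int i = 0; i < html.length(); i++) {
            map[i] = out.length(); // new offset of original position i
            char c = html.charAt(i);
            if (c == '<') {
                inTag = true;
            } else if (c == '>' && inTag) {
                inTag = false;
            } else if (!inTag) {
                out.append(c);
            }
        }
        map[html.length()] = out.length(); // end offset of the document
        return new Result(out.toString(), map);
    }

    /** Adapts {begin, end} pairs and keeps only those with begin < end. */
    public static List<int[]> remap(List<int[]> spans, int[] map) {
        List<int[]> kept = new ArrayList<int[]>();
        for (int[] span : spans) {
            int begin = map[span[0]];
            int end = map[span[1]];
            if (begin < end) {
                kept.add(new int[] {begin, end});
            }
        }
        return kept;
    }
}
```

In this sketch, an annotation that covered only a tag collapses to begin == end and is dropped, while an annotation that covered a tag plus text shrinks to the text alone.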
