Posted to dev@uima.apache.org by "Martin Toepfer (JIRA)" <de...@uima.apache.org> on 2013/02/21 13:10:13 UTC

[jira] [Updated] (UIMA-2536) TextMarker annotator for html to plain text conversion

     [ https://issues.apache.org/jira/browse/UIMA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Toepfer updated UIMA-2536:
---------------------------------

    Attachment: UIMA-2536.patch

Configurable Html Converter AE with tests and documentation.
                
> TextMarker annotator for html to plain text conversion
> ------------------------------------------------------
>
>                 Key: UIMA-2536
>                 URL: https://issues.apache.org/jira/browse/UIMA-2536
>             Project: UIMA
>          Issue Type: New Feature
>          Components: TextMarker
>    Affects Versions: 2.0.0TextMarker
>            Reporter: Peter Klügl
>            Assignee: Peter Klügl
>         Attachments: UIMA-2536.patch
>
>
> The broken conversion functionality was removed in UIMA-2524. Add an additional analysis engine that strips HTML tags but retains every annotation that still satisfies begin < end after its offsets have been adapted to the converted text.
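
The requested behaviour can be illustrated with a small sketch. It is not the code in the attached patch and does not use the UIMA or TextMarker APIs; the function name, the (begin, end, label) tuple representation of annotations, and the regex-based tag matching are all assumptions made for illustration only. It removes tags, maps each original offset to its position in the plain text, and keeps only annotations whose adapted offsets still satisfy begin < end.

```python
import re

# Assumption: a deliberately naive tag pattern. Real HTML (comments,
# scripts, entities) would need a proper parser.
TAG = re.compile(r"<[^>]*>")


def strip_tags_with_offsets(html, annotations):
    """Return (plain_text, adapted_annotations).

    annotations is a list of hypothetical (begin, end, label) tuples
    whose offsets refer to positions in the html string.
    """
    plain_parts = []
    # offset_map[i] is the plain-text position for html offset i.
    offset_map = [0] * (len(html) + 1)
    pos = 0  # current position in html
    out = 0  # current length of the plain text

    for match in TAG.finditer(html):
        # Visible characters before the tag advance the output position.
        for i in range(pos, match.start()):
            offset_map[i] = out
            out += 1
        plain_parts.append(html[pos:match.start()])
        # Every character inside the tag collapses to one output position.
        for i in range(match.start(), match.end()):
            offset_map[i] = out
        pos = match.end()

    # Trailing visible text after the last tag.
    for i in range(pos, len(html)):
        offset_map[i] = out
        out += 1
    plain_parts.append(html[pos:])
    offset_map[len(html)] = out

    adapted = []
    for begin, end, label in annotations:
        new_begin, new_end = offset_map[begin], offset_map[end]
        # Annotations that only covered markup become empty and are dropped.
        if new_begin < new_end:
            adapted.append((new_begin, new_end, label))

    return "".join(plain_parts), adapted


html = "<p>Hello <b>world</b></p>"
annotations = [(0, 25, "Doc"), (12, 17, "Bold"), (9, 12, "TagOnly")]
text, kept = strip_tags_with_offsets(html, annotations)
print(text)
print(kept)
```

In this example the "TagOnly" annotation spans nothing but the <b> tag, so its adapted begin and end coincide and it is discarded, while the other two annotations are kept with shifted offsets.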

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira