Posted to dev@uima.apache.org by "Philip-Daniel Beck (JIRA)" <de...@uima.apache.org> on 2013/12/20 11:24:10 UTC

[jira] [Updated] (UIMA-3512) Add additional engine parameter for Ruta HtmlConverter to configure linebreak replacement.

     [ https://issues.apache.org/jira/browse/UIMA-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip-Daniel Beck updated UIMA-3512:
-------------------------------------

    Attachment: linebreakReplacementEngineParameter.docbook_patch
                linebreakReplacementEngineParameter.core_patch

Patches for this issue, covering the Java code and the DocBook documentation. They add a configuration parameter "linebreakReplacement" to the HtmlConverter engine.

> Add additional engine parameter for Ruta HtmlConverter to configure linebreak replacement.
> ------------------------------------------------------------------------------------------
>
>                 Key: UIMA-3512
>                 URL: https://issues.apache.org/jira/browse/UIMA-3512
>             Project: UIMA
>          Issue Type: Improvement
>          Components: ruta
>    Affects Versions: 2.1.1ruta
>            Reporter: Philip-Daniel Beck
>             Fix For: 2.1.1ruta
>
>         Attachments: linebreakReplacementEngineParameter.core_patch, linebreakReplacementEngineParameter.docbook_patch
>
>
> When converting an HTML file to plain text with the HtmlConverter engine in Ruta, the boolean engine parameter "replaceLinebreaks" decides whether linebreaks in the text are replaced. If set to true, all linebreaks are kept in the document. If set to false, all linebreaks are deleted. As a result, the last word of a line and the first word of the next line are joined without any whitespace in between. It would often be better to replace a linebreak with a whitespace. To configure this, an additional engine parameter that defines the String with which each linebreak is replaced would be useful.
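
To illustrate the requested behavior, here is a minimal Java sketch. It is not taken from the attached patches and does not use the HtmlConverter code; the class name LinebreakReplacementSketch and the method replaceLinebreaks are hypothetical, and only the parameter name "linebreakReplacement" comes from this issue. It assumes a linebreak is one of \r\n, \n or \r.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the HtmlConverter implementation.
final class LinebreakReplacementSketch {

    // Assumption: a linebreak is \r\n (Windows), \n (Unix) or \r (old Mac).
    // \r\n is listed first so it is replaced once, not twice.
    private static final Pattern LINEBREAK = Pattern.compile("\\r\\n|\\n|\\r");

    // Replaces every linebreak in text with the given replacement string.
    static String replaceLinebreaks(String text, String linebreakReplacement) {
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        return LINEBREAK.matcher(text)
                .replaceAll(Matcher.quoteReplacement(linebreakReplacement));
    }

    public static void main(String[] args) {
        String text = "first line\nsecond line";
        // Empty replacement: the words at the line boundary are joined.
        System.out.println(replaceLinebreaks(text, ""));  // first linesecond line
        // Single space: the words stay separated.
        System.out.println(replaceLinebreaks(text, " ")); // first line second line
    }
}
```

With an empty replacement string the two lines run together as described above; with a single space the words at the line boundary stay separated.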


