Posted to dev@uima.apache.org by "Peter Klügl (JIRA)" <de...@uima.apache.org> on 2012/07/09 15:10:34 UTC

[jira] [Commented] (UIMA-2359) Different results of Text Maker in windows and unix

    [ https://issues.apache.org/jira/browse/UIMA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409428#comment-13409428 ] 

Peter Klügl commented on UIMA-2359:
-----------------------------------

Is there a generic solution for this problem? I would not restrict the functionality to either of the two cases. Should the lexer create only one break? In my applications, I solved this at the rule level, but I am open to any suggestions and improvements.
                
> Different results of Text Maker in windows and unix
> ---------------------------------------------------
>
>                 Key: UIMA-2359
>                 URL: https://issues.apache.org/jira/browse/UIMA-2359
>             Project: UIMA
>          Issue Type: Bug
>          Components: Sandbox, TextMarker
>    Affects Versions: build-resources-2
>         Environment: Windows
>            Reporter: Luca Dini (CELI)
>            Assignee: Peter Klügl
>            Priority: Minor
>              Labels: patch
>
> When called from the workbench, the class AbstractApplyScriptHandlerJob reads the text to be analyzed with the method:
>  org.apache.uima.pear.util.FileUtil.loadTextFile(new File(each), "UTF-8");
> On Windows, this method returns each newline as two newlines. As a result, the basic TextMarker annotations appear like:
> line BREAK BREAK
> line BREAK BREAK
> Grammars written on Windows must therefore take the double break into account, which makes them inapplicable when running on Unix or when using other read methods, such as:
>     Scanner sc = new Scanner(inFile, "UTF-8");
>     String out = "";
>     while (sc.hasNextLine()) {
>         out += sc.nextLine() + "\n";
>     }
> Relates to:
> https://issues.apache.org/jira/browse/UIMA-2133t
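
One possible way to make the input platform-independent before it reaches the lexer is to normalize line breaks right after reading the file. The sketch below assumes that the double BREAK comes from a Windows "\r\n" sequence being treated as two separate break characters. The class and method names (LineBreakNormalizer, normalize) are hypothetical and are not part of UIMA or TextMarker; this is only an illustration, not the fix adopted for this issue.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of UIMA or TextMarker.
final class LineBreakNormalizer {

    // Try the Windows break (\r\n) first, then a lone carriage return (\r),
    // so that "\r\n" is replaced once and not twice.
    private static final Pattern BREAKS = Pattern.compile("\r\n|\r");

    // Returns the text with every line break written as a single "\n".
    static String normalize(String text) {
        return BREAKS.matcher(text).replaceAll("\n");
    }

    public static void main(String[] args) {
        String windowsText = "line\r\nline\r\n";
        String unixText = "line\nline\n";
        // Both inputs should end up identical after normalization.
        System.out.println(normalize(windowsText).equals(normalize(unixText)));
    }
}
```

Applied to the string returned by whichever read method is used, this would give the rules the same single-break input on Windows and Unix. Whether the normalization belongs in the reader, the lexer, or the rules is the open question raised in the comment above.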
