Posted to issues@camel.apache.org by "Sergey Sidashov (JIRA)" <ji...@apache.org> on 2015/06/02 07:03:17 UTC

[jira] [Commented] (CAMEL-8356) IOConverter.toInputStream(file, charset) returns strange behaving stream

    [ https://issues.apache.org/jira/browse/CAMEL-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568526#comment-14568526 ] 

Sergey Sidashov commented on CAMEL-8356:
----------------------------------------

It seems the encoding problem with IOConverter still exists. I am trying to load a text file in cp1251 encoding using the file component (for example, uri=file:C:\addr\in\?charset=cp1251). I then wrote a bean with this method:

public static String convertStreamToString(InputStream inputStream) throws IOException {
    if (inputStream == null) {
        return null;
    }
    StringBuilder sb = new StringBuilder(2048); // initial capacity; adjust if the expected size is known
    char[] read = new char[128]; // read buffer
    // Decode the bytes as cp1251; try-with-resources closes the reader and the underlying stream.
    // IOException is allowed to propagate so that a read failure is not mistaken for end of file.
    try (InputStreamReader ir = new InputStreamReader(inputStream, "cp1251")) {
        int n;
        while ((n = ir.read(read)) != -1) {
            sb.append(read, 0, n);
        }
    }
    return sb.toString();
}
I use this method to test the conversion from File to InputStream. For some files the stream reads all of the content, but for others it truncates the content. Reading appears to end at certain characters (for example, in cp1251 encoding, it ends at the characters 'яя'). Camel version 2.15.2, Java version 1.8.0_45.
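
For comparison, here is a minimal sketch of a baseline read that bypasses IOConverter entirely and decodes the file through a plain InputStreamReader. The class name BaselineCharsetRead, the helper readAll and the sample content are hypothetical and exist only for illustration. The sketch also assumes the JDK provides the windows-1251 charset (cp1251 is an alias for it). In that charset 'я' is the single byte 0xFF, so the sample file deliberately contains that byte twice.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class BaselineCharsetRead {

    // Hypothetical helper: decodes the whole file with the given charset,
    // reading straight from the file system with no Camel converter involved.
    public static String readAll(Path file, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[128];
        try (InputStream in = Files.newInputStream(file);
             Reader reader = new InputStreamReader(in, charset)) {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the running JDK supports windows-1251.
        Charset cp1251 = Charset.forName("windows-1251");
        Path file = Files.createTempFile("cp1251-sample", ".txt");
        try {
            // \u044F is 'я', which windows-1251 encodes as the byte 0xFF.
            String expected = "start \u044F\u044F end";
            Files.write(file, expected.getBytes(cp1251));
            String actual = readAll(file, cp1251);
            System.out.println(expected.equals(actual) ? "complete" : "truncated or mis-decoded");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Running the same file through the Camel route and through this baseline, then comparing the two strings, should show whether the truncation comes from the converter's stream or from the file itself.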

> IOConverter.toInputStream(file, charset) returns strange behaving stream
> ------------------------------------------------------------------------
>
>                 Key: CAMEL-8356
>                 URL: https://issues.apache.org/jira/browse/CAMEL-8356
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-core
>    Affects Versions: 2.14.1, 2.15.0
>            Reporter: Stefan Mandel
>            Assignee: Willem Jiang
>             Fix For: 2.14.2, 2.15.0
>
>         Attachments: CAMEL8356-repaired-Test-and-adjusted-converter-imple.patch, IOConverterCharsetTest.java, german.iso-8859-1.txt, german.utf-8.txt
>
>
> Calling IOConverter.toInputStream with either UTF-8 or ISO-8859-1 returns a stream that behaves strangely on non-ASCII characters:
> - wrapping this stream in an InputStreamReader returns incorrectly encoded characters
> - a naive new BufferedReader(new InputStreamReader(new FileInputStream(file), charset)) returns the correctly encoded characters.
> I will attach some unit tests for this case.


