Posted to issues@camel.apache.org by "Sergey Smith (JIRA)" <ji...@apache.org> on 2019/03/27 17:05:00 UTC

[jira] [Created] (CAMEL-13374) XMLTokenExpressionIterator Default Exchange charset overrides original xml encoding from InputStream

Sergey Smith created CAMEL-13374:
------------------------------------

             Summary: XMLTokenExpressionIterator Default Exchange charset overrides original xml encoding from InputStream
                 Key: CAMEL-13374
                 URL: https://issues.apache.org/jira/browse/CAMEL-13374
             Project: Camel
          Issue Type: Bug
          Components: camel-core
    Affects Versions: 2.22.0, 2.18.0
            Reporter: Sergey Smith


The default Exchange charset overrides the encoding declared in the XML document read from the InputStream.

In

org.apache.camel.support.XMLTokenExpressionIterator.doEvaluate(Exchange exchange, boolean closeStream)

_String charset = IOHelper.getCharsetName(exchange);_

must be replaced with

_String charset = IOHelper.getCharsetName(exchange, *false*);_

Then, in the lines

_// woodstox's getLocation().etCharOffset() does not return the offset correctly for InputStream, so use Reader instead._
_this(path, nsmap, mode, group, new *InputStreamReader*(in, charset));_

and

_// woodstox's getLocation().etCharOffset() does not return the offset correctly for InputStream, so use Reader instead._
_this(path, nsmap, mode, 1, new *InputStreamReader*(in, charset));_

use org.apache.commons.io.input.XmlStreamReader instead of a plain InputStreamReader. It correctly determines the encoding from the XML header when one is present.

 

Example document in the InputStream body:

_<?xml version = "1.0" encoding= "ISO-8859-5" standalone="no" ?>_

_<xml/>_

Current _charset_ result: UTF-8 (the *default* from _IOHelper.getCharsetName(exchange)_)

Expected result: _ISO-8859-5_
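
The mismatch can be reproduced outside Camel with a minimal sketch that uses only the JDK. It is not the Camel code and not the proposed patch. The class XmlEncodingDemo, its methods, and the Cyrillic element text are hypothetical names and data introduced for illustration, and the JDK's StAX reader stands in for XmlStreamReader to read the declared encoding. The chooseCharset method only imitates the behaviour requested above: an explicitly set charset wins, otherwise the encoding from the XML header is used, and UTF-8 is the last fallback.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of Camel.
final class XmlEncodingDemo {

    // Hypothetical sample text (Cyrillic), chosen because it is not valid UTF-8
    // once encoded as ISO-8859-5.
    static final String TEXT = "\u043F\u0440\u0438\u0432\u0435\u0442";

    // Builds a document like the one in the report, encoded as ISO-8859-5.
    static byte[] sampleDocument() {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-5\" standalone=\"no\"?>"
                + "<xml>" + TEXT + "</xml>";
        return xml.getBytes(Charset.forName("ISO-8859-5"));
    }

    // Returns the encoding named in the XML declaration, or null if none is declared.
    static String declaredEncoding(byte[] bytes) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new ByteArrayInputStream(bytes));
        try {
            return reader.getCharacterEncodingScheme();
        } finally {
            reader.close();
        }
    }

    // Imitates the requested behaviour (an assumption, not Camel's implementation):
    // explicit charset first, then the XML header, then UTF-8.
    static Charset chooseCharset(String explicitCharset, byte[] bytes) throws XMLStreamException {
        if (explicitCharset != null) {
            return Charset.forName(explicitCharset);
        }
        String declared = declaredEncoding(bytes);
        return declared != null ? Charset.forName(declared) : StandardCharsets.UTF_8;
    }

    // Decodes the bytes the same way the quoted constructors do: through an InputStreamReader.
    static String decode(byte[] bytes, Charset charset) throws IOException {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder out = new StringBuilder();
            char[] buffer = new char[256];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                out.append(buffer, 0, read);
            }
            return out.toString();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] doc = sampleDocument();
        // Forcing UTF-8 (the default charset in the report) corrupts the element text.
        System.out.println("UTF-8 keeps text:    " + decode(doc, StandardCharsets.UTF_8).contains(TEXT));
        // Honouring the XML header preserves it.
        Charset chosen = chooseCharset(null, doc);
        System.out.println("Chosen charset:      " + chosen);
        System.out.println("Chosen keeps text:   " + decode(doc, chosen).contains(TEXT));
    }
}
```

Passing null as the explicit charset plays the role of an Exchange with no charset property set, which is the case where the second argument _false_ would stop the UTF-8 default from being applied.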

 


