Posted to users@camel.apache.org by jonathanq <jq...@abebooks.com> on 2011/01/24 22:33:05 UTC
Re: XStream and forcing ISO-8859-1 Encoding
I am sorry to bring this back from the dead. However, I was just trying out
the unmarshal().xstream("ISO-8859-1") method introduced because of this
thread. Unfortunately, it still does not solve the problem (as of Camel
2.5.0).
From non-Camel code, we have been publishing JMS messages and serializing
each message to XML as follows:
    XStream xstream = new XStream(new DomDriver("ISO-8859-1"));
    String messageXml = xstream.toXML(someObject);
We then use a ProducerTemplate to publish it to our messaging system.
When we used a route like:
    from(someIncomingEndpoint)
        .unmarshal().xstream("ISO-8859-1")
        .process(myUpdateProcessor);
our processor received a deserialized message, but the content was not
correct: strings that had been serialized as ISO-8859-1 were
deserialized as UTF-8.
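To illustrate the symptom described above, here is a minimal JDK-only sketch (no Camel or XStream involved). The class and method names are hypothetical and chosen only for this example. It assumes the garbling comes from ISO-8859-1 bytes being decoded as UTF-8, which is what the description suggests but which has not been confirmed against the Camel code path.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows what happens when ISO-8859-1 bytes
// are decoded with the wrong charset (UTF-8).
class EncodingMismatchDemo {

    /** Encodes text as ISO-8859-1, then (incorrectly) decodes the bytes as UTF-8. */
    static String decodeLatin1BytesAsUtf8(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        // Bytes >= 0x80 are not valid stand-alone UTF-8, so the decoder
        // substitutes the replacement character U+FFFD for them.
        return new String(latin1, StandardCharsets.UTF_8);
    }

    /** Encodes and decodes with the same charset, so the text survives intact. */
    static String roundTripLatin1(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an e-acute
        System.out.println(roundTripLatin1(original).equals(original));         // true
        System.out.println(decodeLatin1BytesAsUtf8(original).equals(original)); // false
    }
}
```

The point of the sketch is only that the decoding charset has to match the encoding charset; whichever component turns bytes into characters is where the mismatch would have to be fixed.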
I modified our route to introduce a new Processor (instead of the in-line
unmarshal) that did the following:
    String messageBody = exchange.getIn().getBody(String.class);
    XStream xstream = new XStream(new DomDriver("ISO-8859-1"));
    Object myObject = xstream.fromXML(messageBody);
    exchange.getIn().setBody(myObject);
This works fine: the text our processor receives is correct ISO-8859-1 and
nothing is garbled.
I set a breakpoint and stepped through the Camel code with the in-line
unmarshal. It does pass down the encoding specified (ISO-8859-1). However,
it constructs the XStream object using the default XppDriver (on which you
can't specify an encoding).
According to the XStream documentation - the XppDriver (and others not
including DomDriver) rely on the underlying InputStream/OutputStream passed
to the XStream object to determine the encoding.
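As a rough illustration of what "relying on the underlying stream" means, the sketch below uses only the JDK's built-in StAX parser (javax.xml.stream), not XStream or Camel. It wraps the InputStream in an InputStreamReader with an explicit charset, so the parser receives already-decoded characters and never has to guess the encoding. The class name, method name, and sample document are assumptions made for this example, and whether Camel's XStream data format could be fixed the same way is not established here.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: decodes the byte stream with a caller-supplied charset
// before handing it to a StAX parser.
class ExplicitCharsetXmlRead {

    /** Returns the concatenated character data of the XML document in the stream. */
    static String readText(InputStream in, Charset charset) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // not needed for this demo
        Reader reader = new InputStreamReader(in, charset);      // decoding happens here
        StringBuilder text = new StringBuilder();
        try {
            XMLStreamReader xml = factory.createXMLStreamReader(reader);
            try {
                while (xml.hasNext()) {
                    if (xml.next() == XMLStreamConstants.CHARACTERS) {
                        text.append(xml.getText());
                    }
                }
            } finally {
                xml.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // A document with no XML declaration, stored as ISO-8859-1 bytes.
        byte[] latin1Xml = "<msg>caf\u00e9</msg>".getBytes(StandardCharsets.ISO_8859_1);
        String text = readText(new ByteArrayInputStream(latin1Xml), StandardCharsets.ISO_8859_1);
        System.out.println(text.equals("caf\u00e9")); // true
    }
}
```

Passing a Reader rather than a raw InputStream is a deliberate choice in this sketch: the charset decision is made once, by the caller, instead of being left to the parser's auto-detection.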
I found this method in AbstractXStreamWrapper.java:
    public Object unmarshal(Exchange exchange, InputStream stream) throws Exception {
        HierarchicalStreamReader reader = createHierarchicalStreamReader(exchange, stream);
        try {
            return getXStream(exchange.getContext().getClassResolver()).unmarshal(reader);
        } finally {
            reader.close();
        }
    }
The "HierarchicalStreamReader" that is created is of type:
com.thoughtworks.xstream.io.xml.StaxReader
When I stepped into the "unmarshal" method of the XStream class, I saw that
the reader passed in (the same StaxReader) has a property called "in" that
was of type: com.ctc.wstx.sr.ValidatingStreamReader
This, in turn, had 2 properties:
mDocInputEncoding = {java.lang.String@4784}"ISO-8859-1"
mDocXmlEncoding = {java.lang.String@4785}"UTF-8"
I can't say for certain that this is why the text is coming out as UTF-8, but
it does seem suspicious that although mDocInputEncoding is set to ISO-8859-1,
mDocXmlEncoding is still "UTF-8".
In any event - for our own purposes we have created 2 Processor classes to
serialize/deserialize our XML. We can't rely on the unmarshal/marshal
methods when it comes to encoding and our XML.
Just wanted to pass along the news that the fix doesn't seem to have solved
the problem.
Re: XStream and forcing ISO-8859-1 Encoding
Posted by Claus Ibsen <cl...@gmail.com>.
You can open a ticket in JIRA
http://camel.apache.org/support.html
If possible, a test case which demonstrates your issue is a great
start. That can be used to track down the issue and help solve it.
You are welcome to dig into the source code and provide a patch.
--
Claus Ibsen
-----------------
FuseSource
Email: cibsen@fusesource.com
Web: http://fusesource.com
Twitter: davsclaus
Blog: http://davsclaus.blogspot.com/
Author of Camel in Action: http://www.manning.com/ibsen/