Posted to users@tomcat.apache.org by Jerry Miernik <jm...@cisco.com> on 2005/02/25 20:05:34 UTC
Parse/read an XML doc with encoding Shift_JIS adds characters (2nd email)
I am using a GetDictionary.jsp script to parse an XML document
containing simple Japanese phrases encoded in Shift_JIS.
What I observe is that after a phrase is read into a
java.util.HashMap bean, it becomes longer and modified
(really, corrupted). When this corrupted phrase is sent in an HTTP
reply, it is modified again inside the packet.
Example:
1. the original phrase (in binary), from the XML doc:
be dd c0 b8 bc c3 b8 c0 de bb b2 2e 2e 2e
2. the phrase (1) logged from a HashMap object:
efbf bd efbf bd 38 efbf bd c3 b8 efbf bd de bb efbf bd 2e 2e 2e
3. the phrase (2) in an HTTP reply:
21 29 21 29 38 21 29 21 29 21 29 21 29 21 29 2e 2e 2e
It appears that two transformations are taking place, both outside
the control of GetDictionary.jsp. Interestingly, the last three
characters 2e 2e 2e (= '...') are preserved all the way through.
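
For reference, the byte pattern ef bf bd seen in step 2 is the UTF-8 encoding of U+FFFD, the Unicode replacement character that Java decoders substitute for bytes they cannot map. The sketch below decodes the step 1 bytes once as Shift_JIS and once with a mismatched charset. UTF-8 is only an assumption chosen for illustration (the post does not say which charset was actually applied), and the class and method names are hypothetical. The number of replacement characters a particular decoder emits can differ from the log in step 2.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of GetDictionary.jsp.
final class ShiftJisDecodeDemo {
    // The phrase bytes quoted in step 1 of the example.
    static final byte[] ORIGINAL = hex("be dd c0 b8 bc c3 b8 c0 de bb b2 2e 2e 2e");

    // Parse a space-separated hex dump into a byte array.
    static byte[] hex(String dump) {
        String[] parts = dump.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    // Format bytes as a space-separated lowercase hex dump.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset sjis = Charset.forName("Shift_JIS");

        // Matching charset: decoding and re-encoding keeps the bytes intact.
        String good = new String(ORIGINAL, sjis);
        System.out.println("Shift_JIS round trip: " + toHex(good.getBytes(sjis)));

        // Mismatched charset (UTF-8 assumed here): unmappable bytes turn
        // into U+FFFD, which shows up as ef bf bd when written as UTF-8.
        String bad = new String(ORIGINAL, StandardCharsets.UTF_8);
        System.out.println("Decoded as UTF-8:     "
                + toHex(bad.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The trailing 2e 2e 2e survives either way because '.' is plain ASCII, which both charsets map to the same single byte.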
Another script, Send.jsp, which sends the reply, has the following
page directives:
<%@ page pageEncoding="Shift_JIS"%>
<%@ page contentType="text/xml; charset=Shift_JIS"%>
The situation is just as bad if Send.jsp contains only one
directive:
<%@ page contentType="text/xml; charset=Shift_JIS"%>
Does anyone have a suggestion as to what I am doing wrong?
In particular, why is there a 'transformation' from the XML doc
to the HashMap?
And why is there a second 'transformation' into the packet?
Thanks,
Jerry.
[By the way, the same GetDictionary.jsp works fine if the
parsed/read XML document is encoded in UTF-8, ISO-8859-1, or
ASCII. The phrase in the HashMap object is then byte-for-byte
identical to the original phrase from the XML doc, and it stays the
same in the reply packet (sent by a script that declares the
corresponding charset).]
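
One way to narrow down where the first transformation happens is to test the XML parser on its own, outside any JSP. The sketch below builds a small in-memory document that declares encoding="Shift_JIS" in its XML declaration, parses it from a byte stream with the standard JAXP DOM API, and compares the result with the phrase decoded directly as Shift_JIS. The element names dict and phrase and the class name are hypothetical, and the layout of the real dictionary file is not known from the post. The sketch also assumes the parser in use supports Shift_JIS.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical check class, not part of GetDictionary.jsp.
final class ShiftJisXmlParseCheck {
    // The phrase bytes quoted in step 1 of the example.
    static final byte[] PHRASE = {
        (byte) 0xbe, (byte) 0xdd, (byte) 0xc0, (byte) 0xb8, (byte) 0xbc,
        (byte) 0xc3, (byte) 0xb8, (byte) 0xc0, (byte) 0xde, (byte) 0xbb,
        (byte) 0xb2, 0x2e, 0x2e, 0x2e
    };

    // Build a document whose markup is ASCII and whose text is raw Shift_JIS.
    static byte[] buildDocument() {
        byte[] head = "<?xml version=\"1.0\" encoding=\"Shift_JIS\"?><dict><phrase>"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] tail = "</phrase></dict>".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(head, 0, head.length);
        out.write(PHRASE, 0, PHRASE.length);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    // Parse from bytes so the parser picks the charset from the XML declaration.
    static String parsePhrase(byte[] xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        return doc.getElementsByTagName("phrase").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String parsed = parsePhrase(buildDocument());
        String expected = new String(PHRASE, Charset.forName("Shift_JIS"));
        System.out.println("parser decoded phrase correctly: " + parsed.equals(expected));
    }
}
```

If this check passes but the phrase logged from the HashMap is still corrupted, the charset mismatch is more likely in how the document reaches the parser (for example through a Reader opened with a different charset) or in how the value is logged, rather than in the parser itself.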
---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tomcat-user-help@jakarta.apache.org
---------------------------------------------------------------------