Posted to java-dev@axis.apache.org by Cédric Chabanois <CC...@natsystem.fr> on 2003/11/26 16:53:04 UTC

RE: bug #24896 : I don't understand what we are doing in AbstractXMLEncoder

I tried EchoHeaders using the following string: "Une chaîne avec des
caractères accentués et du japonais : \u4eca\u65e5\u306f\u4e16\u754c" (which
means "A string with accented characters and Japanese: hello world", with
"hello world" written in Japanese).
The string in the request and the response (captured using TcpTrace) was "Une
chaÃ®ne avec des caractÃ¨res accentuÃ©s et du japonais : ä»Sæ-¥ã¯ä¸-ç*OE"
I pasted this as UTF-8 into a Unicode editor (Unipad) and got my French
string followed by the Japanese characters, as I expected.
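
To illustrate why the capture looks garbled but is still correct, here is a
minimal sketch. It assumes the capture tool displays the raw bytes as
ISO-8859-1 (an assumption; TcpTrace may use another single-byte charset), and
the class and method names are hypothetical, not part of Axis.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode the text as UTF-8, then decode those bytes with the wrong
    // (single-byte) charset, as a raw capture display might do.
    public static String misdecode(String text, Charset wrong) {
        return new String(text.getBytes(StandardCharsets.UTF_8), wrong);
    }

    // Undo the wrong decoding: recover the raw bytes, then read them as UTF-8.
    public static String repair(String garbled, Charset wrong) {
        return new String(garbled.getBytes(wrong), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "chaîne" followed by the Japanese characters, written with escapes
        // so the source file encoding does not matter.
        String original = "cha\u00eene \u4eca\u65e5\u306f\u4e16\u754c";
        String garbled = misdecode(original, StandardCharsets.ISO_8859_1);
        System.out.println(garbled);
        System.out.println(repair(garbled, StandardCharsets.ISO_8859_1).equals(original));
    }
}
```

ISO-8859-1 is used here because it maps every byte value to a character, so
the round trip is lossless; with other charsets some bytes may be replaced
and the repair step can fail.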

Not a complete test but it seems to work.
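
For reference, the steps described in my earlier message (quoted below) can be
sketched roughly as follows. This is not the actual Axis code: EscapeSketch
and its methods are hypothetical names, and it only assumes the four
replacements listed in that message.

```java
import java.nio.charset.StandardCharsets;

public class EscapeSketch {
    // Escape the four special characters directly on the String.
    public static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Same escaping, followed by a round trip through a UTF-8 byte array.
    // Decoding with any charset other than UTF-8 would corrupt
    // non-ASCII characters at this point.
    public static String viaUtf8Bytes(String s) {
        byte[] bytes = escape(s).getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

For well-formed strings both methods return the same result, which is why the
byte array step looks redundant to me.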

Cédric


> -----Original Message-----
> From: Davanum Srinivas [mailto:dims@yahoo.com]
> Sent: Wednesday, November 26, 2003 16:52
> To: axis-dev@ws.apache.org
> Subject: Re: bug #24896 : I don't understand what we are doing in
> AbstractXMLEncoder
> 
> 
> That's what I'm trying to figure out as well :) Right now, I'm
> writing more test cases against
> EchoHeaders.jws just to be sure we don't break anything.
> 
> -- dims
> 
> --- Cédric_Chabanois <CC...@natsystem.fr> wrote:
> > Hi all,
> > 
> > My correction for bug #24896 worked, i.e. the XML sent is now in UTF-8
> > (before, French accents, Chinese characters, etc. were not transmitted
> > correctly), but I don't really understand what we are doing in
> > AbstractXMLEncoder and UTF8Encoder:
> > The encode method takes a Java String.
> > This string is converted to a byte array in UTF-8 (using
> > String.getBytes("UTF-8")) and
> > & becomes "&amp;"
> > " becomes "&quot;"
> > < becomes "&lt;"
> > > becomes "&gt;"
> > All other characters are encoded using UTF-8 (appendEncoded method in
> > UTF8Encoder).
> > 
> > Then the bytes are converted back to a string (using the UTF-8 charset
> > since my patch; before my patch the platform's default charset was used,
> > and the bytes were not valid for that charset).
> > 
> > I wonder why we use a UTF-8 byte array there just to convert it back to a
> > string afterwards, since all we do is escape a few characters
> > (& -> &amp; ...).
> > 
> > There is probably something I missed somewhere ...
> > 
> > Cédric
> 
> 
> =====
> Davanum Srinivas - http://webservices.apache.org/~dims/
>