Posted to users@cxf.apache.org by Brad Harper <br...@gmail.com> on 2007/09/19 12:55:05 UTC

UTF-8 vs. utf-8 causes different response..

We're testing with different SOAP clients and have narrowed down a
discrepancy in our responses to whether the request sends "charset=utf-8" or
"charset=UTF-8".  The native PHP SOAP client sends the former, and the
namespacing on the child nodes is missing when the lowercase charset is
passed in.  Any ideas?

A semi-hacky solution is to add a toUpperCase() call on the character
encoding in ServletController:228...

            inMessage.put(Message.ENCODING, enc.toUpperCase());

Figured you guys could tell me if this could cause any adverse results...


Regards,
Brad
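
On the question of adverse results, one thing worth checking is that the
no-argument toUpperCase() uses the JVM's default locale. The following is a
minimal Java sketch and assumes nothing about CXF itself; the class
EncodingCase and its upper() helper are hypothetical names used only for
illustration. Under a Turkish locale a lowercase "i" upper-cases to U+0130,
so a label such as "iso-8859-1" would no longer equal "ISO-8859-1", while
"utf-8" is unaffected because it contains no "i".

```java
import java.util.Locale;

// Hypothetical helper, not part of CXF.
final class EncodingCase {
    private EncodingCase() { }

    /** Upper-cases a charset label without depending on the default locale. */
    static String upper(String enc) {
        return enc == null ? null : enc.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Turkish rules map 'i' to U+0130 (dotted capital I), not to 'I'.
        String turkishUpper = "iso-8859-1".toUpperCase(turkish);
        System.out.println(turkishUpper.equals("ISO-8859-1")); // false

        // Locale.ROOT gives the plain ASCII mapping.
        System.out.println(upper("iso-8859-1").equals("ISO-8859-1")); // true
        System.out.println(upper("utf-8")); // UTF-8
    }
}
```

Passing Locale.ROOT keeps the conversion independent of the server's
default locale.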

Re: UTF-8 vs. utf-8 causes different response..

Posted by Daniel Kulp <dk...@apache.org>.
It's probably better to make the "encoding" changes wherever we create the
XML parser, probably in "StaxInInterceptor.java".  That way, if some other
transport is used and the same encoding issue happens there, the same fix
would apply.

Dan
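
A rough sketch of that idea, using only the standard javax.xml.stream and
java.nio.charset APIs rather than CXF's actual StaxInInterceptor code: the
encoding label is normalized at the point where the XMLStreamReader is
created, so every transport goes through the same step. The class
ParserEncodingSketch and its methods are hypothetical names, not CXF classes.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch, not CXF code.
final class ParserEncodingSketch {
    private ParserEncodingSketch() { }

    /** Creates a StAX reader, normalizing the encoding label first. */
    static XMLStreamReader createReader(InputStream in, String enc)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        if (enc == null) {
            // No label supplied: let the parser detect the encoding itself.
            return factory.createXMLStreamReader(in);
        }
        String name;
        try {
            // forName() ignores case; name() returns the canonical form.
            name = Charset.forName(enc).name();
        } catch (IllegalArgumentException e) {
            // Illegal or unsupported label: pass it through unchanged.
            name = enc;
        }
        return factory.createXMLStreamReader(in, name);
    }

    /** Returns the namespace URI of the root element of the given XML. */
    static String rootNamespace(String xml, String enc) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            XMLStreamReader reader =
                createReader(new ByteArrayInputStream(bytes), enc);
            reader.nextTag(); // advance to the root start element
            String ns = reader.getNamespaceURI();
            reader.close();
            return ns;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a:root xmlns:a=\"urn:example\"><a:child/></a:root>";
        System.out.println(rootNamespace(xml, "utf-8"));
        System.out.println(rootNamespace(xml, "UTF-8"));
    }
}
```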





-- 
J. Daniel Kulp
Principal Engineer
IONA
P: 781-902-8727    C: 508-380-7194
daniel.kulp@iona.com
http://www.dankulp.com/blog

Re: UTF-8 vs. utf-8 causes different response..

Posted by Daniel Kulp <dk...@apache.org>.
Brad,

I've made some changes to make sure we try to map any charset values that
are passed in to the canonical forms the parser may expect.  As a result, in
your case the ENCODING gets set to UTF-8 on the message instead of utf-8.
Hopefully that solves the problem.

I'll try to deploy a new snapshot sometime this weekend (hopefully tonight,
actually).

Dan
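
One way such a mapping could look, as a sketch only: this is not the change
that was committed to CXF, and CharsetNames and canonical() are hypothetical
names. It relies on java.nio.charset.Charset.forName(), which matches names
case-insensitively and resolves aliases, and on name(), which returns the
canonical form. Stripping surrounding quotes is an extra assumption, added
because a Content-Type parameter value may be quoted.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not the CXF implementation.
final class CharsetNames {
    private CharsetNames() { }

    /** Maps a charset label such as "utf-8" to its canonical JVM name. */
    static String canonical(String enc) {
        if (enc == null) {
            return null;
        }
        String label = enc.trim();
        // Assumption: tolerate a quoted value, as in charset="utf-8".
        if (label.length() >= 2
                && label.startsWith("\"") && label.endsWith("\"")) {
            label = label.substring(1, label.length() - 1);
        }
        try {
            return Charset.forName(label).name();
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported names: leave the label
            // as it was and let the parser decide what to do with it.
            return label;
        }
    }

    public static void main(String[] args) {
        System.out.println(canonical("utf-8")); // UTF-8
        System.out.println(canonical("UTF8"));  // UTF-8
    }
}
```

Unlike a plain upper-casing, this also handles aliases, and unknown labels
pass through unchanged.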


On Friday 21 September 2007, Daniel Kulp wrote:
> Brad,
>
> Are you sure you are using woodstox 3.2.1 as the Stax parser?   I
> traced through the woodstox code and the first thing it does is
> convert utf-8 to UTF-8.   Thus, I'm not sure how it could possibly be
> causing an issue.   Still digging though....
>
> Dan




Re: UTF-8 vs. utf-8 causes different response..

Posted by Daniel Kulp <dk...@apache.org>.
Brad,

Are you sure you are using Woodstox 3.2.1 as the StAX parser?  I traced
through the Woodstox code, and the first thing it does is convert utf-8
to UTF-8, so I'm not sure how it could possibly be causing an issue.
Still digging, though...

Dan





