Posted to java-dev@axis.apache.org by Ias <ia...@tmax.co.kr> on 2004/03/02 04:17:26 UTC

[PROPOSAL] synchronizing character encoding between request and response

Hi all,

I'd like to propose simple changes to AxisServlet, Message and
SerializationContextImpl as the subject of this message says. Let me
introduce an example. Currently, Axis returns a SOAP message based on UTF-8
character encoding unless you customize the default differently. If you
send a request message like

<?xml encoding="utf-16"?>
...

to a service deployed to Axis, you can get

<?xml encoding="utf-8"?>
...

as the response of the request.

This mechanism is OK, even with regard to WS-I BP 1.0, because there's no
requirement for "giving the same character encoding back" in the profile.
However, users naturally expect the character encoding of a response to be
the same as that of its corresponding request, since the request results in
the response.
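
As a rough illustration of the XML-level half of this idea (not the actual
patch to Message or SerializationContextImpl), the Java sketch below reads the
encoding declared in a request's XML declaration and reuses it for the
response prolog. The class and method names are hypothetical. It also assumes
the request has already been decoded into a String; real code would first
have to detect the encoding from the raw bytes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class XmlEncodingMirror {
    // Matches the encoding pseudo-attribute of an XML declaration at the very
    // start of the document, with or without a preceding version attribute.
    private static final Pattern ENCODING = Pattern.compile(
            "^<\\?xml\\s[^>]*?encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Returns the declared encoding, or fallback when none is declared. */
    static String declaredEncoding(String xml, String fallback) {
        Matcher m = ENCODING.matcher(xml);
        return m.find() ? m.group(1) : fallback;
    }

    /** Builds a response prolog that mirrors the request's declared encoding. */
    static String responseProlog(String requestXml) {
        return "<?xml version=\"1.0\" encoding=\""
                + declaredEncoding(requestXml, "UTF-8") + "\"?>";
    }

    public static void main(String[] args) {
        String request = "<?xml version=\"1.0\" encoding=\"utf-16\"?><Envelope/>";
        System.out.println(responseProlog(request));
    }
}
```

When the request carries no declaration, the sketch falls back to UTF-8,
which matches the current default described above.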

I hope this "synchronizing character encoding" will become the default
behavior of Axis at the SOAP level, including the character encoding
specified by the "Content-Type" HTTP header.

Finally, these changes passed "ant clean all-tests".

Looking forward to your opinions and comments,

Ias

=========================================================
Lee, Changshin (Korean name)
Ias (International name)
               Company Web Site: http://www.tmax.co.kr
               Personal Web Site: http://www.iasandcb.pe.kr
---------------------------------------------------------
JSR 201, 204, 222 and 224 Expert Group Member
Apache Web Services Project Member
R&D Center
Tmax Soft, Inc.
=========================================================


RE: [PROPOSAL] synchronizing character encoding between request and response

Posted by Richard Martin <rm...@essex.ac.uk>.
If I may weigh in, I don't agree with Simon's suggestion to use HTTP
headers. I believe that doing so would place (yet) another dependency on
using HTTP as the transport mechanism for SOAP messages. By implementing a
natural mirroring system as Ias suggests, character encoding issues become
much simpler for non-HTTP environments. 

I'm amazed that there isn't any mention of a mechanism in the WS-x
specifications to request a specific character encoding of the return
message.

Regards, 

Richard

-----Original Message-----
From: Simon Fell [mailto:soap@zaks.demon.co.uk] 
Sent: 02 March 2004 06:01
To: axis-dev@ws.apache.org
Subject: Re: [PROPOSAL] synchronizing character encoding between request and
response

you misunderstand, the Accept-Charset header indicates what charset
the client will accept for the response. It has nothing to do with the
content type of the request. This is an HTTP level thing that seems to
map exactly to what you are looking for, why re-invent the wheel?

Cheers
Simon





RE: [PROPOSAL] synchronizing character encoding between request and response

Posted by Davanum Srinivas <di...@yahoo.com>.
Ias,

Please go ahead and commit the changes. That will give folks a bit more to
think about and discuss :)

-- dims



=====
Davanum Srinivas - http://webservices.apache.org/~dims/

RE: [PROPOSAL] synchronizing character encoding between request and response

Posted by Ias <ia...@tmax.co.kr>.
> you misunderstand, the Accept-Charset header indicates what 
> charset the client will accept for the response. It has 
> nothing to do with the content type of the request. 

I fully understand the header and didn't say that it had anything to do with
the content type of the request. (Rather, I told you that this proposal was
not about (the header's meaning of) accepting character sets.)

> This is a 
> HTTP level thing that seems to map exactly to what you are 
> looking for, why re-invent the wheel ?

Let me explain more specifically. This proposal is essentially designed to
work at "SOAP (XML) level" and hence not to depend on HTTP. I changed
org.apache.axis.Message and
org.apache.axis.encoding.SerializationContextImpl for the purpose. As a
result, abstractly we can get the same character encoding between request
and response in terms of XML document.

request SOAP message: <?xml encoding="x"?>
->
response SOAP message: <?xml encoding="x"?>

However, in order to transfer SOAP messages over HTTP compliantly, we also
need to synchronize the charset of the Content-Type header between request
and response.

request HTTP message: Content-Type: text/xml;charset=x
->
response HTTP message: Content-Type: text/xml;charset=x

This is why I changed AxisServlet. 

Sorry for causing the misunderstanding. Again, the "XML level" comes first and
the "HTTP level" follows.
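
To make the HTTP-level step concrete, here is a minimal Java sketch of
copying the charset parameter from a request Content-Type value onto the
response. It is an illustration only, not the change made to AxisServlet: the
class and method names are hypothetical, it works on plain header strings
rather than the servlet API, and the simple split on ";" does not handle
quoted parameter values that themselves contain semicolons.

```java
final class ContentTypeCharset {
    /** Returns the charset parameter of a Content-Type value, or fallback if absent. */
    static String charsetOf(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim();
            if (name.equalsIgnoreCase("charset")) {
                String value = param.substring(eq + 1).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? fallback : value;
            }
        }
        return fallback;
    }

    /** Builds a response Content-Type that mirrors the request's charset. */
    static String responseContentType(String requestContentType) {
        return "text/xml; charset=" + charsetOf(requestContentType, "utf-8");
    }

    public static void main(String[] args) {
        System.out.println(responseContentType("text/xml;charset=utf-16"));
    }
}
```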

Ias



Re: [PROPOSAL] synchronizing character encoding between request and response

Posted by Simon Fell <so...@zaks.demon.co.uk>.
you misunderstand, the Accept-Charset header indicates what charset
the client will accept for the response. It has nothing to do with the
content type of the request. This is an HTTP level thing that seems to
map exactly to what you are looking for, why re-invent the wheel?
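
For readers unfamiliar with the header, the following Java sketch shows one
way a server could pick a response charset from an Accept-Charset value. It
is a simplified illustration, not Axis code: the class and method names are
hypothetical, and it only honours q-values and the "*" wildcard in the most
basic way (for example, it does not let an explicit q=0 entry veto a charset
that "*" would otherwise allow).

```java
import java.util.Arrays;
import java.util.List;

final class AcceptCharset {
    /**
     * Returns the supported charset with the highest q-value in the header,
     * or fallback when the header is missing or names nothing we support.
     */
    static String choose(String header, List<String> supported, String fallback) {
        if (header == null || header.trim().isEmpty()) {
            return fallback;
        }
        String best = null;
        double bestQ = 0.0;
        for (String item : header.split(",")) {
            String[] parts = item.trim().split(";");
            String name = parts[0].trim();
            double q = 1.0; // q defaults to 1 when not given
            for (int i = 1; i < parts.length; i++) {
                String param = parts[i].trim();
                if (param.startsWith("q=")) {
                    try {
                        q = Double.parseDouble(param.substring(2).trim());
                    } catch (NumberFormatException e) {
                        q = 0.0; // treat a malformed q-value as "not acceptable"
                    }
                }
            }
            for (String candidate : supported) {
                boolean matches = name.equals("*") || name.equalsIgnoreCase(candidate);
                if (matches && q > bestQ) {
                    best = candidate;
                    bestQ = q;
                }
            }
        }
        return best != null ? best : fallback;
    }

    public static void main(String[] args) {
        List<String> supported = Arrays.asList("utf-8", "utf-16");
        System.out.println(choose("utf-16;q=0.9, utf-8;q=0.5", supported, "utf-8"));
    }
}
```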

Cheers
Simon

On Tue, 2 Mar 2004 13:59:30 +0900, in soap you wrote:

>> Wouldn't it be better for the server to use the 
>> Accept-Charset HTTP header ?
>
>This proposal is not about accepting character sets. What I intend is
>basically to base a response XML message on the character encoding of its
>corresponding request XML message. If you send the XML (SOAP) message over
>HTTP, it's no surprise that the charset of the Content-Type header and the
>encoding of the XML declaration in the message are the same.
>
>Regards,
>Ias


RE: [PROPOSAL] synchronizing character encoding between request and response

Posted by Ias <ia...@tmax.co.kr>.
> Wouldn't it be better for the server to use the 
> Accept-Charset HTTP header ?

This proposal is not about accepting character sets. What I intend is
basically to base a response XML message on the character encoding of its
corresponding request XML message. If you send the XML (SOAP) message over
HTTP, it's no surprise that the charset of the Content-Type header and the
encoding of the XML declaration in the message are the same.

Regards,
Ias



Re: [PROPOSAL] synchronizing character encoding between request and response

Posted by Simon Fell <so...@zaks.demon.co.uk>.
Wouldn't it be better for the server to use the Accept-Charset HTTP
header?

Cheers
Simon



Re: [PROPOSAL] synchronizing character encoding between request and response

Posted by Steve Loughran <st...@iseran.com>.

+1, provided it is only UTF-8 and UTF-16 that we are switching between.
Something (I think WS-I) says that only these should be supported;
adding more complicates things so much.

Note that while UTF-16 encoded Korean will be much more efficient than
UTF-8, if you send binary stuff back inline in base-64 encoding, the
binary data is 50% less efficient, as each base-64 char now takes up two
bytes. So there may be a need for an endpoint to override the charset
coming in over the wire.
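
The trade-off can be checked with a few lines of Java. This is only a sketch
with a hypothetical class name and made-up sample data: five Hangul syllables
stand in for Korean text, and 30 zero bytes stand in for a binary payload.
UTF-16BE is used so that the byte-order mark is not counted.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class EncodingSizes {
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    static int utf16Bytes(String s) {
        // UTF_16BE writes no byte-order mark, so only the characters are counted.
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Five Hangul syllables, written as escapes to avoid source-encoding issues.
        String korean = "\uC548\uB155\uD558\uC138\uC694";
        // 30 bytes of binary data become 40 ASCII characters in base-64.
        String base64 = Base64.getEncoder().encodeToString(new byte[30]);

        System.out.println("Korean: utf-8=" + utf8Bytes(korean)
                + " utf-16=" + utf16Bytes(korean));   // 15 vs 10 bytes
        System.out.println("Base64: utf-8=" + utf8Bytes(base64)
                + " utf-16=" + utf16Bytes(base64));   // 40 vs 80 bytes
    }
}
```

So the Korean text shrinks by a third in UTF-16, while the base-64 payload
doubles in size.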

Re: [PROPOSAL] synchronizing character encoding between request and response

Posted by Davanum Srinivas <di...@yahoo.com>.
+1 from me. Go for it.

-- dims


=====
Davanum Srinivas - http://webservices.apache.org/~dims/