Posted to soap-user@ws.apache.org by Clive Jordan <cl...@openwave.com> on 2005/06/16 11:49:14 UTC

SOAP 2.3.1 and Internationalization

Hi Guys,

Apologies if you have seen this question many times before...

I am using Apache SOAP 2.3.1 with JDK 1.4 running on AIX 5.2.

Everything is set up and working OK.

If I send a simple XML message into the system, all is fine.

Each message starts:
<?xml version="1.0" encoding="UTF-8" ?>

We are now required to send data that contains non-ASCII characters 
(specifically, European accented letters). These are only in the element 
content, not in the tags themselves.

So I changed the message header to:
<?xml version="1.0" encoding="iso-8859-1" ?>

This works fine.

Now the XML data was changed to include an accented letter (passÜxx). 
When I send this data into the server, I get the following response:

<?xml version='1.0' encoding='UTF-8'?>
<SOAP-ENV:Envelope 
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<SOAP-ENV:Body>
<SOAP-ENV:Fault>
<faultcode>SOAP-ENV:Client</faultcode>
<faultstring>parsing error: org.xml.sax.SAXParseException: XML document 
structures must start and end within the same entity.</faultstring>
<faultactor>/soap/servlet/messagerouter</faultactor>
</SOAP-ENV:Fault>

</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

This is being generated by the SOAP layer before it gets to the web service.

I have written a simple Java program that parses this file using the 
built-in JAXP and that works fine. So I assume that the SOAP layer is 
using an older parser that does not support internationalization?

I cannot find Xerces in any of the Tomcat library directories so I am 
puzzled as to which parser is being used.

Is there a way I can find out what parser is being used, and can I slide 
in a newer one?
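
One way to approach the "which parser" question is to ask JAXP directly. The following is a minimal sketch, assuming Apache SOAP obtains its parser through the standard JAXP factories; that assumption is not confirmed in this thread. The class name WhichParser and its helper methods are made up for illustration. The result depends on the class loader, so the code would need to run inside the same web application (for example from a test servlet or JSP) to reflect what the SOAP layer sees.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper: reports which JAXP implementation is selected
// and where its classes were loaded from.
class WhichParser {

    // Name of the concrete factory class that JAXP picked.
    static String implementationOf(Object factory) {
        return factory.getClass().getName();
    }

    // Location (jar or directory) the factory class came from.
    // Classes bundled with the JRE usually have no code source.
    static String loadedFrom(Object factory) {
        CodeSource src = factory.getClass().getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "bootstrap classes (bundled with the JRE)";
        }
        return src.getLocation().toString();
    }

    static String describe(Object factory) {
        return implementationOf(factory) + " loaded from " + loadedFrom(factory);
    }

    public static void main(String[] args) {
        System.out.println("DOM: " + describe(DocumentBuilderFactory.newInstance()));
        System.out.println("SAX: " + describe(SAXParserFactory.newInstance()));
    }
}
```

If no Xerces jar is on the class path, the reported location would typically point at the JRE itself. JAXP's standard lookup also honors the javax.xml.parsers.DocumentBuilderFactory and javax.xml.parsers.SAXParserFactory system properties, which is one way to select a different implementation once its jar is on the class path.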

Upgrading to Axis is not an option.

Any help gratefully received.
Thanks,
Clive

Re: SOAP 2.3.1 and Internationalization

Posted by Scott Nichol <sn...@scottnichol.com>.
The HTTP Content-Type header must also specify the correct character encoding.  In this case, please be sure that you have

    Content-Type: text/xml; charset=iso-8859-1

in the request that carries the payload.  Apache SOAP uses the charset specified in the Content-Type when converting the stream of bytes in the HTTP request entity to a Java string.
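
To illustrate why the charset matters, here is a small sketch (not Apache SOAP code) of what happens when bytes written as ISO-8859-1 are turned into a Java string under a different charset. The class CharsetMismatchDemo, the element name <password>, and the choice of UTF-8 as the mismatched charset are all assumptions made for the example.

```java
// Hypothetical demo: encode a payload as the client would send it, then
// decode it with the charset the receiver believes applies.
class CharsetMismatchDemo {

    // \u00DC is U-umlaut; the escape avoids depending on the source file encoding.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"iso-8859-1\" ?>"
        + "<password>pass\u00DCxx</password>";

    // Bytes go out in wireCharset and are read back in assumedCharset.
    static String roundTrip(String text, String wireCharset, String assumedCharset) {
        try {
            byte[] wire = text.getBytes(wireCharset);
            return new String(wire, assumedCharset);
        } catch (java.io.UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + e.getMessage());
        }
    }

    // Header value of the form shown above.
    static String contentType(String charset) {
        return "text/xml; charset=" + charset;
    }

    public static void main(String[] args) {
        System.out.println("Content-Type: " + contentType("iso-8859-1"));
        // Matching charsets: the text survives intact.
        System.out.println(XML.equals(roundTrip(XML, "ISO-8859-1", "ISO-8859-1"))); // true
        // Mismatch: the single byte 0xDC is not valid UTF-8, so the text is corrupted.
        System.out.println(XML.equals(roundTrip(XML, "ISO-8859-1", "UTF-8")));      // false
    }
}
```

A client that posts the request with java.net.HttpURLConnection could set the header with setRequestProperty("Content-Type", contentType("iso-8859-1")) and write the body bytes using the same charset.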

Scott Nichol
