Posted to c-user@axis.apache.org by Iwan Tomlow <iw...@seagha.com> on 2007/03/28 13:22:00 UTC
Invalid byte 2 of 2-byte UTF-8 sequence?
Hi,
When connecting an Axis-C client (v1.5) to an Axis-Java web service, the response message contains the following error:
<soapenv:Fault>
<faultcode>soapenv:Server.userException</faultcode>
<faultstring>java.io.UTFDataFormatException: Invalid byte 2 of 2-byte UTF-8 sequence.</faultstring>
The only difference from all the other requests, which work, seems to be that the company name contains "GÜTER".
When debugging, the U-umlaut shows up correctly everywhere, but apparently the receiving web service isn't getting it correctly.
The source message generated by Axis-C++ looks like this in TCPMonitor:
Content-Type: text/xml; charset=UTF-8
<?xml version='1.0' encoding='utf-8' ?>
...
<ns1:company>ROTTMANN HEINRICH G TER</ns1:company>
So it seems the character is indeed being encoded incorrectly by Axis-C?
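The fault text is consistent with the Ü going out as the single ISO-8859-1 byte 0xDC inside a message labelled UTF-8. This is an assumption based on the TCPMonitor output, not something the capture proves. In UTF-8, 0xDC (binary 11011100) is the lead byte of a 2-byte sequence, so the next byte must be a continuation byte of the form 10xxxxxx; the following "T" (0x54, binary 01010100) is not, hence "Invalid byte 2 of 2-byte UTF-8 sequence". A minimal sketch of that structural check follows; firstInvalidUtf8Byte is a hypothetical helper, not part of Axis-C++ or Axis-Java.

```cpp
#include <cstddef>
#include <string>

// Returns the offset of the first byte that breaks UTF-8 structure, or
// std::string::npos if the buffer is structurally valid. For a sequence cut
// off by the end of the buffer, the offset of its lead byte is returned.
// Simplified on purpose: overlong forms and surrogates are not rejected.
std::size_t firstInvalidUtf8Byte(const std::string& bytes)
{
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;                        // continuation bytes expected
        if (lead < 0x80)                extra = 0;    // 0xxxxxxx: ASCII
        else if ((lead & 0xE0) == 0xC0) extra = 1;    // 110xxxxx: 2-byte sequence
        else if ((lead & 0xF0) == 0xE0) extra = 2;    // 1110xxxx: 3-byte sequence
        else if ((lead & 0xF8) == 0xF0) extra = 3;    // 11110xxx: 4-byte sequence
        else return i;                                // stray continuation or invalid lead

        for (std::size_t k = 1; k <= extra; ++k) {
            if (i + k >= bytes.size()) return i;      // sequence is truncated
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return i + k;     // not of the form 10xxxxxx
        }
        i += extra + 1;
    }
    return std::string::npos;
}
```

For the bytes "G", 0xDC, "T", "E", "R" the function reports offset 2, the "T" that sits where the second byte of the sequence should be. For the correctly encoded form (0xC3 0x9C for Ü) it reports no error.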
Does anyone have any hints to get this working?
By the way, I receive this company name from another web service, and I notice that they send it to me like this:
<employerName>ROTTMANN HEINRICH GÜTER</employerName>
Maybe there is a way to configure Axis-C to do the same, sending a character reference to avoid the encoding-mismatch?
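A numeric character reference for Ü (U+00DC) would be written &#220; in decimal or &#xDC; in hexadecimal. Because such a reference consists only of ASCII characters, it reads the same whether the message is labelled UTF-8 or ISO-8859-1. Nothing in this thread indicates that Axis-C has a setting for this, so the following is only a sketch of what an application could do to its strings before handing them over. It assumes the input is ISO-8859-1, and escapeNonAsciiLatin1 is a hypothetical name.

```cpp
#include <string>

// Hypothetical helper, not part of Axis-C++: replaces every non-ASCII byte of
// an ISO-8859-1 string with a decimal XML character reference. This works
// because ISO-8859-1 byte values are equal to the Unicode code points.
// Markup characters such as '&' and '<' are NOT escaped here.
std::string escapeNonAsciiLatin1(const std::string& latin1)
{
    std::string out;
    for (std::string::size_type i = 0; i < latin1.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(latin1[i]);
        if (c < 0x80) {
            out += static_cast<char>(c);                      // ASCII passes through
        } else {
            out += "&#";
            out += std::to_string(static_cast<unsigned>(c));  // needs C++11
            out += ';';
        }
    }
    return out;
}
```

Note that if the serializer escapes "&" itself, the reference would arrive as the literal text &amp;#220; rather than as Ü, so this only helps where the string is written out unescaped.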
Kind regards,
Iwan Tomlow
---------------------------------------------------------------------
To unsubscribe, e-mail: axis-c-user-unsubscribe@ws.apache.org
For additional commands, e-mail: axis-c-user-help@ws.apache.org
RE: Invalid byte 2 of 2-byte UTF-8 sequence?
Posted by Nadir Amra <am...@us.ibm.com>.
Iwan,
That would be a fix only if you always ran in a process whose encoding is
consistent with ISO-8859-1, but that is not the case. From the client side,
the easiest fix would be to provide a toUTF8() function, as OS/400 does.
The proper fix for both the client and the server side is to ensure that
all data stored in the various classes is wchar and to translate wchar to
UTF-8.
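Both suggestions can be sketched in a few lines. The names below (appendUtf8, latin1ToUtf8, wideToUtf8) are hypothetical and are not the OS/400 toUTF8() function or any Axis-C++ API. The first converter assumes the input is ISO-8859-1; the second assumes each wchar_t holds one complete Unicode code point, which is typical where wchar_t is 32 bits wide but not where it carries UTF-16.

```cpp
#include <string>

// Appends the UTF-8 encoding of one code point (assumed <= 0x10FFFF and not
// a surrogate; no validation is done in this sketch).
void appendUtf8(std::string& out, unsigned long cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Client-side idea: ISO-8859-1 bytes map one-to-one to code points U+0000..U+00FF.
std::string latin1ToUtf8(const std::string& latin1)
{
    std::string out;
    for (std::string::size_type i = 0; i < latin1.size(); ++i)
        appendUtf8(out, static_cast<unsigned char>(latin1[i]));
    return out;
}

// "Store wide, serialize as UTF-8" idea; assumes one code point per wchar_t.
std::string wideToUtf8(const std::wstring& wide)
{
    std::string out;
    for (std::wstring::size_type i = 0; i < wide.size(); ++i)
        appendUtf8(out, static_cast<unsigned long>(wide[i]));
    return out;
}
```

With either converter, the Ü in the company name becomes the two bytes 0xC3 0x9C, which matches the utf-8 label the message already carries.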
Nadir K. Amra
"Iwan Tomlow" <iw...@seagha.com> wrote on 03/29/2007 09:04:32 AM:
> Thanks for pointing that out, I should have been able to find that one
> myself.
> The most reasonable thing for me to do seems to be to apply the
> workaround proposed in AXISCPP-964, i.e.:
>
> In SoapSerializer.cpp, turn
> serialize( "<?xml version='1.0' encoding='utf-8' ?>", NULL);
> into
> serialize( "<?xml version='1.0' encoding='ISO-8859-1' ?>", NULL);
>
> I'm no expert on character-encoding, but since "AxisChar" is defined
> in Gdefine.hpp as a normal "char" anyway, isn't this simply the best
> way to fix the issue completely?
RE: Invalid byte 2 of 2-byte UTF-8 sequence?
Posted by Iwan Tomlow <iw...@seagha.com>.
Thanks for pointing that out, I should have been able to find that one myself.
The most reasonable thing for me to do seems to be to apply the workaround proposed in AXISCPP-964, i.e.:
In SoapSerializer.cpp, turn
serialize( "<?xml version='1.0' encoding='utf-8' ?>", NULL);
into
serialize( "<?xml version='1.0' encoding='ISO-8859-1' ?>", NULL);
I'm no expert on character-encoding, but since "AxisChar" is defined in Gdefine.hpp as a normal "char" anyway, isn't this simply the best way to fix the issue completely?
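One thing to watch with this workaround: the captured request also carries the HTTP header "Content-Type: text/xml; charset=UTF-8". If only the XML declaration is changed, the two labels contradict each other, and a receiver may give the HTTP charset precedence over the declaration. Whether the AXISCPP-964 workaround touches the header as well is not stated in this thread. The sketch below only illustrates deriving both labels from one value; the function names are hypothetical and do not correspond to Axis-C++ code.

```cpp
#include <string>

// Hypothetical helpers: build both encoding labels from the same name so
// that the XML declaration and the HTTP header cannot drift apart.
std::string xmlDeclaration(const std::string& encoding)
{
    return "<?xml version='1.0' encoding='" + encoding + "' ?>";
}

std::string contentTypeHeader(const std::string& encoding)
{
    return "Content-Type: text/xml; charset=" + encoding;
}
```

Called with "ISO-8859-1", these produce the declaration from the workaround above plus a matching header. The label is only truthful if the bytes in the process really are ISO-8859-1.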
Kind regards,
Iwan
-----Original Message-----
From: Nadir Amra [mailto:amra@us.ibm.com]
Sent: Thursday 29 March 2007 0:52
To: Apache AXIS C User List
Cc: Apache AXIS C User List
Subject: Re: Invalid byte 2 of 2-byte UTF-8 sequence?
Iwan,
This is an existing problem. See AXISCPP-964. If anyone wants to provide a patch, I can include it. If you have a solution that fixes your particular problem, then please provide the patch.
Nadir K. Amra
Re: Invalid byte 2 of 2-byte UTF-8 sequence?
Posted by Nadir Amra <am...@us.ibm.com>.
Iwan,
This is an existing problem. See AXISCPP-964. If anyone wants to provide
a patch, I can include it. If you have a solution that fixes your
particular problem, then please provide the patch.
Nadir K. Amra