Posted to c-user@axis.apache.org by Tim Bartley <tb...@au1.ibm.com> on 2005/02/28 03:11:19 UTC
Multi-byte locale limitations?
Hi,
A little while ago there was a thread on the list alluding to issues with
running Axis in processes running in non-UTF-8 or single-byte locales. Can
someone please elaborate on what limitations Axis has running in such
locales - particularly on the client side.
Maybe I have misinterpreted the issue - perhaps it was more about
serializing/deserializing string data encoded in non-UTF-8 character sets
- this would be no issue for me - it's easy enough for me to ensure
everything going in or out is encoded in UTF-8.
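As a rough illustration of that kind of boundary conversion (a minimal Python sketch, not Axis code; the helper names to_utf8 and is_valid_utf8 are made up for this example), data held in the process's local encoding can be transcoded to UTF-8 before it is handed to the library, and anything coming back can be checked for well-formed UTF-8:

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Transcode bytes from a known source encoding to UTF-8.

    source_encoding is whatever the caller knows the data to be in
    (for example "shift_jis"); it is not detected automatically.
    """
    return raw.decode(source_encoding).encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is a well-formed UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    local_bytes = "日本語".encode("shift_jis")   # data as a Japanese-locale process might hold it
    wire_bytes = to_utf8(local_bytes, "shift_jis")
    print(is_valid_utf8(local_bytes), is_valid_utf8(wire_bytes))
```

The conversion only works if the source encoding is known; guessing it from the bytes alone is unreliable.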
Thanks and regards,
Tim
--
IBM Tivoli Access Manager Development
Gold Coast Development Lab, Australia
+61-7-5552-4001 phone
+61-7-5571-0420 fax
Re: Multi-byte locale limitations?
Posted by Toshiyuki Kimura <to...@apache.org>.
If you write a custom serializer/deserializer that encodes data from
your locale to ISO-8859-1, you will be able to send and receive
multi-byte characters on your client. Even in that case, though,
you had better use '<?xml version="1.0" encoding="utf-8"?>',
because of the XML and Web Services specs.
In Axis (Java), you can use any multi-byte characters, such as
Japanese, Chinese and Korean, if you use UTF-8 as the encoding.
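A small sketch of the UTF-8 case, written in Python rather than against Axis itself. The element name "message" and both function names are hypothetical. It serializes Japanese text into a document that carries a UTF-8 XML declaration and then parses it back:

```python
import xml.etree.ElementTree as ET


def build_utf8_document(text: str) -> bytes:
    """Serialize text into a tiny XML document encoded as UTF-8."""
    root = ET.Element("message")  # hypothetical element name
    root.text = text
    # xml_declaration=True (Python 3.8+) emits the <?xml ... encoding=...?> line.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def read_text(document: bytes) -> str:
    """Parse the document; the parser honours the declared encoding."""
    return ET.fromstring(document).text


if __name__ == "__main__":
    doc = build_utf8_document("日本語")
    print(doc)
    print(read_text(doc))
```

The multi-byte characters go onto the wire as UTF-8 bytes, and the receiving side recovers the original string from the declaration.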
Please see the related information below:
<http://issues.apache.org/jira/browse/AXIS-1815?page=all>
Thanks,
Toshi
On Sun, 27 Feb 2005, Nadir Amra wrote:
> A lot of the string functions depend on the locale.
>
> I do not have a definite answer, but from what I know, things will work
> if you are running in a locale that has a character set that is the same
> in UTF-8. Thus, running in an ISO-8859-1 locale will work since the character
> set is the same in UTF-8.
>
> However, I would not be confident if a process was running in a Japanese or
> Chinese locale, even if the data is in UTF-8. But others may have a
> better handle on this. It may work.
Re: Multi-byte locale limitations?
Posted by John Hawkins <ha...@uk.ibm.com>.
And what if the server is not UTF-8? I guess it's supposed to send back
UTF-8 because it got UTF-8 coming in?
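One way a client could cope with a non-UTF-8 response, sketched in Python under the assumption that the server declares its charset in the Content-Type header. This is not a description of what Axis does, and decode_body is a hypothetical helper:

```python
from email.message import Message


def decode_body(body: bytes, content_type: str, default: str = "utf-8") -> str:
    """Decode a response body using the charset named in Content-Type.

    Falls back to `default` when the header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset(failobj=default)
    return body.decode(charset)


if __name__ == "__main__":
    # The same text sent by a Latin-1 server and by a UTF-8 server.
    print(decode_body(b"caf\xe9", "text/xml; charset=ISO-8859-1"))
    print(decode_body(b"caf\xc3\xa9", "text/xml"))
```

Once decoded, the text can be re-encoded as UTF-8 before it is passed on.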
Nadir Amra <am...@us.ibm.com> wrote on 28/02/2005 03:31:
A lot of the string functions depend on the locale.
I do not have a definite answer, but from what I know, things will work
if you are running in a locale that has a character set that is the same
in UTF-8. Thus, running in an ISO-8859-1 locale will work since the character
set is the same in UTF-8.
However, I would not be confident if a process was running in a Japanese or
Chinese locale, even if the data is in UTF-8. But others may have a
better handle on this. It may work.
Re: Multi-byte locale limitations?
Posted by Nadir Amra <am...@us.ibm.com>.
A lot of the string functions depend on the locale.
I do not have a definite answer, but from what I know, things will work
if you are running in a locale that has a character set that is the same
in UTF-8. Thus, running in an ISO-8859-1 locale will work since the character
set is the same in UTF-8.
However, I would not be confident if a process was running in a Japanese or
Chinese locale, even if the data is in UTF-8. But others may have a
better handle on this. It may work.
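To make the locale dependence concrete, here is a Python simulation (not the C string functions themselves, and the function names are invented for illustration). It shows how the same UTF-8 bytes look to a routine that assumes a different character set, which is roughly the position a locale-driven string function is in when the process locale is not UTF-8:

```python
def count_characters(data: bytes, assumed_encoding: str) -> int:
    """Count characters as a routine assuming this encoding would see them."""
    # errors="replace" keeps going on invalid sequences instead of raising.
    return len(data.decode(assumed_encoding, errors="replace"))


def survives_as(data: bytes, assumed_encoding: str, expected: str) -> bool:
    """True if decoding under the assumed encoding gives back the original text."""
    return data.decode(assumed_encoding, errors="replace") == expected


if __name__ == "__main__":
    payload = "日本語".encode("utf-8")  # three characters, nine bytes
    for enc in ("utf-8", "iso-8859-1", "shift_jis"):
        print(enc, count_characters(payload, enc), survives_as(payload, enc, "日本語"))
```

Only the UTF-8 interpretation sees three characters. A single-byte interpretation sees nine, and a Shift_JIS interpretation pairs the bytes up differently and yields unrelated text.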
Tim Bartley <tb...@au1.ibm.com> wrote on 02/27/2005 09:24:29 PM:
>
> Thanks,
>
> If I ensure that everything I pass in or out of Axis is UTF-8 will I
> still encounter problems if the process itself is running in a non-
> UTF-8 (or subset) locale? Why does the process locale even matter?
>
> Tim
> --
> IBM Tivoli Access Manager Development
> Gold Coast Development Lab, Australia
> +61-7-5552-4001 phone
> +61-7-5571-0420 fax
Re: Multi-byte locale limitations?
Posted by Tim Bartley <tb...@au1.ibm.com>.
Thanks,
If I ensure that everything I pass in or out of Axis is UTF-8 will I still
encounter problems if the process itself is running in a non-UTF-8 (or
subset) locale? Why does the process locale even matter?
Tim
--
IBM Tivoli Access Manager Development
Gold Coast Development Lab, Australia
+61-7-5552-4001 phone
+61-7-5571-0420 fax
Nadir Amra <am...@us.ibm.com> wrote on 28/02/2005 13:18:
The AXIS code currently assumes that everything coming in and everything
going out is UTF-8. Thus, if you are running in a locale that is a subset
of UTF-8 (e.g. ISO-8859-1 character set), then you should be OK. However,
if you are running in a locale such as Japanese or Chinese, then you are
out of luck.
This is something that will hopefully be fixed in the near future.
Re: Multi-byte locale limitations?
Posted by Nadir Amra <am...@us.ibm.com>.
The AXIS code currently assumes that everything coming in and everything
going out is UTF-8. Thus, if you are running in a locale that is a subset
of UTF-8 (e.g. ISO-8859-1 character set), then you should be OK. However,
if you are running in a locale such as Japanese or Chinese, then you are
out of luck.
This is something that will hopefully be fixed in the near future.
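One caveat on the subset point can be checked directly. Strictly, only the 7-bit ASCII range is byte-for-byte identical between ISO-8859-1 and UTF-8; the ISO-8859-1 characters from 0x80 to 0xFF each take two bytes in UTF-8. The Python sketch below (the function name is hypothetical) tests whether a given string has the same bytes in both encodings:

```python
def same_bytes_in_latin1_and_utf8(text: str) -> bool:
    """True if text encodes to identical bytes in ISO-8859-1 and UTF-8."""
    try:
        return text.encode("iso-8859-1") == text.encode("utf-8")
    except UnicodeEncodeError:
        # The text contains characters that ISO-8859-1 cannot represent at all.
        return False


if __name__ == "__main__":
    for sample in ("Axis", "café", "日本語"):
        print(sample, same_bytes_in_latin1_and_utf8(sample))
```

So data from an ISO-8859-1 locale can be passed through untouched only while it stays within ASCII; accented characters still need converting.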
Tim Bartley <tb...@au1.ibm.com> wrote on 02/27/2005 08:11:19 PM:
>
> Hi,
>
> A little while ago there was a thread on the list alluding to issues with
> running Axis in processes running in non-UTF-8 or single-byte
> locales. Can someone please elaborate on what limitations Axis has
> running in such locales - particularly on the client side.
>
> Maybe I have misinterpreted the issue - perhaps it was more about
> serializing/deserializing string data encoded in non-UTF-8 character
> sets - this would be no issue for me - it's easy enough for me to
> ensure everything going in or out is encoded in UTF-8.
>
> Thanks and regards,
>
> Tim
> --
> IBM Tivoli Access Manager Development
> Gold Coast Development Lab, Australia
> +61-7-5552-4001 phone
> +61-7-5571-0420 fax