Posted to server-dev@james.apache.org by Hontvari Jozsef <ho...@solware.com> on 2002/08/06 12:24:01 UTC

toUpperCase() is wrong - Be aware of i18n

It just occurred to me that the recent patches contain toUpperCase() calls
without an explicit locale argument.

Relying on the default locale is almost always incorrect in the case of a
server application.
If you look at the toUpperCase doc in JDK 1.4, they give an example showing
how relying on the default locale leads to an incorrect result on a server
with a Turkish locale.

Because the SMTP RFC specifies that the commands are in the US-ASCII
charset, this is a correct call:
toUpperCase(Locale.ENGLISH)
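
To make the failure mode concrete, here is a minimal sketch. The class name
SmtpCommandCase and the helper normalize are made up for illustration and are
not taken from the James sources. The Turkish locale is passed explicitly to
imitate a server whose default locale is Turkish.

```java
import java.util.Locale;

public class SmtpCommandCase {
    // Upper-cases a protocol keyword with an explicit locale, so the result
    // does not depend on the server's default locale.
    public static String normalize(String command) {
        return command.toUpperCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Imitates a bare toUpperCase() on a Turkish server: 'i' maps to
        // U+0130 (capital I with dot above) instead of 'I'.
        String broken = "quit".toUpperCase(turkish);

        System.out.println("QUIT".equals(broken));            // false
        System.out.println("QUIT".equals(normalize("quit"))); // true
    }
}
```

With the bare call, a command comparison such as "QUIT".equals(...) would
silently stop matching on such a server.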






--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


RE: toUpperCase() is wrong - Be aware of i18n

Posted by "Noel J. Bergman" <no...@devtech.com>.
I'm only talking about changing existing uses of toUpperCase() to
toUpperCase(Locale.US), as you suggested.  As for where it is used, I can
check whether each call looks necessary.

As you said, the specification is US-ASCII, so Locale.US seemed more
appropriate than Locale.ENGLISH.
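
As far as the case mapping itself goes, either constant should behave the
same for US-ASCII input, since the two locales differ only in the country
part. A small check along these lines illustrates that. LocaleChoice and
sameAsciiUpperCase are hypothetical names, not James code.

```java
import java.util.Locale;

public class LocaleChoice {
    // Returns true if both locales upper-case every US-ASCII character
    // (code points 0..127) to the same result.
    public static boolean sameAsciiUpperCase(Locale a, Locale b) {
        for (char c = 0; c < 128; c++) {
            String s = String.valueOf(c);
            if (!s.toUpperCase(a).equals(s.toUpperCase(b))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameAsciiUpperCase(Locale.US, Locale.ENGLISH)); // true
        // The Turkish locale differs because of the letter 'i'.
        System.out.println(sameAsciiUpperCase(Locale.US, Locale.forLanguageTag("tr-TR"))); // false
    }
}
```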

	--- Noel

-----Original Message-----
From: Hontvari Jozsef [mailto:hontvari@solware.com]
Sent: Tuesday, August 06, 2002 17:02
To: James Developers List
Subject: Re: toUpperCase() is wrong - Be aware of i18n


> Shouldn't we change config.xml and sqlResources.xml as follows:
> config.xml: <config xml:lang="en-US"> ... </config>
> sqlResources.xml: <sqlResources xml:lang="en-US"> ... </sqlResources>
> so that the XML parser knows what locale the files are in?  There are
> places where String.toUpperCase() is called for data pulled from the XML
> files,

Are those case conversions necessary at all? I cannot imagine why they
are needed. There are the "true" and "false" constants, but they are all
known to be in US-ASCII (which in turn implies how to convert the case).
It is best to avoid converting the case of locale-specific strings.

>Locale.US

I see you are assuming that the USA won't change its language to Chinese (or
something else) in the foreseeable future :-)




Re: toUpperCase() is wrong - Be aware of i18n

Posted by Hontvari Jozsef <ho...@solware.com>.
> Shouldn't we change config.xml and sqlResources.xml as follows:
> config.xml: <config xml:lang="en-US"> ... </config>
> sqlResources.xml: <sqlResources xml:lang="en-US"> ... </sqlResources>
> so that the XML parser knows what locale the files are in?  There are
> places where String.toUpperCase() is called for data pulled from the XML
> files,

Are those case conversions necessary at all? I cannot imagine why they
are needed. There are the "true" and "false" constants, but they are all
known to be in US-ASCII (which in turn implies how to convert the case).
It is best to avoid converting the case of locale-specific strings.
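
One way to sidestep the locale question for such constants is sketched below,
assuming the values really are US-ASCII. AsciiCase, toAsciiUpperCase and
parseFlag are hypothetical helpers for illustration only, not existing James
code. String.equalsIgnoreCase compares character by character and does not
consult the default locale.

```java
public class AsciiCase {
    // Upper-cases only the letters a-z; every other character is copied
    // unchanged, so no locale is involved.
    public static String toAsciiUpperCase(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            out.append(c >= 'a' && c <= 'z' ? (char) (c - ('a' - 'A')) : c);
        }
        return out.toString();
    }

    // Reads a "true"/"false" setting without upper-casing it first.
    public static boolean parseFlag(String value) {
        return "true".equalsIgnoreCase(value.trim());
    }

    public static void main(String[] args) {
        System.out.println(toAsciiUpperCase("mail from")); // MAIL FROM
        System.out.println(parseFlag(" True "));           // true
    }
}
```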

>Locale.US

I see you are assuming that the USA won't change its language to Chinese (or
something else) in the foreseeable future :-)



----- Original Message -----
From: "Noel J. Bergman" <no...@devtech.com>
To: "James Developers List" <ja...@jakarta.apache.org>
Sent: Tuesday, August 06, 2002 6:15 PM
Subject: RE: toUpperCase() is wrong - Be aware of i18n


> Ok, you're right.  We should specify the locale.  There are about 20
> references to String.toUpperCase() throughout the CVS source base.  I can
> easily change them all to call String.toUpperCase(Locale.US).
>
> Shouldn't we change config.xml and sqlResources.xml as follows:
>
> config.xml: <config xml:lang="en-US"> ... </config>
>
> sqlResources.xml: <sqlResources xml:lang="en-US"> ... </sqlResources>
>
> so that the XML parser knows what locale the files are in?  There are places
> where String.toUpperCase() is called for data pulled from the XML files,
> e.g., for database entities.  If someone wants a string to be in a different
> locale for something, they can explicitly add the xml:lang attribute to
> override the default for that entity (and children).
>
> --- Noel


RE: toUpperCase() is wrong - Be aware of i18n

Posted by "Noel J. Bergman" <no...@devtech.com>.
Ok, you're right.  We should specify the locale.  There are about 20
references to String.toUpperCase() throughout the CVS source base.  I can
easily change them all to call String.toUpperCase(Locale.US).

Shouldn't we change config.xml and sqlResources.xml as follows:

config.xml: <config xml:lang="en-US"> ... </config>

sqlResources.xml: <sqlResources xml:lang="en-US"> ... </sqlResources>

so that the XML parser knows what locale the files are in?  There are places
where String.toUpperCase() is called for data pulled from the XML files,
e.g., for database entities.  If someone wants a string to be in a different
locale for something, they can explicitly add the xml:lang attribute to
override the default for that entity (and children).
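
A caveat worth spelling out: as far as I know, a standard XML parser only
reports xml:lang like any other attribute, and String.toUpperCase() never sees
it, so the application would have to read the attribute and pass a Locale
itself. The sketch below assumes that approach. XmlLangLocale, localeOf,
parseRoot and the <table> element are invented for the example, and it only
looks at the element itself, without the inheritance to children described
above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

public class XmlLangLocale {
    // Returns the locale named by the element's own xml:lang attribute, or
    // the given fallback when the attribute is absent.
    public static Locale localeOf(Element element, Locale fallback) {
        String tag = element.getAttribute("xml:lang"); // "" when missing
        return tag.isEmpty() ? fallback : Locale.forLanguageTag(tag);
    }

    // Parses an XML string and returns its root element.
    public static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot("<config xml:lang=\"en-US\"><table>users</table></config>");
        Locale locale = localeOf(root, Locale.US);
        String table = root.getElementsByTagName("table").item(0).getTextContent();

        System.out.println(locale.toLanguageTag());    // en-US
        System.out.println(table.toUpperCase(locale)); // USERS
    }
}
```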

	--- Noel

-----Original Message-----
From: Hontvari Jozsef [mailto:hontvari@solware.com]
Sent: Tuesday, August 06, 2002 6:24
To: James Developers List
Subject: toUpperCase() is wrong - Be aware of i18n


It just occurred to me that the recent patches contain toUpperCase() calls
without an explicit locale argument.

Relying on the default locale is almost always incorrect in the case of a
server application.
If you look at the toUpperCase doc in JDK 1.4, they give an example showing
how relying on the default locale leads to an incorrect result on a server
with a Turkish locale.

Because the SMTP RFC specifies that the commands are in the US-ASCII
charset, this is a correct call:
toUpperCase(Locale.ENGLISH)





