Posted to dev@httpd.apache.org by Michael B Allen <mb...@ioplex.com> on 2007/03/25 21:09:42 UTC

i18n support

Hello,

I have a module that calls a number of libraries that need to run under
a locale other than 'C'.

What are Apache's limitations with respect to the locale it runs under? I
see that setting LANG=en_US.UTF-8 in Apache's environment works but
LANG=de_DE.ISO-8859-1 does not.

I understand that allowing the server to run in an arbitrary locale
is hard to support but certainly there must be support for i18n text
processing. Where is the line drawn?

Ideally I would like the administrator to be able to choose a locale
(e.g. de_DE.ISO-8859-1@euro) but if I can only do UTF-8 technically I
would be satisfied.

Mike

Re: i18n support

Posted by Michael B Allen <mb...@ioplex.com>.
On Mon, 26 Mar 2007 20:27:17 +0100
Joe Orton <jo...@redhat.com> wrote:

> On Mon, Mar 26, 2007 at 02:26:28PM -0400, Michael B Allen wrote:
> > However, from searching the archives I recall some comments that it was
> > perhaps inappropriate to use anything but the C locale because globally
> > changing the locale might yield non-deterministic behavior. Is that
> > true? Or is it perfectly safe for someone to run in de_DE.ISO-8859-1
> > for example?
> 
> There are a couple of issues here:
> 
> 1) if your module is used in a thread-based server, changing the 
> (process global) locale from one thread will affect all the threads in 
> the process
> 
> 2) there are some places in httpd which use locale-specific string 
> comparisons against strings coming from the user and network.  Changing 
> the locale can have "interesting" effects on those comparisons, 
> depending on the specific locale used.  (the Turkish locale is a nice 
> example, where "i" and "I" don't compare as case-insensitive equals)
> 
> So yes, the general recommendation to not change the locale is a good 
> one.  A library which only works in a *specific* locale is rather 
> awkward.

Ok. Yeah, that's what I thought you were going to say. We're going
to change our code to use custom localization functions. It is very
unfortunate that the multibyte functions (e.g. mbtowc) are poorly designed
such that they cannot accept a context to hold locale state.

Thanks,
Mike
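
On the point about mbtowc having no locale context: POSIX.1-2008 defines newlocale() and uselocale(), which scope a locale to the calling thread rather than the whole process. The following is only a minimal sketch, assuming a platform that provides them (glibc does in <locale.h>; some systems declare them in <xlocale.h>). The helper name decode_in_locale is hypothetical and is not part of httpd or of the module discussed in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decode the first multibyte character of s (at most n
 * bytes) under the locale named locale_name, without changing the
 * process-global locale. Returns the number of bytes consumed, or -1 if the
 * locale is unavailable or the input is invalid or incomplete. */
static int decode_in_locale(const char *locale_name, const char *s, size_t n,
                            wchar_t *out)
{
    locale_t loc = newlocale(LC_CTYPE_MASK, locale_name, (locale_t)0);
    if (loc == (locale_t)0)
        return -1;                      /* e.g. locale not installed */

    locale_t prev = uselocale(loc);     /* affects this thread only */

    mbstate_t state;
    memset(&state, 0, sizeof state);    /* initial conversion state */
    size_t r = mbrtowc(out, s, n, &state);

    uselocale(prev);                    /* restore the previous thread locale */
    freelocale(loc);

    if (r == (size_t)-1 || r == (size_t)-2)
        return -1;
    return (int)r;
}
```

A caller would pass the administrator's chosen name, for example decode_in_locale("de_DE.ISO-8859-1", buf, len, &wc). Other threads keep whatever locale they already had, which addresses the thread-safety concern but not the cost of creating a locale object per call; a real module would probably create it once and reuse it.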

Re: i18n support

Posted by Joe Orton <jo...@redhat.com>.
On Mon, Mar 26, 2007 at 02:26:28PM -0400, Michael B Allen wrote:
> However, from searching the archives I recall some comments that it was
> perhaps inappropriate to use anything but the C locale because globally
> changing the locale might yield non-deterministic behavior. Is that
> true? Or is it perfectly safe for someone to run in de_DE.ISO-8859-1
> for example?

There are a couple of issues here:

1) if your module is used in a thread-based server, changing the 
(process global) locale from one thread will affect all the threads in 
the process

2) there are some places in httpd which use locale-specific string 
comparisons against strings coming from the user and network.  Changing 
the locale can have "interesting" effects on those comparisons, 
depending on the specific locale used.  (the Turkish locale is a nice 
example, where "i" and "I" don't compare as case-insensitive equals)

So yes, the general recommendation to not change the locale is a good 
one.  A library which only works in a *specific* locale is rather 
awkward.

joe
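
To illustrate issue 2: functions such as strcasecmp() and tolower() consult the current LC_CTYPE locale, so their answer for "i" versus "I" can change once the locale does. A comparison that folds only the ASCII letters A-Z gives the same result in every locale. This is a minimal sketch with hypothetical helper names (ascii_lower, ascii_casecmp); it is not the code httpd itself uses.

```c
/* Fold only ASCII 'A'..'Z' to lowercase; every other value is returned
 * unchanged, regardless of the current locale. */
static int ascii_lower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical locale-independent replacement for strcasecmp().
 * Returns 0 if a and b are equal ignoring ASCII case, otherwise the
 * difference between the first pair of folded bytes that differ. */
static int ascii_casecmp(const char *a, const char *b)
{
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;

    for (;; pa++, pb++) {
        int ca = ascii_lower(*pa);
        int cb = ascii_lower(*pb);
        if (ca != cb)
            return ca - cb;
        if (ca == 0)            /* both strings ended together */
            return 0;
    }
}
```

With this, ascii_casecmp("Content-Type", "content-type") is 0 whatever LANG is set to, which is the property protocol tokens such as header names need.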

Re: i18n support

Posted by Michael B Allen <mb...@ioplex.com>.
On Mon, 26 Mar 2007 14:00:52 -0400
Michael B Allen <mb...@ioplex.com> wrote:

> On Mon, 26 Mar 2007 12:36:05 -0500
> "William A. Rowe, Jr." <wr...@rowe-clan.net> wrote:
> 
> > I know of no restrictions on a unix-like system that would interfere
> > with choosing any arbitrary charset.  Your post doesn't actually
> > describe what problem you observed after setting LANG=de_DE.ISO-8859-1.
> 
> The charset fails over to ANSI_X3.4-1968 (aka ASCII) and setlocale returns NULL.
> 
> Strangely this actually works on another system I have.
> 
> Anyone know how to install locale files on Debian?

Well, after some more digging I managed to generate the German locales and
all seems well.

However, from searching the archives I recall some comments that it was
perhaps inappropriate to use anything but the C locale because globally
changing the locale might yield non-deterministic behavior. Is that
true? Or is it perfectly safe for someone to run in de_DE.ISO-8859-1
for example?

Mike

Re: i18n support

Posted by Michael B Allen <mb...@ioplex.com>.
On Mon, 26 Mar 2007 12:36:05 -0500
"William A. Rowe, Jr." <wr...@rowe-clan.net> wrote:

> I know of no restrictions on a unix-like system that would interfere
> with choosing any arbitrary charset.  Your post doesn't actually
> describe what problem you observed after setting LANG=de_DE.ISO-8859-1.

The charset fails over to ANSI_X3.4-1968 (aka ASCII) and setlocale returns NULL.

Strangely this actually works on another system I have.

Anyone know how to install locale files on Debian?

Mike

-- 
Michael B Allen
PHP Active Directory Kerberos SSO
http://www.ioplex.com/
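
The symptom described above (setlocale returning NULL and the codeset staying at ANSI_X3.4-1968) is what glibc reports when the requested locale is not installed. A small standalone check along these lines can confirm that outside of Apache. It is a sketch only: try_locale is a hypothetical helper, and because it calls setlocale() it changes the process-global locale, so it is meant for a test program rather than for code running inside httpd.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: try to select the named LC_CTYPE locale ("" means
 * "take it from the environment") and print the resulting codeset.
 * Returns 0 on success, -1 if setlocale() rejects the name. */
static int try_locale(const char *name)
{
    if (setlocale(LC_CTYPE, name) == NULL) {
        /* On failure the current locale is left unchanged. */
        fprintf(stderr, "setlocale(\"%s\") failed; still using %s (%s)\n",
                name, setlocale(LC_CTYPE, NULL), nl_langinfo(CODESET));
        return -1;
    }
    printf("locale %s, codeset %s\n",
           setlocale(LC_CTYPE, NULL), nl_langinfo(CODESET));
    return 0;
}
```

Calling try_locale("") with LANG=de_DE.ISO-8859-1 in the environment shows whether that locale can actually be loaded on the machine; running locale -a lists the ones that are installed.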

Re: i18n support

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
I know of no restrictions on a unix-like system that would interfere
with choosing any arbitrary charset.  Your post doesn't actually
describe what problem you observed after setting LANG=de_DE.ISO-8859-1.

Michael B Allen wrote:
> No one has an answer for this?
> 
> On Sun, 25 Mar 2007 15:09:42 -0400
> Michael B Allen <mb...@ioplex.com> wrote:
> 
>> Hello,
>>
>> I have a module that calls a number of libraries that need to run under
>> a locale other than 'C'.
>>
>> What are Apache's limitations with respect to the locale it runs under? I
>> see that setting LANG=en_US.UTF-8 in Apache's environment works but
>> LANG=de_DE.ISO-8859-1 does not.
>>
>> I understand that allowing the server to run in an arbitrary locale
>> is hard to support but certainly there must be support for i18n text
>> processing. Where is the line drawn?
>>
>> Ideally I would like the administrator to be able to choose a locale
>> (e.g. de_DE.ISO-8859-1@euro) but if I can only do UTF-8 technically I
>> would be satisfied.
> 
> 


Re: i18n support

Posted by Michael B Allen <mb...@ioplex.com>.
No one has an answer for this?

On Sun, 25 Mar 2007 15:09:42 -0400
Michael B Allen <mb...@ioplex.com> wrote:

> Hello,
> 
> I have a module that calls a number of libraries that need to run under
> a locale other than 'C'.
> 
> What are Apache's limitations with respect to the locale it runs under? I
> see that setting LANG=en_US.UTF-8 in Apache's environment works but
> LANG=de_DE.ISO-8859-1 does not.
> 
> I understand that allowing the server to run in an arbitrary locale
> is hard to support but certainly there must be support for i18n text
> processing. Where is the line drawn?
> 
> Ideally I would like the administrator to be able to choose a locale
> (e.g. de_DE.ISO-8859-1@euro) but if I can only do UTF-8 technically I
> would be satisfied.