Posted to users@tomcat.apache.org by Andrew Sudell <as...@Op.Net> on 2001/08/22 01:29:37 UTC

Re: Problems with German ä, ö,

"Frank Neumann"
> 
> Hi folks,
> 
> I'm experiencing problems when using the German letters ä ö ü ß in
> URLs.
> 
> first box: Solaris 2.6, JDK 1.3.0, tomcat 3.2.1
> tomcat was installed as a binary distribution by a developer from another
> company. When using correctly URL-encoded ä ö ü ß there is no problem.
> The servlets work fine.
> 
> second box: Solaris 7, JDK 1.3.1, tomcat 3.2.1
> I compiled and installed tomcat myself. When using correctly
> URL-encoded ä ö ü ß, they don't seem to reach the servlets correctly
> and the servlets fail. The resulting URL contains a ? instead of the
> original encoded letter.
> 
> I assume this has something to do with localization. The locales on both
> boxes are identical. My question is how to configure tomcat to work
> correctly.
> 

Not sure if this is it, but I'll give you something to look at.

Try looking at the system property file.encoding.  As far as I can tell
(I'm working on a similar problem at the moment):
 
 o the loader for the JVM examines the C locale and sets file.encoding
 o on Solaris 2.6, in the "C" locale (LANG unset), the file encoding
   is Latin-1 (ISO8859-1).
 o on Solaris 7, in the "C" locale, the file encoding is ASCII (646),
   but in the "en_US" locale (LANG="en_US") the file encoding is
   ISO8859-1.

or at least those are the results I get on the 1.2 JVMs I have loaded.
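
A quick way to compare the two boxes is to print what the JVM itself
reports. This is only a minimal sketch; the class name ShowDefaultEncoding
is made up for illustration and is not part of Tomcat:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.util.Locale;

// Hypothetical diagnostic class: prints the encoding the JVM picked up
// from its environment at startup.
public class ShowDefaultEncoding {

    // Name of the encoding a default-constructed InputStreamReader uses.
    static String readerEncoding() {
        return new InputStreamReader(
                new ByteArrayInputStream(new byte[0])).getEncoding();
    }

    public static void main(String[] args) {
        System.out.println("file.encoding  = "
                + System.getProperty("file.encoding"));
        System.out.println("reader default = " + readerEncoding());
        System.out.println("default locale = " + Locale.getDefault());
    }
}
```

Run it once with LANG unset and once with LANG=en_US on each box and
compare the output. Newer JDKs (18 and later) default to UTF-8 regardless
of the locale, so the difference only shows up on older ones.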

If I had to guess, the story is something like this: a C locale implies
an encoding, but a Java locale does not (which makes sense, since
everything is 16-bit Unicode inside the JVM anyway).  You can specify an
encoding for Java to use when transcoding external multi-byte data to
wide chars (look at InputStreamReader), but it defaults to one based on
the external C locale (i.e. how it expects your files to be encoded).
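
To illustrate the InputStreamReader point, here is a small sketch that
names the encoding explicitly instead of relying on the default. The class
and method names are hypothetical, and whether Tomcat 3.2.1 decodes URLs
along this path is an assumption, not something verified here:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from Tomcat.
public class ExplicitEncoding {

    // Decode bytes with a named charset rather than the platform default.
    static String decode(byte[] bytes, Charset charset) {
        try {
            Reader in = new InputStreamReader(
                    new ByteArrayInputStream(bytes), charset);
            StringBuilder out = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                out.append((char) c);
            }
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Push text through ASCII and back; unmappable characters become '?'.
    static String throughAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // The bytes for ä ö ü ß in ISO-8859-1.
        byte[] latin1 = { (byte) 0xE4, (byte) 0xF6, (byte) 0xFC, (byte) 0xDF };
        String text = decode(latin1, StandardCharsets.ISO_8859_1);
        System.out.println(text.length());       // prints 4
        System.out.println(throughAscii(text));  // prints ????
    }
}
```

The second call shows one plausible way a ? can end up in place of the
original letter: once the text passes through an ASCII encoder, the
umlauts have nowhere to go.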

Figuring out the default encoding difference will at least explain
the difference between the boxes (assuming that's what it is).  That
still leaves figuring out how it comes into play.  But it's a start.

Drew
-- 
    Drew Sudell        asudell@acm.org         http://www.op.net/~asudell


Re: Problems with German ä, ö,

Posted by Frank Neumann <fr...@bb-data.de>.
Hi Andrew,

You gave me the right hint. After I defined LANG=en_US in tomcat.sh,
everything works fine. Strangely enough, I had a similar problem with the
formatting of amounts of money on Solaris 8 with JVM 1.3.0_02. There,
defining LC_MONETARY in tomcat.sh didn't work, but adding
-Duser.language=de to the JVM command line did.
Thanks a lot.

Frank
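
The money formatting observation fits how java.text.NumberFormat works:
it takes its conventions from a Java Locale, and the JVM's default Locale
is what -Duser.language influences. A minimal sketch, with a made-up
class name, that makes the dependency visible:

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class: the same amount formatted under different
// Java locales.
public class CurrencyByLocale {

    // Format an amount using the currency conventions of the given locale.
    static String format(double amount, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(amount);
    }

    public static void main(String[] args) {
        System.out.println("default locale: " + Locale.getDefault());
        System.out.println(format(1234.5, Locale.getDefault()));
        System.out.println(format(1234.5, Locale.GERMANY));
        System.out.println(format(1234.5, Locale.US));
    }
}
```

Running it with and without -Duser.language=de on the java command line
should change the first two lines of output, while the explicit
Locale.GERMANY and Locale.US lines stay the same.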