Posted to users@tomcat.apache.org by André Warnier <aw...@ice-sa.com> on 2009/04/03 17:38:37 UTC

Tomcat and UTF-8, or should I say charsets ?

Hi.

I just experienced something which somehow contains a delicious piece of 
irony for those who remember the numerous discussions on this list with 
topics related to the proper encoding of URLs, POST submission 
parameters etc.

Having forgotten my user-id and password for the Tomcat (FAQ) Wiki, I 
just re-registered. For that I scrupulously followed the instructions on 
the Wiki login page, so I chose a name of the form "FirstnameLastname".
As some people here know, my first name is André.

That worked fine, and I could log in afterwards.
Then, in order not to forget this again, I logged out, and used the Wiki 
login page facility to ask for an email reminder of my id/password.
I duly received that email.
But, guess what, my name in that email appears as

Name: AndréWarnier

I guess now someone should tell us that the Wiki has nothing to do with 
Tomcat.
:-)


André


------------

I am writing this well below so as not to spoil the above.
Upon further investigation, it would seem that the confirmation email 
which I received contains a MIME header which says:

Content-Type: text/plain; charset="us-ascii"

However, as can be seen above, the content is really UTF-8.
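
For illustration, here is a small Python sketch of how UTF-8 bytes turn into garbled text when the reader is given the wrong charset. The choice of ISO-8859-1 as the fallback is an assumption: a mail client told "us-ascii" has to guess what to do with bytes above 0x7F, and a Latin-1 guess is one common outcome.

```python
# The name as it was typed on the Wiki registration page.
name = "AndréWarnier"

# UTF-8 stores "é" as two bytes, 0xC3 0xA9.
raw = name.encode("utf-8")

# Assumed fallback: each byte is read as one ISO-8859-1 character.
garbled = raw.decode("iso-8859-1")
print(garbled)  # AndrÃ©Warnier

# A strict us-ascii decoder cannot represent those bytes at all.
try:
    raw.decode("us-ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)  # False
```

Decoding the same bytes as UTF-8 gives the original name back, which is why the declared charset, and not the content, looks like the culprit.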

I must also say that on the Wiki page, after logging in, my name appears 
correctly encoded and accented, and that the corresponding link is 
properly represented as
<a class="nonexistent" href="/tomcat/Andr%c3%a9Warnier">AndréWarnier</a>
and that this page comes with an HTTP header:
Content-type: text/html;charset=utf-8
and that the HTML page itself contains this declaration:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">


So it would seem that the problem is not really in the Wiki itself, 
which seems to do pretty much everything according to the specs. But 
somehow the communication between the Wiki and the email system on 
a***.apache.org does not preserve the original encoding data.


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: Tomcat and UTF-8, or should I say charsets ?

Posted by Christopher Schultz <ch...@christopherschultz.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

André,

On 4/3/2009 11:38 AM, André Warnier wrote:
> But, guess what, my name in that email appears as
> 
> Name: AndréWarnier

I suspect this is careless use of whatever emailing library is being
used. MoinMoin appears to use Python, a language and standard library
with which I am completely unfamiliar.

I suspect that someone is doing a simple:

mail("Subject", "body")

and calling it a day, when they should either be pre-encoding the
content (not likely), or specifying a character encoding or /something/
when calling the email function.
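
To make "specifying a character encoding" concrete, here is a rough sketch using Python's standard email package. It is not taken from MoinMoin's code, which I have not read; the subject line and body text are made up for the example.

```python
from email.mime.text import MIMEText

# Hypothetical reminder text, standing in for whatever the Wiki sends.
body = "Name: AndréWarnier"

# The third argument declares the charset of the body explicitly.
msg = MIMEText(body, "plain", "utf-8")
msg["Subject"] = "Your wiki account data"  # made-up subject

# The generated header now matches the actual content.
print(msg["Content-Type"])        # text/plain; charset="utf-8"
print(msg.get_content_charset())  # utf-8

# Round trip: undo the transfer encoding and decode with the declared charset.
restored = msg.get_payload(decode=True).decode(msg.get_content_charset())
print(restored == body)           # True
```

With the charset declared this way, the receiving client no longer has to guess how to read the bytes of the "é".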

Oddly enough, MoinMoin has a German domain name
(http://moinmoin.wikiwikiweb.de/) so I would have figured they'd have
their ä's and ß's straight. Perhaps not.

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAknWhdsACgkQ9CaO5/Lv0PCRqACgp2AkoEROGjzNUPn+IPfxveaJ
e8cAoKk+FgI8BeV/KcJjaa4Jv5aA8nOK
=U/Nk
-----END PGP SIGNATURE-----
