Posted to dev@tomcat.apache.org by Bojan Smojver <bo...@binarix.com> on 2001/08/11 01:54:20 UTC

TC 3.3B1: Default charset: 8859_1 or iso-8859-1

I don't really know what's politically correct here, but if you attempt to
validate any page served by Tomcat 3.3 B1 at http://validator.w3.org/,
validation fails outright because of the default charset setting.

Here is the message from W3C Validator:

----------------------
A fatal error occurred when attempting to transliterate the document
charset. Either we do not support this character encoding yet, or you
have specified a non-existent character encoding (typically a
misspelling such as "iso8859-1" for "iso-8859-1").

The detected charset was "8859_1".
----------------------

I had a look at the lynx.cfg file (as in: good old Lynx) and it uses the
W3C spelling (i.e. iso-8859-1).

The HTTP headers look something like this:

----------------------
[bojan@beast bojan]$ telnet www.pivotpointaustralia.com.au 80
Trying 203.53.131.10...
Connected to www.pivotpointaustralia.com.au.
Escape character is '^]'.
GET /products/ HTTP/1.0
Host: www.pivotpointaustralia.com.au

HTTP/1.1 200 OK
Date: Fri, 10 Aug 2001 23:32:47 GMT
Server: Apache/1.3.20 (Unix) PHP/4.0.6 mod_jk mod_ssl/2.8.4 OpenSSL/0.9.6b
Set-Cookie: JSESSIONID=ta4yofspg1;Path=/
Connection: close
Content-Type: text/html;charset=8859_1
...
----------------------

My configuration is Apache 1.3.20 + mod_jk on RH Linux and I'm assuming
the setting is coming from
src/share/org/apache/tomcat/modules/server/Ajp13Packet.java, around line
308, which goes something like this:

----------------------
public static final String DEFAULT_CHAR_ENCODING = "8859_1";
----------------------
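
For illustration, here is a minimal sketch of how the JDK-style alias
could be mapped to the registered charset name before it goes into the
Content-Type header. It assumes a JDK with java.nio.charset (1.4 or
later), and the class and method names are hypothetical, not anything
that exists in Tomcat:

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of Tomcat: maps a JDK charset alias
// to the canonical name the JDK reports for that charset.
public class CharsetName {

    public static String toCanonicalName(String alias) {
        // Charset.forName accepts JDK aliases such as "8859_1";
        // name() returns the canonical name of the resolved charset.
        return Charset.forName(alias).name();
    }

    public static void main(String[] args) {
        String name = toCanonicalName("8859_1");
        // Build a Content-Type value using the canonical name.
        System.out.println("Content-Type: text/html;charset=" + name);
        // prints "Content-Type: text/html;charset=ISO-8859-1"
    }
}
```

Charset names in HTTP are case-insensitive, so ISO-8859-1 and iso-8859-1
should be equivalent as far as the validator is concerned.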

Is this a bug, or does the W3C not know what they're talking about?

Bojan