Posted to users@tomcat.apache.org by Chris Burdess <do...@bluezoo.org> on 2005/06/16 20:15:52 UTC

Possible bug in request parameter decoding

In Tomcat 5.5.9, class org.apache.catalina.connector.Request, lines 
2307-2312, the charset used to decode request parameters is identified
as org.apache.coyote.Constants.DEFAULT_CHARACTER_ENCODING, i.e.
"ISO-8859-1".

According to

  http://www.w3.org/TR/html40/appendix/notes.html#non-ascii-chars

request parameters are encoded in UTF-8. A simple form test suffices to
confirm that my user-agents (Camino and Safari) correctly encode both
POST and GET request parameters in UTF-8.
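The effect of that mismatch can be reproduced outside Tomcat. The sketch below is not Tomcat's code; the class name ParamDecodingDemo and the sample value are made up for illustration. It decodes the same percent-encoded value, as a UTF-8 user-agent would send it, once as UTF-8 and once as ISO-8859-1, using only java.net.URLDecoder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of Tomcat.
class ParamDecodingDemo {

    // Decodes a percent-encoded parameter value using the named charset.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // "caf" followed by U+00E9, percent-encoded from its UTF-8 bytes C3 A9.
        String raw = "caf%C3%A9";

        String asUtf8 = decode(raw, "UTF-8");        // 4 chars, ends in U+00E9
        String asLatin1 = decode(raw, "ISO-8859-1"); // 5 chars, ends in U+00C3 U+00A9

        System.out.println("UTF-8 length:      " + asUtf8.length());
        System.out.println("ISO-8859-1 length: " + asLatin1.length());
    }
}
```

Decoded as ISO-8859-1, each of the two UTF-8 bytes becomes its own character, which is the garbling seen when a UTF-8 submission is read with a Latin-1 default.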

It seems this may be a long-standing bug in Tomcat, preventing the
posting of non-ASCII text from standards-compliant user-agents. I can't
find anything matching in Bugzilla though. Is there a good reason for
using Latin-1 here? 
-- 
Chris Burdess

---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tomcat-user-help@jakarta.apache.org


Re: Possible bug in request parameter decoding

Posted by Tim Funk <fu...@joedog.org>.
The HTTP spec is vague. It has many references to ISO-8859-1. IIRC, there is a
connector option to decode parameters as UTF-8.

-Tim
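
The option Tim recalls is most likely the Connector's URIEncoding attribute in server.xml, though the thread does not name it. Independent of any connector setting, a value that has already been decoded as ISO-8859-1 can usually be repaired in application code, because ISO-8859-1 maps every byte to exactly one character. A minimal sketch of that workaround, assuming the value really was decoded as ISO-8859-1 and was originally UTF-8 (the class and method names are hypothetical):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tomcat or the servlet API.
class LatinRedecode {

    // Recovers the original bytes from a string that was wrongly decoded as
    // ISO-8859-1, then decodes those bytes as UTF-8.
    // Assumption: every char in the input is <= U+00FF; anything higher
    // cannot be mapped back to a byte and is replaced during encoding.
    static String latin1ToUtf8(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "caf" + U+00C3 U+00A9: what a Latin-1 decode yields for UTF-8 "cafe" with an acute accent.
        String garbled = "caf\u00c3\u00a9";
        String repaired = latin1ToUtf8(garbled);
        System.out.println("repaired length: " + repaired.length());
    }
}
```

This is a stopgap rather than a fix: it has to be applied to every parameter read, and it corrupts values that were correctly decoded in the first place.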

Chris Burdess wrote:

> In Tomcat 5.5.9, class org.apache.catalina.connector.Request, lines 
> 2307-2312, the charset used to decode request parameters is identified
> as org.apache.coyote.Constants.DEFAULT_CHARACTER_ENCODING, i.e.
> "ISO-8859-1".
> 
> According to
> 
>   http://www.w3.org/TR/html40/appendix/notes.html#non-ascii-chars
> 
> request parameters are encoded in UTF-8. A simple form test suffices to
> confirm that my user-agents (Camino and Safari) correctly encode both
> POST and GET request parameters in UTF-8.
> 
> It seems this may be a long-standing bug in Tomcat, preventing the
> posting of non-ASCII text from standards-compliant user-agents. I can't
> find anything matching in Bugzilla though. Is there a good reason for
> using Latin-1 here? 
