Posted to dev@tomcat.apache.org by Apache Wiki <wi...@apache.org> on 2008/06/23 20:14:19 UTC
[Tomcat Wiki] Update of "Tomcat/UTF-8" by ChristopherSchultz
The following page has been changed by ChristopherSchultz:
http://wiki.apache.org/tomcat/Tomcat/UTF-8
The comment on the change is:
Stopped clobbering client-provided character encoding; added notes.
------------------------------------------------------------------------------
3.
For translation of inputs coming back from the browser there must be a
- method that translates from the browser's ISO-8859-1 to UTF-8. It seems to
+ method that translates from the browser's ISO-8859-1 to UTF-8. ISO-8859-1
+ is the default character encoding for servers and browsers according to the
+ [http://www.ietf.org/rfc/rfc2616.txt HTTP specification] section 3.4.1.
- me that -1 is used in all regions as I have had people in countries such as
- Greece & Bulgaria test this and they always send input back in -1 encoding.
- The method which you will use constantly should go something like this:
{{{ /**
- * Convert ISO8859-1 format string (which is the default sent by IE
+ * Convert ISO-8859-1 format string (which is the default sent by IE)
* to the UTF-8 format that the database is in.
*/
public String toUTF8(String isoString)
@@ -41, +40 @@
}
catch(UnsupportedEncodingException e)
{
+ // TODO: This should never happen. The UnsupportedEncodingException
+ // should be propagated instead of swallowed. This error would indicate
+ // a severe misconfiguration of the JVM.
+
// As we can't translate just send back the best guess.
System.out.println("UnsupportedEncodingException is: " +
e.getMessage());
@@ -89, +92 @@
public void doFilter(ServletRequest request, ServletResponse response, FilterChain next)
throws IOException, ServletException
{
+ // Respect the client-specified character encoding
+ // (see HTTP specification section 3.4.1)
+ if(null == request.getCharacterEncoding())
- request.setCharacterEncoding(encoding);
+ request.setCharacterEncoding(encoding);
+
next.doFilter(request, response);
}
@@ -119, +126 @@
The suggested solution originates from [http://people.comita.spb.ru/users/sergeya/java/ruschars.html Sergey Astakhov (all texts are in Russian)] (sergeya@comita.spb.ru)
+ '''Important note''': This filter should be placed as early in your filter chain as possible. If other code calls request.getParameter (or a similar method) before this filter runs, the encoding will not be set in time, and your parameters will still be decoded incorrectly.
+
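As a rough illustration of the ordering advice above: for URL-pattern mappings, the container builds the filter chain in the order the filter-mapping elements appear in web.xml, so the encoding filter's mapping would be listed first. The filter name, the class name com.example.CharsetFilter, and the "encoding" init-param below are hypothetical placeholders, not names taken from the wiki page.

```xml
<!-- Hypothetical names; adjust to your own filter class. -->
<filter>
  <filter-name>CharsetFilter</filter-name>
  <filter-class>com.example.CharsetFilter</filter-class>
  <init-param>
    <param-name>encoding</param-name>
    <param-value>UTF-8</param-value>
  </init-param>
</filter>

<!-- Listed before any other filter-mapping so it runs first. -->
<filter-mapping>
  <filter-name>CharsetFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>
```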
'''- TIP -'''
Update the file $CATALINA_HOME/conf/server.xml for UTF-8 support by connectors.
@@ -131, +140 @@
disableUploadTimeout="true"
'''URIEncoding="UTF-8"'''/>
+ Note that URIEncoding only changes how parameters in the request URI (GET parameters) are decoded; it does not affect POST parameters at all.
+
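The effect of the URI decoding charset can be sketched outside the container. The example below is only an approximation of what a connector does: it uses java.net.URLDecoder to decode the same percent-encoded value once as UTF-8 and once as ISO-8859-1, and then repairs the mis-decoded string with the same round trip a toUTF8-style helper performs. The class and method names are made up for illustration, and java.nio.charset.StandardCharsets assumes Java 7 or later.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class UriEncodingDemo {

    /** Decodes a percent-encoded value using the named charset. */
    public static String decode(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            // Propagate rather than swallow: a missing standard charset
            // would indicate a badly misconfigured JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Repairs a string whose UTF-8 bytes were decoded as ISO-8859-1. */
    public static String repair(String misdecoded) {
        byte[] bytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "caf%C3%A9"; // "cafe" with an e-acute, sent as UTF-8

        String asUtf8 = decode(raw, "UTF-8");
        String asLatin1 = decode(raw, "ISO-8859-1");

        System.out.println(asUtf8.length());   // 4: the two bytes form one character
        System.out.println(asLatin1.length()); // 5: each byte became its own character
        System.out.println(repair(asLatin1).equals(asUtf8)); // true
    }
}
```

With the correct charset configured for URI decoding, the repair step is unnecessary for GET parameters; POST bodies are governed by the request character encoding instead.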