Posted to dev@tomcat.apache.org by Apache Wiki <wi...@apache.org> on 2008/06/23 20:14:19 UTC

[Tomcat Wiki] Update of "Tomcat/UTF-8" by ChristopherSchultz

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tomcat Wiki" for change notification.

The following page has been changed by ChristopherSchultz:
http://wiki.apache.org/tomcat/Tomcat/UTF-8

The comment on the change is:
Stopped clobbering client-provided character encoding; added notes.

------------------------------------------------------------------------------
  
  3.
  For translation of inputs coming back from the browser there must be a
- method that translates from the browser's ISO-8859-1 to UTF-8.  It seems to
+ method that translates from the browser's ISO-8859-1 to UTF-8.  ISO-8859-1
+ is the default character encoding for servers and browsers according to the
+ [http://www.ietf.org/rfc/rfc2616.txt HTTP specification] section 3.4.1.
- me that -1 is used in all regions as I have had people in countries such as
- Greece & Bulgaria test this and they always send input back in -1 encoding.
- The method which you will use constantly should go something like this:
  
  {{{  /**
-   * Convert ISO8859-1 format string (which is the default sent by IE
+   * Convert ISO-8859-1 format string (which is the default sent by IE)
    * to the UTF-8 format that the database is in.
    */
   public String toUTF8(String isoString)
@@ -41, +40 @@

     }
     catch(UnsupportedEncodingException e)
     {
+     // TODO: This should never happen. The UnsupportedEncodingException
+     // should be propagated instead of swallowed. This error would indicate
+     // a severe misconfiguration of the JVM.
+ 
      // As we can't translate just send back the best guess.
      System.out.println("UnsupportedEncodingException is: " +
  e.getMessage());
@@ -89, +92 @@

   public void doFilter(ServletRequest request, ServletResponse response, FilterChain next)
   throws IOException, ServletException
   {
+   // Respect the client-specified character encoding
+   // (see HTTP specification section 3.4.1)
+   if(null == request.getCharacterEncoding())
-   request.setCharacterEncoding(encoding);
+     request.setCharacterEncoding(encoding);
+ 
    next.doFilter(request, response);
   }
  
@@ -119, +126 @@

  
  The suggested solution originates from [http://people.comita.spb.ru/users/sergeya/java/ruschars.html Sergey Astakhov (all texts are in Russian)] (sergeya@comita.spb.ru)
  
+ '''Important note''': This filter should be placed as close to the front of your filter chain as possible. If other code calls request.getParameter (or a similar method) before this filter is invoked, the encoding will not be set in time, and your parameters will still be decoded incorrectly.
+ 
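When the request encoding is not set in time, parameter bytes that were really UTF-8 end up decoded as ISO-8859-1. The sketch below illustrates that failure and the re-decoding idea behind the page's toUTF8 helper. It is a minimal illustration, not the wiki's exact code: the class name Iso88591ToUtf8 is hypothetical, and it assumes Java 7 or later for StandardCharsets, which also removes the need to catch UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this illustration.
public class Iso88591ToUtf8 {

    /**
     * Re-decode a string whose UTF-8 bytes were wrongly decoded as ISO-8859-1.
     * ISO-8859-1 maps every byte to exactly one char, so the original bytes
     * can be recovered without loss and then decoded as UTF-8.
     */
    public static String toUTF8(String isoString) {
        if (isoString == null) {
            return null;
        }
        byte[] raw = isoString.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e

        // Simulate a container decoding UTF-8 request bytes as ISO-8859-1.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        // The accented letter occupies two bytes in UTF-8, so the wrongly
        // decoded string is one char longer than the original.
        System.out.println(misdecoded.length());

        // Re-decoding restores the original text.
        System.out.println(toUTF8(misdecoded).equals(original));
    }
}
```

This repair is only valid for strings that really were UTF-8 bytes read as ISO-8859-1. Applying it to text that was decoded correctly will corrupt it, which is why setting the request encoding early in the filter chain is the preferable fix.
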
  '''- TIP -'''
  
  Update $CATALINA_HOME/conf/server.xml to enable UTF-8 support in the connectors.
@@ -131, +140 @@

                 disableUploadTimeout="true" 
                 '''URIEncoding="UTF-8"'''/>
  
+ Note that this setting only changes how the request URI, and therefore GET parameters, is decoded. It does not affect POST parameters at all.
+ 
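To see why the URI encoding matters for GET parameters, the following sketch percent-decodes the same query value with two different charsets. It uses the JDK's URLDecoder as a stand-in and is not Tomcat's connector code; the class name UriEncodingDemo and the sample value are assumptions made for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical class name, used only for this illustration.
public class UriEncodingDemo {

    /**
     * Percent-decode a query-string value using the named charset.
     * An unsupported charset is propagated as an unchecked exception
     * instead of being swallowed.
     */
    public static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("Charset not supported: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "cafe" with an accented e, percent-encoded by a browser as UTF-8.
        String raw = "caf%C3%A9";

        // Decoded as UTF-8: four characters, as the user typed them.
        System.out.println(decode(raw, "UTF-8").length());

        // Decoded as ISO-8859-1: five characters, the accented letter
        // has turned into two unrelated characters.
        System.out.println(decode(raw, "ISO-8859-1").length());
    }
}
```

POST bodies are not part of the URI, so they are governed by the request character encoding (for example via the filter above) rather than by URIEncoding.
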
