Posted to users@tomcat.apache.org by Jim Coble <Ji...@duke.edu> on 2006/09/29 19:24:09 UTC

Problem with UTF-8 characters in JSP page

I have a JSP page which contains what I believe to be a UTF-8 character;
namely, octal \302\251, which I understand to be the UTF-8 coding for the
copyright symbol.  When this page is rendered in my browser, what appears
is A-circumflex, the Latin-1 character corresponding to octal \302,
followed by the copyright symbol, the Latin-1 character corresponding to
octal \251.  So it looks as though the page is being rendered as if its
encoding were Latin-1 rather than UTF-8.  This is despite the fact that the
response header contains "Content-Type: text/html; charset=utf-8" and the
page head contains '<meta content="text/html; charset=utf-8"
http-equiv="Content-Type"/>'.
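
To make the symptom above concrete, here is a minimal sketch of how the same two bytes decode under each charset. The class and method names (MojibakeDemo, decodeUtf8, decodeLatin1, codePoints) are made up for illustration, and it assumes a JDK that provides java.nio.charset.StandardCharsets. It prints code points rather than the characters themselves so that the console's own encoding does not muddy the result.

```java
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of Tomcat or Jasper.
class MojibakeDemo {
    // Interpret the bytes as UTF-8.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Interpret the same bytes as Latin-1 (ISO-8859-1): one byte, one character.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Render a string as space-separated code points, e.g. "U+00C2 U+00A9".
    static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // The two bytes from the JSP file, written as Java octal literals.
        byte[] raw = { (byte) 0302, (byte) 0251 };

        // One character: the copyright sign.
        System.out.println("as UTF-8:   " + codePoints(decodeUtf8(raw)));
        // Two characters: A-circumflex followed by the copyright sign.
        System.out.println("as Latin-1: " + codePoints(decodeLatin1(raw)));
    }
}
```

Decoded as UTF-8 the bytes give the single code point U+00A9; decoded as Latin-1 they give U+00C2 followed by U+00A9, which matches what the browser shows.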

Any ideas why this is happening and how I can get the page to display
correctly?

One thing I noticed (I don't know whether it is relevant, but it seems odd
to me): when Tomcat (using Jasper 2, as far as I know) compiles the JSP
page, the resulting x_jsp.java source file contains the following line:
   out.write("Copyright \303\202\302\251 2006");
Note the \303\202 preceding the \302\251.  As near as I can tell, the x.jsp
file does not contain the \303\202 -- it looks as though those are being
added in the compilation process.  (The x.jsp page has "Copyright \302\251
2006".  I might also add that it has page directives
'contentType="text/html; charset=utf-8"' and 'pageEncoding="utf-8"'.)
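
The extra \303\202 is consistent with double encoding: \303\202 is the UTF-8 form of A-circumflex (U+00C2), so the four bytes are what you get when the original two bytes are read as Latin-1 and the resulting characters are written back out as UTF-8. The sketch below reproduces just that byte arithmetic. The names (DoubleEncodingDemo, misreadThenEncode, toOctalEscapes) are hypothetical, and it only shows that such a round trip yields the observed bytes; it does not establish where in Jasper or the configuration the misreading actually happens.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Tomcat or Jasper.
class DoubleEncodingDemo {
    // Read UTF-8 bytes as if they were Latin-1, then encode the result as UTF-8.
    static byte[] misreadThenEncode(byte[] utf8Bytes) {
        String misread = new String(utf8Bytes, StandardCharsets.ISO_8859_1);
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    // Format bytes as backslash-octal escapes, the notation used in this thread.
    static String toOctalEscapes(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append('\\').append(Integer.toOctalString(b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The copyright sign, encoded correctly as UTF-8.
        byte[] source = "\u00A9".getBytes(StandardCharsets.UTF_8);

        System.out.println(toOctalEscapes(source));                    // \302\251
        System.out.println(toOctalEscapes(misreadThenEncode(source))); // \303\202\302\251
    }
}
```

If the generated servlet really does contain the four-byte sequence, the place to look is whatever step reads the .jsp source, since the file itself has the correct two bytes.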

Also, in the conf/web.xml, I have tried explicitly adding
        <init-param>
            <param-name>javaEncoding</param-name>
            <param-value>UTF8</param-value>
        </init-param>
to the
    <servlet>
        <servlet-name>jsp</servlet-name>
        <servlet-class>org.apache.jasper.servlet.JspServlet</servlet-class>
    </servlet>
block, with no apparent change in behavior.  (I also tried it with
param-value set to UTF-8, including the hyphen.)

I have also tried a number of other suggestions I have found in researching
this on the web, including those listed below, all to no avail ...
- added "-Djavax.servlet.request.encoding=UTF-8 -Dfile.encoding=UTF-8
-DjavaEncoding=UTF-8" to the JAVA_OPTS for Tomcat JVM startup
- added the following to conf/web.xml
    <context-param>
      <param-name>fileEncoding</param-name>
      <param-value>UTF-8</param-value>
    </context-param>
    <context-param>
      <param-name>contextDefaultEncoding</param-name>
      <param-value>UTF-8</param-value>
    </context-param>
- added a CharacterEncodingFilter I found on the web to explicitly set
      response.setContentType("text/html; charset=UTF-8");

Any suggestions will be greatly appreciated.  Thanks in advance.
--Jim

=================================
Jim Coble
Digital Projects Consultant
Perkins Library
Email: jim.coble@duke.edu
Voice: 919-660-5974  Fax: 919-668-2578
Box 90198, Duke University
Durham, NC 27708-0198
=================================



---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: Problem with UTF-8 characters in JSP page

Posted by Pulkit Singhal <pu...@gmail.com>.
Hi Jim,

The very first thing I would be tempted to try is the following:
"Copyright \ua9 2006"