Posted to dev@tomcat.apache.org by bu...@apache.org on 2004/07/03 13:19:44 UTC

DO NOT REPLY [Bug 29900] New: - request params in utf-8 corrupted

http://issues.apache.org/bugzilla/show_bug.cgi?id=29900

           Summary: request params in utf-8 corrupted
           Product: Tomcat 5
           Version: 5.0.25
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: Blocker
          Priority: Other
         Component: Unknown
        AssignedTo: tomcat-dev@jakarta.apache.org
        ReportedBy: ashert@huji.013.net.il


A request parameter sent in UTF-8 encoding arrives as if it had been sent in
another encoding (ISO-xxx, windows-xxx or whatever). This works fine with
Tomcat 4.0 but does not work on Tomcat 5.0.xx.

A JSP code example:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
 
<form action="/tests/utf.jsp" method=post>
<input type=text name=source >
<input type=submit>
</form>
<p>
 
<%
// Must be called before any parameter is read.
request.setCharacterEncoding("UTF-8");

String source = request.getParameter("source");
if (source != null)
{
  // Length of the parameter as received.
  out.println(source.length() + "<p>");

  // The parameter itself.
  out.println(source);

  // The parameter again, with '&' escaped so its character code shows literally.
  StringBuffer sb = new StringBuffer();
  for (int i = 0; i < source.length(); i++)
  {
    if (source.charAt(i) == '&')
      sb.append("&amp;");
    else
      sb.append(source.charAt(i));
  }
  out.println("<p>" + sb.toString());
}
%>
 
</body>
</html>

As you can see, this code block reads a UTF-8 encoded parameter from
the request and outputs its length, the parameter itself, and the
parameter again with each '&' escaped so that its HTML character code
is visible.
To test it I send the Hebrew letter ALEF. On Tomcat 4.xx everything
works perfectly and I get the following response:

7
א
&amp;#1488;

(In case it does not display here: the response is 7, then alef itself, then
alef's HTML character code escaped so that it is visible in the browser.)
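
For reference, the Tomcat 4 output above is consistent with the parameter
arriving as the seven-character string "&#1488;" (the HTML numeric reference
for alef, U+05D0) rather than as a single character. That is an inference from
the reported length of 7, not something the report states. A minimal standalone
sketch of the same escaping loop follows; the class name EscapeDemo, the method
escapeAmpersands and the assumed parameter value are made up for illustration.

```java
public class EscapeDemo {
    // Hypothetical helper mirroring the loop in the JSP:
    // every '&' is replaced by "&amp;", all other characters are copied.
    static String escapeAmpersands(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '&') {
                sb.append("&amp;");
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: the value the JSP saw on Tomcat 4.
        String source = "&#1488;";
        System.out.println(source.length());          // 7
        System.out.println(escapeAmpersands(source)); // &amp;#1488;
    }
}
```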

With Tomcat 5.0.xx I get:

1
?
?
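
One possible reading of this output, offered as an assumption and not confirmed
anywhere in this report: a length of 1 suggests the parameter was decoded to
the single character alef, and the "?" appeared only when the response was
written in a charset that cannot represent it, such as ISO-8859-1. A small
sketch of that effect in plain Java follows; the class name ResponseCharsetDemo
and the method roundTripLatin1 are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class ResponseCharsetDemo {
    // Encode to ISO-8859-1 and decode back, as a stand-in for writing
    // the text into a response that uses that charset.
    static String roundTripLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String alef = "\u05D0"; // Hebrew letter alef
        System.out.println(alef.length());         // 1
        System.out.println(roundTripLatin1(alef)); // ?
    }
}
```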
