Posted to dev@tomcat.apache.org by bu...@apache.org on 2003/08/22 20:09:11 UTC

DO NOT REPLY [Bug 22666] New: - Non-US-ASCII symbols entered into the form appear wrong in JSP

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=22666

           Summary: Non-US-ASCII symbols entered into the form appear wrong
                    in JSP
           Product: Tomcat 4
           Version: 4.0.6 Final
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: Major
          Priority: Other
         Component: Servlet & JSP API
        AssignedTo: tomcat-dev@jakarta.apache.org
        ReportedBy: davidovsv@yandex.ru


My HTML page uses the UTF-8 charset.
I have a simple form with one text input field.
When I post non-US-ASCII symbols into this text field, such as &#x00F6; (ö, 
Latin Small Letter O With Diaeresis), they arrive in the JSP parameters in a 
strange state.
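
As a minimal sketch of what is presumably happening on the wire: a browser
submitting a form from a UTF-8 page normally percent-encodes the field value
from its UTF-8 bytes. The class and method names below are hypothetical and
only illustrate how the same encoded value decodes under two different
charsets; this is not Tomcat's actual parameter-parsing code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of Tomcat or the reported application.
public class FormEncodingDemo {

    // Percent-encode a value the way a UTF-8 form submission is assumed to.
    static String encodeAsUtf8Form(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Decode a percent-encoded value using the given charset name.
    static String decodeWith(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = encodeAsUtf8Form("\u00F6");
        System.out.println(encoded);

        // Decoding with UTF-8 yields one character again.
        System.out.println(decodeWith(encoded, "UTF-8").length());

        // Decoding with ISO-8859-1 yields two characters (U+00C3, U+00B6).
        System.out.println(decodeWith(encoded, "ISO-8859-1").length());
    }
}
```

If the receiving side decodes with ISO-8859-1 instead of UTF-8, one entered
character turns into two wrong ones.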

I have the following code in the JSP:

request.setCharacterEncoding("UTF-8");
String [] texts = request.getParameterValues ("text");

As my form has only one text field, "text", I should get a String[1] array 
here. Instead I get a String[3] with strange contents.

I added the following code to inspect the data:

    String [] texts = request.getParameterValues("text");
    for (int i = 0; i < texts.length; i++) {
      String value = texts[i];
      byte [] bytes = value.getBytes("UTF-8");
      for (int j = 0; j < bytes.length; j++) {
        byte aByte = bytes[j];
        System.out.print(aByte + ", ");
      }
      System.out.println("");
    }

And I got the following output:
-61, -125, -62, -125, -61, -126, -62, -125, -61, -125, -62, -126, -61, -126, -62, -74, 

-61, -125, -62, -125, -61, -126, -62, -74, 

-61, -125, -62, -74, 

So each of the three Strings contains a wrong set of characters, different 
from &#x00F6;
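
For reference, the three byte dumps are consistent with the UTF-8 bytes of
&#x00F6; having been decoded as ISO-8859-1 one, two, and three times. The
sketch below reproduces that pattern outside of any servlet container. The
class and helper names are hypothetical, and it only demonstrates the byte
arithmetic; it does not show where in Tomcat or the application the repeated
decoding would occur.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class that reproduces the reported byte patterns.
public class DoubleEncodingDemo {

    // Simulate one wrong decode: take the UTF-8 bytes of the string
    // and interpret each byte as one ISO-8859-1 character.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Apply the wrong decode the given number of times.
    static String misdecode(String s, int rounds) {
        String result = s;
        for (int i = 0; i < rounds; i++) {
            result = misdecode(result);
        }
        return result;
    }

    // Format the UTF-8 bytes of a string like the dump in the report.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(b).append(", ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "\u00F6"; // Latin Small Letter O With Diaeresis
        // Print three rounds first, to match the order in the report.
        for (int rounds = 3; rounds >= 1; rounds--) {
            System.out.println(dump(misdecode(original, rounds)));
        }
    }
}
```

A correctly decoded value would dump as just "-61, -74, ", the two UTF-8
bytes of &#x00F6;.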

What's wrong with it?