Posted to dev@struts.apache.org by bu...@apache.org on 2002/11/09 04:35:34 UTC

DO NOT REPLY [Bug 14404] New: - MultipartIterator does not reflect ServletRequest#setCharacterEncoding

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=14404>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

           Summary: MultipartIterator does not reflect
                    ServletRequest#setCharacterEncoding
           Product: Struts
           Version: 1.0.2 Final
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Enhancement
          Priority: Other
         Component: File Upload
        AssignedTo: struts-dev@jakarta.apache.org
        ReportedBy: yass@netjoy.ne.jp


When MultipartIterator constructs strings from the form-data parts of 
user-submitted data, it ignores the character encoding set on the request. 
This causes problems with forms submitted in non-ISO-8859-1 encodings, 
especially double-byte encodings such as Shift_JIS and Big5.

The HTTP protocol itself does not provide a way to convey the character 
encoding in which the request is encoded. However, web browsers usually send 
a request in the character encoding in which the form was displayed. A 
developer who knows the encoding of the form page can therefore specify it by 
calling request#setCharacterEncoding, on the assumption that the browser sent 
the request in the same character encoding as the form page.

However, for a multipart request, the key-value pairs in the request are 
parsed by application code rather than by the servlet engine. In Struts's 
case, this is handled by the classes in the org.apache.struts.upload package.

One key class in the package, MultipartIterator, constructs the form strings 
using the ISO-8859-1 encoding regardless of the encoding set on the request 
object. Thus, when a double-byte character is received, it splits the 
character into two bytes and converts each byte into a separate Java char 
(16 bits), as if each byte were an ISO-8859-1 character. The resulting 
form-data strings are unreadable.
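
The effect can be reproduced outside Struts with plain JDK classes. The 
following is only a sketch of the behaviour described above, not the actual 
MultipartIterator code: the class name MultipartEncodingSketch and the helper 
decodeField are hypothetical, and the sketch assumes the JDK in use ships the 
Shift_JIS charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of org.apache.struts.upload.
class MultipartEncodingSketch {

    // Hypothetical helper: decode the raw bytes of a form-data part using the
    // encoding set on the request, falling back to ISO-8859-1 when none is set.
    public static String decodeField(byte[] raw, String requestEncoding) {
        Charset charset = (requestEncoding != null)
                ? Charset.forName(requestEncoding)
                : StandardCharsets.ISO_8859_1;
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Two kanji characters, each two bytes long in Shift_JIS.
        String original = "\u65E5\u672C";
        byte[] wire = original.getBytes(Charset.forName("Shift_JIS"));

        // Behaviour described in this report: every byte becomes its own char.
        String garbled = decodeField(wire, null);

        // Requested behaviour: honour the encoding set on the request.
        String decoded = decodeField(wire, "Shift_JIS");

        System.out.println("bytes on the wire:       " + wire.length);
        System.out.println("chars, ISO-8859-1 only:  " + garbled.length());
        System.out.println("chars, request encoding: " + decoded.length());
        System.out.println("round trip intact:       " + decoded.equals(original));
    }
}
```

In a real servlet, the second argument would presumably come from 
request.getCharacterEncoding(), which returns null when no encoding has been 
set; that wiring is an assumption and is not shown here.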
