Posted to user@struts.apache.org by Paul Barry <pa...@nyu.edu> on 2003/10/28 16:06:55 UTC

Problem with UTF-8 characters in a multipart/form-data encoded form

I am using Struts 1.1 in an application that needs to support the UTF-8 character set.  I am using Resin 2.1.10 with 
character-encoding="UTF-8", and on most of my forms this works fine.  I am having problems with forms that 
have to use the multipart/form-data enctype to handle file uploads.  If I print out the value of a text element in 
an html:form where the enctype is not set at all (which ends up using application/x-www-form-urlencoded), UTF-8 
characters work fine.  This is what I get:

INFO - test.TestAction - The value is: ä

Here is what the actual HTTP request that gets sent to the server looks like:

--- Start HTTP Request -----------------------------------------------------
POST /testForm.do HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Referer: http://pbdesktop/test.do
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
Host: pbdesktop
Content-Length: 11
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: SERVER=op; locale=en_US; JSESSIONID=aoUCARQpqsLd

test=%C3%AD
--- End HTTP Request ------------------------------------------------------

But if I modify my html:form to use enctype="multipart/form-data", I get this:

INFO - test.TestAction - The value is: A¤

And the HTTP request looks like this:

--- Start HTTP Request -----------------------------------------------------
POST /testForm.do HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Referer: http://pbdesktop/test.do
Accept-Language: en-us
Content-Type: multipart/form-data; boundary=---------------------------7d319628600e4
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
Host: pbdesktop
Content-Length: 141
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: SERVER=op; locale=en_US; JSESSIONID=aoUCARQpqsLd

-----------------------------7d319628600e4
Content-Disposition: form-data; name="test"

í
-----------------------------7d319628600e4--
--- End HTTP Request ------------------------------------------------------

It looks as if the character is already messed up before it even gets to the servlet container.  There are messages in 
the mailing list archive that discuss this problem, but I didn't see a solution.  What is the best way to handle UTF-8 
characters in a multipart/form-data encoded form?
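
The two-character log output is consistent with the UTF-8 bytes of a single character being decoded one byte at a time. A minimal sketch, assuming the submitted character is the one whose UTF-8 bytes are C3 AD (the same bytes as `%C3%AD` in the urlencoded capture above); the class and method names are made up for illustration:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Decode raw request bytes with the given charset.
    static String decode(byte[] wire, Charset charset) {
        return new String(wire, charset);
    }

    public static void main(String[] args) {
        // The two bytes sent on the wire for the form field.
        byte[] wire = {(byte) 0xC3, (byte) 0xAD};

        String asUtf8 = decode(wire, StandardCharsets.UTF_8);
        String asLatin1 = decode(wire, StandardCharsets.ISO_8859_1);

        // Decoded as UTF-8, the two bytes combine into one character (U+00ED).
        System.out.println(asUtf8.length() + " char, code point 0x"
                + Integer.toHexString(asUtf8.charAt(0)));

        // Decoded as ISO-8859-1, each byte becomes its own character.
        System.out.println(asLatin1.length() + " chars, code points 0x"
                + Integer.toHexString(asLatin1.charAt(0)) + " 0x"
                + Integer.toHexString(asLatin1.charAt(1)));
    }
}
```

If this is what is happening, the bytes in the request are fine and only the decoding step on the server needs the right charset.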

Here is the code that I am testing with:

/test/test.jsp:
<%@ taglib uri="WEB-INF/taglib/struts-html.tld" prefix="html" %>
<%@ taglib uri="WEB-INF/taglib/struts-bean.tld" prefix="bean" %>

<html:html>
   <body>
     <html:form action="testForm.do" enctype="multipart/form-data">
       <html:text property="test" />
       <html:submit />
     </html:form>
   </body>
</html:html>

Relevant parts of struts-config.xml:
<struts-config>

   <form-beans>
     <form-bean name="testForm" type="test.TestActionForm" />
   </form-beans>

   <action-mappings>
     <action path="/test" type="org.apache.struts.actions.ForwardAction" parameter="/test/test.jsp" />
     <action path="/testForm" type="test.TestAction" name="testForm" input="/test.do" scope="request" />
   </action-mappings>

   <controller contentType="text/html;charset=UTF-8" />

</struts-config>

test.TestAction:
package test;

import javax.servlet.http.*;
import org.apache.commons.logging.*;
import org.apache.struts.action.*;

public class TestAction extends Action {
	private static final Log log = LogFactory.getLog(TestAction.class);
	
	public ActionForward execute(
			ActionMapping mapping,
			ActionForm pform,
			HttpServletRequest request,
			HttpServletResponse response)
			throws Exception {
		TestActionForm form = (TestActionForm)pform;
		log.info("The value is: "+form.getTest());
		return null;
	}
}

test.TestActionForm:
package test;

import org.apache.struts.action.ActionForm;

public class TestActionForm extends ActionForm {
	private String test;
	public String getTest() { return test; 	}
	public void setTest(String string) { test = string; }
}


---------------------------------------------------------------------
To unsubscribe, e-mail: struts-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: struts-user-help@jakarta.apache.org


Re: Problem with UTF-8 characters in a multipart/form-data encoded form

Posted by Jason Lea <ja...@kumachan.net.nz>.
Paul Barry wrote:

>By using a <meta> element, do you mean this:
>
><meta http-equiv="Content-Type" content="text/html; charset=utf-8">
>
>That doesn't seem to work when the form is multipart/form-data, because the Content-Type header still just has 
>multipart/form-data.  The problem seems to be that when I call request.getCharacterEncoding(), I get null.  Is that 
>normal?  I would think I should at least get the default character encoding for the webapp. I am using Resin 2.1.10. 
>This might be an issue for me to report to them.
>
>This definitely is what is causing my problem, because if I look at the code in 
>org.apache.struts.upload.CommonsMultipartRequestHandler.addTextParameter(), this is the first thing it does:
>
>         try {
>             value = item.getString(request.getCharacterEncoding());
>         } catch (Exception e) {
>             value = item.getString();
>         }
>
>Since request.getCharacterEncoding() is null, I assume an Exception is being thrown and caught (a log.warn() might be nice 
>there) and then I am getting the string without decoding it from UTF-8.
>
>If I manually set the characterEncoding to UTF-8 before this code executes (in processMultipart() in the 
>requestProcessor for example), then everything works fine.
>
>So I guess my question is: should I be expecting request.getCharacterEncoding() to return null, or is there a bug in my 
>app server?
>
Nope, it is not a bug. The browser hasn't set the encoding, so the 
method returns null to indicate this.

Here is the relevant section from Java™ Servlet Specification Version 
2.3 (servlet-2_3-fcs-spec.pdf)

"SRV.4.9 Request data encoding
Currently, many browsers do not send a char encoding qualifier with the 
Content-Type header, leaving open the determination of the character 
encoding for reading HTTP requests. The default encoding of a request 
the container uses to create the request reader and parse POST data must 
be “ISO-8859-1”, if none has been specified by the client request. 
However, in order to indicate to the developer in this case the failure 
of the client to send a character encoding, the container returns null 
from the getCharacterEncoding method.

If the client hasn’t set character encoding and the request data is 
encoded with a different encoding than the default as described above, 
breakage can occur. To remedy this situation, a new method 
setCharacterEncoding(String enc) has been added to the ServletRequest 
interface. Developers can override the character encoding supplied by 
the container by calling this method. It must be called prior to parsing 
any post data or reading any input from the request. Calling this method 
once data has been read will not affect the encoding."
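
To make the failure mode concrete, here is a minimal sketch of the silent fallback in the try/catch quoted above, and of what changes once an encoding is supplied. `FakeItem` is a hypothetical stand-in for a multipart form field, not the real Commons FileUpload class, and it assumes the no-argument `getString()` decodes as ISO-8859-1. The only real API relied on is the JDK's `new String(byte[], String)`, which throws a NullPointerException when the charset name is null.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class EncodingFallbackDemo {
    // Hypothetical stand-in for a multipart form field holding raw bytes.
    static class FakeItem {
        private final byte[] raw;

        FakeItem(byte[] raw) {
            this.raw = raw;
        }

        // Assumption: with no encoding given, decode as ISO-8859-1.
        String getString() {
            return new String(raw, StandardCharsets.ISO_8859_1);
        }

        // Throws NullPointerException if encoding is null.
        String getString(String encoding) throws UnsupportedEncodingException {
            return new String(raw, encoding);
        }
    }

    // Same shape as the quoted try/catch: any exception falls back silently.
    static String readText(FakeItem item, String requestEncoding) {
        try {
            return item.getString(requestEncoding);
        } catch (Exception e) {
            return item.getString();
        }
    }

    public static void main(String[] args) {
        FakeItem item = new FakeItem(new byte[] {(byte) 0xC3, (byte) 0xAD});

        // Encoding unknown, as when getCharacterEncoding() returns null.
        System.out.println(readText(item, null).length());

        // Encoding supplied before the field is read.
        System.out.println(readText(item, "UTF-8").length());
    }
}
```

In a real application the second case corresponds to calling request.setCharacterEncoding("UTF-8") before anything reads the request body, which is what the spec text above requires.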




-- 
Jason Lea






Re: Problem with UTF-8 characters in a multipart/form-data encoded form

Posted by Paul Barry <pa...@nyu.edu>.
By using a <meta> element, do you mean this:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

That doesn't seem to work when the form is multipart/form-data, because the Content-Type header still just has 
multipart/form-data.  The problem seems to be that when I call request.getCharacterEncoding(), I get null.  Is that 
normal?  I would think I should at least get the default character encoding for the webapp. I am using Resin 2.1.10. 
This might be an issue for me to report to them.

This definitely is what is causing my problem, because if I look at the code in 
org.apache.struts.upload.CommonsMultipartRequestHandler.addTextParameter(), this is the first thing it does:

         try {
             value = item.getString(request.getCharacterEncoding());
         } catch (Exception e) {
             value = item.getString();
         }

Since request.getCharacterEncoding() is null, I assume an Exception is being thrown and caught (a log.warn() might be nice 
there) and then I am getting the string without decoding it from UTF-8.

If I manually set the characterEncoding to UTF-8 before this code executes (in processMultipart() in the 
requestProcessor for example), then everything works fine.

So I guess my question is: should I be expecting request.getCharacterEncoding() to return null, or is there a bug in my 
app server?





Martin Cooper wrote:

> In Struts 1.1, the default file upload mechanism *is* Commons FileUpload.
> ;-)
> 
> It seems that you may have omitted to tell the browser explicitly that your
> pages are in UTF-8. For some reason that I've never fully understood, that
> causes the browser to use UTF-8 when it submits subsequent requests from
> that page. Make sure that you use a <meta> element in your <head> to specify
> UTF-8.
> 
> --
> Martin Cooper
> 


Re: Problem with UTF-8 characters in a multipart/form-data encoded form

Posted by Martin Cooper <ma...@apache.org>.
In Struts 1.1, the default file upload mechanism *is* Commons FileUpload.
;-)

It seems that you may have omitted to tell the browser explicitly that your
pages are in UTF-8. For some reason that I've never fully understood, that
causes the browser to use UTF-8 when it submits subsequent requests from
that page. Make sure that you use a <meta> element in your <head> to specify
UTF-8.

--
Martin Cooper


"Paul Barry" <pa...@nyu.edu> wrote in message
news:3F9FD8C9.30301@nyu.edu...
> I think you are correct.  When I was looking at the packets and seeing two
> characters, it is actually the characters that are equal to the 2 bytes that
> make up the single UTF-8 character.  I thought the browser was somehow not
> correctly encoding my data, because it was turning 1 character into 2
> characters, but actually it is UTF-8 encoding my character correctly.  So I
> think if I use something to read the data and convert it from UTF-8 to
> Unicode, I will get the correct data on the server.
>
> So from reading the documentation about FileUpload, that seems to be the way
> to go, but now my question is how to integrate FileUpload with struts?  My
> thought would be to call a method to populate an ActionForm in the beginning
> of my action, and then use that ActionForm instead of the one I get from the
> requestProcessor.  So like this:
>
>       public ActionForward execute(
>               ActionMapping mapping,
>               ActionForm pform,
>               HttpServletRequest request,
>               HttpServletResponse response)
>               throws Exception {
>           TestActionForm form = getFormUsingFileUpload(request);
>           log.info("The value is: "+form.getTest());
>           return null;
>      }
>
> Is this how others have used Jakarta Commons FileUpload with Struts, or is
> there a better way?
>
>
> Jason Lea wrote:
>
> > From what I can see there, Resin is expecting UTF-8 for any parameters
> > passed to it, and decoding it correctly.  However multipart/form-data is
> > treated differently as the data is not passed as normal parameters so
> > the request.getParameter() cannot be used here (and servlet filters that
> > set the request encoding won't help either).
> >
> > You normally have to use something like the FileUpload component to
> > extract form fields and files from the request.  This component is not
> > going to know about the character encoding you have given to resin, so
> > it will use the default which is probably US-ASCII.  With UTF-8 a single
> > character can be rendered as 1, 2 or 3 bytes.  When decoding a UTF-8
> > string the decoder will combine the 1,2 or 3 byte combinations into 1
> > Unicode character.  When UTF-8 is not used to decode the string you will
> > see the individual bytes.
> >
> > Looking here (the jakarta apache FileUpload component):
> >
> > http://jakarta.apache.org/commons/fileupload/apidocs/org/apache/commons/fileupload/FileUploadBase.html
> >
> >
> > They have a setHeaderEncoding() method which I assume will deal with
> > this problem (I haven't tested this so I don't know).  Are you using a
> > file upload component?
> >
> >
> > Paul Barry wrote:
> >
> >> I am using Struts 1.1 in an application that needs to support the
> >> UTF-8 character set.  I am using Resin 2.1.10 with
> >> character-encoding="UTF-8", and on most of my forms this seems to work
> >> just fine.  I am having problems with forms that have to use the
> >> multipart/form-data enctype for handling uploading files.  If I print
> >> out the value of a text element in an html:form where the enctype is
> >> not set at all (which ends up using
> >> application/x-www-form-urlencoded), using UTF-8 characters works
> >> fine.  This is what I get:
> >>
> >> INFO - test.TestAction - The value is: ä
> >>
> >> Here is what the actual HTTP request that gets sent to the server
> >> looks like:
> >>
> >> --- Start HTTP Request
> >> -----------------------------------------------------
> >> POST /testForm.do HTTP/1.1
> >> Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
> >> application/x-shockwave-flash, */*
> >> Referer: http://pbdesktop/test.do
> >> Accept-Language: en-us
> >> Content-Type: application/x-www-form-urlencoded
> >> Accept-Encoding: gzip, deflate
> >> User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
> >> Host: pbdesktop
> >> Content-Length: 11
> >> Connection: Keep-Alive
> >> Cache-Control: no-cache
> >> Cookie: SERVER=op; locale=en_US; JSESSIONID=aoUCARQpqsLd
> >>
> >> test=%C3%AD
> >> --- End HTTP Request
> >> ------------------------------------------------------
> >>
> >> But if I modify my html:form to use enctype="multipart/form-data", I
> >> get this:
> >>
> >> INFO - test.TestAction - The value is: A¤
> >>
> >> And the HTTP request looks like this:
> >>
> >> --- Start HTTP Request
> >> -----------------------------------------------------
> >> POST /testForm.do HTTP/1.1
> >> Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
> >> application/x-shockwave-flash, */*
> >> Referer: http://pbdesktop/test.do
> >> Accept-Language: en-us
> >> Content-Type: multipart/form-data;
> >> boundary=---------------------------7d319628600e4
> >> Accept-Encoding: gzip, deflate
> >> User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
> >> Host: pbdesktop
> >> Content-Length: 141
> >> Connection: Keep-Alive
> >> Cache-Control: no-cache
> >> Cookie: SERVER=op; locale=en_US; JSESSIONID=aoUCARQpqsLd
> >>
> >> -----------------------------7d319628600e4
> >> Content-Disposition: form-data; name="test"
> >>
> >> í
> >> -----------------------------7d319628600e4-
> >> --- End HTTP Request
> >> ------------------------------------------------------
> >>
> >> It looks as if the character is already messed up before it even gets
> >> to the servlet container.  There are messages in the mailing list
> >> archive that discuss this problem, but I didn't see a solution.  What
> >> is the best way to handle UTF-8 characters in a multipart/form-data
> >> encoded form?
> >>
> >> Here is the code that I am testing with:
> >>
> >> /test/test.jsp:
> >> <%@ taglib uri="WEB-INF/taglib/struts-html.tld" prefix="html" %>
> >> <%@ taglib uri="WEB-INF/taglib/struts-bean.tld" prefix="bean" %>
> >>
> >> <html:html>
> >>   <body>
> >>     <html:form action="testForm.do" enctype="multipart/form-data">
> >>       <html:text property="test" />
> >>       <html:submit />
> >>     </html:form>
> >>   </body>
> >> </html:html>
> >>
> >> Relavent parts of struts-config.xml:
> >> <struts-config>
> >>
> >>   <form-beans>
> >>     <form-bean name="testForm" type="test.TestActionForm" />
> >>   </form-beans>
> >>
> >>   <action-mappings>
> >>     <action path="/test"
> >> type="org.apache.struts.actions.ForwardAction"
> >> parameter="/test/test.jsp" />
> >>     <action path="/testForm" type="test.TestAction" name="testForm"
> >> input="/test.do" scope="request" />
> >>   </action-mappings>
> >>
> >>   <controller contentType="text/html;charset=UTF-8" />
> >>
> >> <struts-config/>
> >>
> >> test.TestAction:
> >> package test;
> >>
> >> import javax.servlet.http.*;
> >> import org.apache.commons.logging.*;
> >> import org.apache.struts.action.*;
> >>
> >> public class TestAction extends Action {
> >>     private static final Log log = LogFactory.getLog(TestAction.class);
> >>
> >>     public ActionForward execute(
> >>             ActionMapping mapping,
> >>             ActionForm pform,
> >>             HttpServletRequest request,
> >>             HttpServletResponse response)
> >>             throws Exception {
> >>         TestActionForm form = (TestActionForm)pform;
> >>         log.info("The value is: "+form.getTest());
> >>         return null;
> >>     }
> >> }
> >>
> >> test.TestActionForm:
> >> package test;
> >>
> >> import org.apache.struts.action.ActionForm;
> >>
> >> public class TestActionForm extends ActionForm {
> >>     private String test;
> >>     public String getTest() { return test;     }
> >>     public void setTest(String string) { test = string; }
> >> }
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: struts-user-unsubscribe@jakarta.apache.org
> >> For additional commands, e-mail: struts-user-help@jakarta.apache.org
> >>
> >>
> >>
> >>
> >
> >




---------------------------------------------------------------------
To unsubscribe, e-mail: struts-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: struts-user-help@jakarta.apache.org


Re: Problem with UTF-8 characters in a mutlipart/form-data encoded form

Posted by Paul Barry <pa...@nyu.edu>.
I think you are correct.  When I looked at the packets and saw two characters, I was actually seeing the 2 bytes that 
make up the single UTF-8 encoded character, each displayed as a character of its own.  I thought the browser was somehow 
not encoding my data correctly, because it was turning 1 character into 2 characters, but it is in fact UTF-8 encoding my 
character correctly.  So I think if I use something that reads the data and decodes it as UTF-8, I will get the correct 
data on the server.
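
To illustrate the point above, here is a minimal sketch, using only the JDK, of what happens when the UTF-8 bytes of á 
are decoded with a single-byte charset, and how decoding the same bytes as UTF-8 recovers the character.  The class and 
method names are made up for the example, and ISO-8859-1 is only an assumption about what the server-side parser falls 
back to.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    /** Decodes the UTF-8 bytes of the text with the wrong charset, as a parser unaware of UTF-8 would. */
    public static String misdecode(String original) {
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    /** Undoes the ISO-8859-1 step to get the raw bytes back, then decodes them as UTF-8. */
    public static String repair(String garbled) {
        byte[] wire = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e1";            // a with acute accent, one character
        String garbled = misdecode(original);  // two characters: U+00C3 followed by U+00A1
        System.out.println(garbled.length());                 // 2
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

The repair step only works because ISO-8859-1 maps every byte value to a character, so no information is lost.  It is a 
workaround, not a substitute for telling the multipart parser which encoding to use.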

So from reading the documentation about FileUpload, that seems to be the way to go, but now my question is how to 
integrate FileUpload with Struts.  My thought would be to call a method that populates an ActionForm at the beginning of 
my action, and then use that ActionForm instead of the one I get from the RequestProcessor.  Something like this:

      public ActionForward execute(
              ActionMapping mapping,
              ActionForm pform,
              HttpServletRequest request,
              HttpServletResponse response)
              throws Exception {
          // getFormUsingFileUpload() is a helper I would still have to write
          TestActionForm form = getFormUsingFileUpload(request);
          log.info("The value is: " + form.getTest());
          return null;
      }

Is this how others have used Jakarta Commons FileUpload with Struts, or is there a better way?


Jason Lea wrote:

>  From what I can see there, Resin is expecting UTF-8 for any parameters
> passed to it, and decoding them correctly.  However, multipart/form-data
> is treated differently: the data is not passed as normal parameters, so
> request.getParameter() cannot be used here (and servlet filters that
> set the request encoding won't help either).
>
> You normally have to use something like the FileUpload component to
> extract form fields and files from the request.  This component is not
> going to know about the character encoding you have given to Resin, so
> it will use the default, which is probably US-ASCII.  With UTF-8 a
> single character is encoded as 1 to 4 bytes.  When decoding a UTF-8
> string, the decoder combines each multi-byte sequence into 1 Unicode
> character.  When UTF-8 is not used to decode the string, you will see
> the individual bytes as separate characters.
> 
> Looking here (the jakarta apache FileUpload component):
> http://jakarta.apache.org/commons/fileupload/apidocs/org/apache/commons/fileupload/FileUploadBase.html 
> 
> 
> They have a setHeaderEncoding() method which I assume will deal with 
> this problem (I haven't tested this so I don't know).  Are you using a 
> file upload component?
> 
> 

Re: Problem with UTF-8 characters in a mutlipart/form-data encoded form

Posted by Jason Lea <ja...@kumachan.net.nz>.
 From what I can see there, Resin is expecting UTF-8 for any parameters
passed to it, and decoding them correctly.  However, multipart/form-data
is treated differently: the data is not passed as normal parameters, so
request.getParameter() cannot be used here (and servlet filters that
set the request encoding won't help either).

You normally have to use something like the FileUpload component to
extract form fields and files from the request.  This component is not
going to know about the character encoding you have given to Resin, so
it will use the default, which is probably US-ASCII.  With UTF-8 a
single character is encoded as 1 to 4 bytes.  When decoding a UTF-8
string, the decoder combines each multi-byte sequence into 1 Unicode
character.  When UTF-8 is not used to decode the string, you will see
the individual bytes as separate characters.
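
As a rough illustration of the variable-length encoding described above, the sketch below counts the bytes UTF-8
uses for a few sample characters.  It uses only the JDK; the class and method names are made up for the example.
Characters outside the Basic Multilingual Plane take four bytes.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Lengths {

    /** Returns the number of bytes UTF-8 needs to encode the given text. */
    public static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("a"));            // 1 (U+0061)
        System.out.println(utf8Length("\u00ed"));       // 2 (U+00ED, i with acute accent)
        System.out.println(utf8Length("\u20ac"));       // 3 (U+20AC, euro sign)
        System.out.println(utf8Length("\ud83d\ude00")); // 4 (U+1F600, a surrogate pair in Java)
    }
}
```

The two-byte case is the one in this thread: the í in the test form is sent as the bytes C3 AD.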

Looking here (the jakarta apache FileUpload component):
http://jakarta.apache.org/commons/fileupload/apidocs/org/apache/commons/fileupload/FileUploadBase.html

They have a setHeaderEncoding() method which I assume will deal with 
this problem (I haven't tested this so I don't know).  Are you using a 
file upload component?



-- 
Jason Lea






Re: Problem with UTF-8 characters in a mutlipart/form-data encoded form

Posted by Paul Barry <pa...@nyu.edu>.
I think the problem is that the browser is not encoding the text field in UTF-8.  Supposedly, setting the 
accept-charset attribute of the HTML form tag to UTF-8 will make it encode the field in UTF-8.  Unfortunately, there is 
no accept-charset property on the html:form Struts tag, although there is a request for one to be added:

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=21986

And on top of that, accept-charset doesn't seem to work for me anyway when I try it outside of the html:form tag. 
For example, if I make a plain HTML form like this:

       <form action="testForm.do" enctype="multipart/form-data" method="post" accept-charset="UTF-8">
         <input type="text" name="test" />
         <input type="submit" />
       </form>

If I use Ethereal to capture the HTTP request as it is received by the server (before it ever gets to the actual servlet 
container), the characters don't show up correctly.  For example, the single character á turns into the two characters Ã¡.

But, interestingly enough, if I have View > Encoding set to Auto-Select in IE, it does encode the data in UTF-8, and 
when it gets to my Struts action, I correctly have an á.  But if I uncheck View > Encoding > Auto-Select, even though 
View > Encoding > Unicode (UTF-8) just below it is selected, the data doesn't get encoded correctly and I end up 
with Ã¡.  So it sounds like this isn't really a Struts problem and more of a browser problem, but how do I get the 
browser to encode the data in UTF-8?  Am I doing something wrong with accept-charset?
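
One way to tell from a packet capture which encoding the browser actually used is to compare the raw bytes of the field 
with what each candidate charset would produce.  The sketch below is only an illustration using the JDK; the class and 
method names are made up, and ISO-8859-1 stands in for whatever single-byte charset the browser might otherwise pick.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WireBytes {

    /** Returns a hex dump of the bytes that represent the text in the given charset. */
    public static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u00e1"; // a with acute accent
        System.out.println(hex(text, StandardCharsets.UTF_8));      // C3 A1
        System.out.println(hex(text, StandardCharsets.ISO_8859_1)); // E1
    }
}
```

If the capture shows the two bytes C3 A1 for the field, the browser sent UTF-8; a viewer that renders each byte as a 
Latin-1 character will display those same two bytes as two characters.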

José Gustavo Zagato wrote:

> I have my doubts about it too...
> The only thing I'm doing at the front end is setting the encoding to
> UTF-8.  I will double-check it.
> 
> Regards...
> 
>   José Gustavo Zagato Rosa
> System Analyst - Atos Origin
> jose.gustavo@emacdigital.com.br
> 
> 

RE: Problem with UTF-8 characters in a mutlipart/form-data encoded form

Posted by José Gustavo Zagato <jo...@emacdigital.com.br>.
I have my doubts about it too...
The only thing I'm doing at the front end is setting the encoding to
UTF-8.  I will double-check it.

Regards...

  José Gustavo Zagato Rosa
System Analyst - Atos Origin
jose.gustavo@emacdigital.com.br


-----Original Message-----
From: Paul Barry [mailto:paul.barry@nyu.edu] 
Sent: Tuesday, October 28, 2003 12:26
To: Struts Users Mailing List
Subject: Re: Problem with UTF-8 characters in a mutlipart/form-data
encoded form

Does it work with multipart/form-data encoding?  It seems to me that
this problem is happening before the form is 
submitted to the servlet container (take a look at the value of "test"
in the HTTP request with Content-Type: 
multipart/form-data in my original post), so the servlet filter wouldn't
help, but I could be wrong.

José Gustavo Zagato wrote:

> Hi !
> 
> 	I don't if it will fit into your needs but, to handler UTF-8 I
> build a serverlet filter with handles all encode / Decode operations.
As
> far as I know this approach is not a pure Struts solution but works
> really fine !
> I didn't test with a upload form like yours, but it’s a shot !
> 
> Regards
> 
>   José Gustavo Zagato Rosa
> System Analyst - Atos Origin
> jose.gustavo@emacdigital.com.br
> 
> 
> -----Original Message-----
> From: Paul Barry [mailto:paul.barry@nyu.edu] 
> Sent: terça-feira, 28 de outubro de 2003 12:07
> To: struts-user@jakarta.apache.org
> Subject: Problem with UTF-8 characters in a mutlipart/form-data
encoded
> form
> 
> I am using Struts 1.1 in an application that needs to support the
UTF-8
> character set.  I am using Resin 2.1.10 with 
> character-encoding="UTF-8", and on most of my forms this seems to work
> just fine.  I am having problems with forms that 
> have to use the multipart/form-data enctype for handling uploading
> files.  If I print out the value of a text element in 
> an html:form where the enctype is not set at all (which ends up using
> application/x-www-form-urlencoded), using UTF-8 
> characters works fine.  This is what I get:
> 
> INFO - test.TestAction - The value is: ä
> 
> Here is what the actual HTTP request that gets sent to the server
looks
> like:
> 
> --- Start HTTP Request
> -----------------------------------------------------
> POST /testForm.do HTTP/1.1
> Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
> application/x-shockwave-flash, */*
> Referer: http://pbdesktop/test.do
> Accept-Language: en-us
> Content-Type: application/x-www-form-urlencoded
> Accept-Encoding: gzip, deflate
> User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
> Host: pbdesktop
> Content-Length: 11
> Connection: Keep-Alive
> Cache-Control: no-cache
> Cookie: SERVER=op; locale=en_US; JSESSIONID=aoUCARQpqsLd
> 
> test=%C3%AD
> --- End HTTP Request
> ------------------------------------------------------
> 
> But if I modify my html:form to use enctype="multipart/form-data", I
get
> this:
> 
> INFO - test.TestAction - The value is: A¤
> 
> And the HTTP request looks like this:
> 
> --- Start HTTP Request
> -----------------------------------------------------
> POST /testForm.do HTTP/1.1
> Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
> application/x-shockwave-flash, */*
> Referer: http://pbdesktop/test.do
> Accept-Language: en-us
> Content-Type: multipart/form-data;
> boundary=---------------------------7d319628600e4
> Accept-Encoding: gzip, deflate
> User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
> Host: pbdesktop
> Content-Length: 141
> Connection: Keep-Alive
> Cache-Control: no-cache
> Cookie: SERVER=op; locale=en_US; JSESSIONID=aoUCARQpqsLd
> 
> -----------------------------7d319628600e4
> Content-Disposition: form-data; name="test"
> 
> í
> -----------------------------7d319628600e4-
> --- End HTTP Request
> ------------------------------------------------------
> 
> It looks as if the character is already messed up before it even gets
to
> the servlet container.  There are messages in 
> the mailing list archive that discuss this problem, but I didn't see a
> solution.  What is the best way to handle UTF-8 
> characters in a multipart/form-data encoded form?
> 
> Here is the code that I am testing with:
> 
> /test/test.jsp:
> <%@ taglib uri="WEB-INF/taglib/struts-html.tld" prefix="html" %>
> <%@ taglib uri="WEB-INF/taglib/struts-bean.tld" prefix="bean" %>
> 
> <html:html>
>    <body>
>      <html:form action="testForm.do" enctype="multipart/form-data">
>        <html:text property="test" />
>        <html:submit />
>      </html:form>
>    </body>
> </html:html>
> 
> Relavent parts of struts-config.xml:
> <struts-config>
> 
>    <form-beans>
>      <form-bean name="testForm" type="test.TestActionForm" />
>    </form-beans>
> 
>    <action-mappings>
>      <action path="/test"
type="org.apache.struts.actions.ForwardAction"
> parameter="/test/test.jsp" />
>      <action path="/testForm" type="test.TestAction" name="testForm"
> input="/test.do" scope="request" />
>    </action-mappings>
> 
>    <controller contentType="text/html;charset=UTF-8" />
> 
> <struts-config/>
> 
> test.TestAction:
> package test;
> 
> import javax.servlet.http.*;
> import org.apache.commons.logging.*;
> import org.apache.struts.action.*;
> 
> public class TestAction extends Action {
> 	private static final Log log =
> LogFactory.getLog(TestAction.class);
> 	
> 	public ActionForward execute(
> 			ActionMapping mapping,
> 			ActionForm pform,
> 			HttpServletRequest request,
> 			HttpServletResponse response)
> 			throws Exception {
> 		TestActionForm form = (TestActionForm)pform;
> 		log.info("The value is: "+form.getTest());
> 		return null;
> 	}
> }
> 
> test.TestActionForm:
> package test;
> 
> import org.apache.struts.action.ActionForm;
> 
> public class TestActionForm extends ActionForm {
> 	private String test;
> 	public String getTest() { return test; 	}
> 	public void setTest(String string) { test = string; }
> }
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: struts-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: struts-user-help@jakarta.apache.org

Re: Problem with UTF-8 characters in a mutlipart/form-data encoded form

Posted by Paul Barry <pa...@nyu.edu>.
Does it work with multipart/form-data encoding?  It seems to me that this problem is happening before the form is 
submitted to the servlet container (take a look at the value of "test" in the HTTP request with Content-Type: 
multipart/form-data in my original post), so the servlet filter wouldn't help, but I could be wrong.
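
One way to read the request dump in my original post: the two characters shown as the value of "test" are what the UTF-8 bytes C3 AD look like when a viewer renders them as ISO-8859-1. Those are the same two bytes the urlencoded request carries as %C3%AD, so the browser may have sent valid UTF-8 after all, and the damage may happen when the part is decoded on the server. The following is only a sketch of that reading, not something confirmed in this thread; the class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MultipartCharsetDemo {

    // Decode raw body bytes with the given charset, as a server would
    // when turning a form part into a String.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    // Decode a percent-encoded value as UTF-8 (the urlencoded case).
    static String urlDecodeUtf8(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Repair a String that holds UTF-8 bytes which were decoded as
    // ISO-8859-1: get the original bytes back, then decode as UTF-8.
    static String repair(String misdecoded) {
        byte[] original = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The two bytes both requests carry for the field value.
        byte[] wire = {(byte) 0xC3, (byte) 0xAD};

        String asLatin1 = decode(wire, StandardCharsets.ISO_8859_1);
        String asUtf8 = decode(wire, StandardCharsets.UTF_8);

        // Two characters when misread as ISO-8859-1, one as UTF-8.
        System.out.println("ISO-8859-1 length: " + asLatin1.length());
        System.out.println("UTF-8 length: " + asUtf8.length());

        // The urlencoded form yields the same single character.
        System.out.println("same as urlencoded: "
                + asUtf8.equals(urlDecodeUtf8("%C3%AD")));

        // Re-decoding the misread value recovers the character.
        System.out.println("repaired: " + repair(asLatin1).equals(asUtf8));
    }
}
```

If this reading is right, the repair() step is a workaround that could be applied to the form value in the action, while the cleaner fix would be to get the multipart parser to decode the part as UTF-8 in the first place.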

José Gustavo Zagato wrote:

> Hi !
> 
> 	I don't know if it will fit your needs, but to handle UTF-8 I
> built a servlet filter which handles all encode / decode operations. As
> far as I know this approach is not a pure Struts solution, but it works
> really well!
> I didn't test it with an upload form like yours, but it's worth a shot!
> 
> Regards
> 
>   José Gustavo Zagato Rosa
> System Analyst - Atos Origin
> jose.gustavo@emacdigital.com.br

RE: Problem with UTF-8 characters in a mutlipart/form-data encoded form

Posted by javen fang <fa...@yahoo.com.cn>.
It's true. Although a servlet filter is not a pure Struts
method, it is widely used to solve character-encoding
problems in Struts applications.


--- José Gustavo Zagato
<jo...@emacdigital.com.br> wrote:
> Hi !
> 
> 	I don't know if it will fit your needs, but to handle UTF-8 I
> built a servlet filter which handles all encode / decode operations. As
> far as I know this approach is not a pure Struts solution, but it works
> really well!
> I didn't test it with an upload form like yours, but it's worth a shot!
> 
> Regards
> 
>   José Gustavo Zagato Rosa
> System Analyst - Atos Origin
> jose.gustavo@emacdigital.com.br

RE: Problem with UTF-8 characters in a mutlipart/form-data encoded form

Posted by José Gustavo Zagato <jo...@emacdigital.com.br>.
Hi !

	I don't know if it will fit your needs, but to handle UTF-8 I
built a servlet filter which handles all encode / decode operations. As
far as I know this approach is not a pure Struts solution, but it works
really well!
I didn't test it with an upload form like yours, but it's worth a shot!
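
To give an idea of what such a filter decides: before anything reads the parameters, it checks whether the request declares a charset and, if not, forces one, which in a real filter means calling request.setCharacterEncoding(...) from the servlet API. The sketch below models only that decision with plain Java so it runs without a container; the class and method names are hypothetical, and whether the Struts multipart handler honors the request encoding is not confirmed here.

```java
import java.util.Locale;

public class RequestCharset {

    // Return the charset parameter of a Content-Type header value,
    // or the fallback when the header declares none.
    static String charsetOf(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding quotes.
                if (value.length() >= 2
                        && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? fallback : value;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // The multipart request from the original post declares no charset,
        // so the fallback is what the filter would set.
        String multipart = "multipart/form-data; "
                + "boundary=---------------------------7d319628600e4";
        System.out.println(charsetOf(multipart, "UTF-8"));

        // A header that does declare a charset keeps its own value.
        System.out.println(charsetOf("text/html;charset=UTF-8", "ISO-8859-1"));
    }
}
```

Both calls in main print UTF-8: the first because the fallback applies, the second because the header says so.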

Regards

  José Gustavo Zagato Rosa
System Analyst - Atos Origin
jose.gustavo@emacdigital.com.br

