Posted to user@commons.apache.org by Ylva Degerfeldt <yl...@gmail.com> on 2008/04/29 22:22:45 UTC

Reading unicode characters from a form in multipart/form-data format

Hi everyone!

I joined this list in the hope that someone can help me with this problem:

Is there a way to use UTF-8 when reading from form fields with
multipart/form-data format? (I'm using Apache commons
ServletFileUpload so that's why I'm asking on this mailing list.)

I've searched the web for answers and though this seems to be a well
known problem, I haven't found any solution to it.

I have to have a multipart/form-data form because I want the users to
be able to save their names and birthdate on the same jsp page as
where they can upload a (cv) file, so I have to find a solution.

My jsp page has the meta tag attribute charset="UTF-8", the form has
the attribute accept-charset="UTF-8" and in the servlet which receives
this request I do a:

request.setCharacterEncoding("UTF-8");

but none of those things seem to help.

I also tried using the setHeaderEncoding method of the
ServletFileUpload class but that didn't help either. I still receive
Swedish characters as "weird looking characters" and that's not what I
want.

If anyone has a solution to this problem I would be Very grateful!
(Even if the solution means that I have to convert the strings I
receive in some way.)

Thanks in advance,

Ylva

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org


Re: Reading unicode characters from a form in multipart/form-data format

Posted by Ylva Degerfeldt <yl...@gmail.com>.
Thank you, Jason and Christian!

Jason, I tried changing "UTF-8" to "utf-8" in all the places where I'm
using it but that didn't work. But thanks anyway!

Thank you, Christian for showing me a solution to the problem!
The getBytes thing was the important part, but I had to solve it like this:
String newName = new String(oldName.getBytes("ISO-8859-1"), "utf-8");

Anyway, now it Works and that's GREAT!!
Thanks!!

/Ylva



Re: Reading unicode characters from a form in multipart/form-data format

Posted by Christian Andersson <ca...@ofs.no>.
Hi there, if you don't get it solved with the new jar that you got from
Song Jason, you might try to convert the text yourself. (I have done this
when I've been using commons fileupload.)

What is happening is that commons fileupload is reading the text as if
it was encoded in ISO-8859-1, so here is what I have been using to
convert it to UTF-8:

String name = ...   // get parameter from commons file upload
name = new String(name.getBytes("ISO-8859-1"), "UTF-8");
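
The conversion works because ISO-8859-1 maps every byte to exactly one character, so decoding with it loses nothing and the original bytes can be recovered afterwards. The following is only a sketch of that round trip in plain Java. The class and method names (FormFieldRecoding, misdecode, recode) are made up for illustration, and misdecode merely simulates the behaviour described in this thread rather than calling commons fileupload itself.

```java
import java.nio.charset.StandardCharsets;

class FormFieldRecoding {

    // Simulates the problem: the browser sends UTF-8 bytes, but they are
    // decoded as if they were ISO-8859-1 (hypothetical stand-in for the
    // value read from the form field).
    static String misdecode(String typed) {
        byte[] utf8 = typed.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // The fix from this thread: turn the garbled string back into the
    // original bytes, then decode those bytes as UTF-8.
    static String recode(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A Swedish name, written with Unicode escapes so the source file
        // encoding does not matter.
        String typed = "\u00c5sa \u00d6berg";
        String garbled = misdecode(typed);
        String fixed = recode(garbled);

        System.out.println("garbled equals typed: " + garbled.equals(typed));
        System.out.println("fixed equals typed:   " + fixed.equals(typed));
    }
}
```

Note that recode should only be applied to strings that really were decoded as ISO-8859-1; applying it to text that is already correct would damage the non-ASCII characters. FileItem in commons fileupload also offers getString(String encoding), which may make the round trip unnecessary for form fields.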






Re: Reading unicode characters from a form in multipart/form-data format

Posted by Song Jason <ja...@gmail.com>.
Hi Ylva,

I think I understand your problem now. I always use a multipart/form-data
form to submit data to the server, whether or not it includes a file. I set
response.setCharset("utf-8") and request.setCharset("utf-8") and
everything works fine now. I tested it using Chinese characters. But the
official version has a problem when you upload a file whose name uses
UTF-8, while my version corrects it (maybe a newer version has solved it?).
I once suffered from the same problem as you, so I checked the code and found
that 'UTF-8' is different from 'utf-8'. The string 'utf-8' is preferred.

Hope it helps,
Jason
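
For what it is worth, the JDK itself resolves charset names without regard to case, so any difference between 'UTF-8' and 'utf-8' would have to come from code that compares the names as plain strings. A quick check, as a sketch only (the class name CharsetNameCheck and the helper sameCharset are made up for illustration):

```java
import java.nio.charset.Charset;

class CharsetNameCheck {

    // True when both names resolve to the same JDK charset.
    static boolean sameCharset(String a, String b) {
        return Charset.forName(a).equals(Charset.forName(b));
    }

    public static void main(String[] args) {
        // Charset lookup in the JDK is not case-sensitive.
        System.out.println(sameCharset("UTF-8", "utf-8"));        // prints "true"

        // A plain string comparison, however, is case-sensitive.
        System.out.println("UTF-8".equals("utf-8"));              // prints "false"
        System.out.println("UTF-8".equalsIgnoreCase("utf-8"));    // prints "true"
    }
}
```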


Re: Reading unicode characters from a form in multipart/form-data format

Posted by Ylva Degerfeldt <yl...@gmail.com>.
Hello again,

Thank you, Jason for wanting to help me, but I'm afraid it still is
not working. (I've taken a few days off. That's why I haven't answered
earlier.)

I downloaded your commons-uploads.jar, removed the old
commons-fileupload-1.2.1.jar from my project and added your version of
the jar to my project but it still is not possible to save Swedish
letters from fields in the multipart/form-data form.
Are you sure you solved it?
If you think so, could you please tell me more about how you solved it
and if I have to do anything other than adding your jar to libraries
(I'm using Netbeans 5.5.1) to make it work?

It seems though that you might have misinterpreted me. I don't have a
problem when the file name contains unicode characters. The problem occurs
when you have an additional field of the type "text" in the same form (like
I have for the name of a person), and unicode characters (Swedish
characters) are submitted from that field.

Hoping to get more help!


Thanks anyway, Jason!

/Ylva




Re: Reading unicode characters from a form in multipart/form-data format

Posted by Song Jason <ja...@gmail.com>.
Hi Ylva,

This should be a commons-upload problem: the uploaded file name does not
support utf-8 and is just treated as ascii.
I have attached my version of commons-upload.jar; give it a try. I believe
I solved it.

Good luck.
Jason Song