Posted to users@tomcat.apache.org by Tomislav Brkljačić <to...@gmail.com> on 2011/04/07 15:51:44 UTC

[ win xp and win server 2003 ] tomcat utf8 encoding

Hi to all,

this is my scenario and problem.

Situation 1 - local machine, Win XP
I have a web app deployed to Tomcat, and the app has a web form for uploading
attachments.
Attachments can have funny letters (š, ć, č, ž, đ) in the filename.
I have set file.encoding=UTF8 for the JVM and URIEncoding=UTF8 inside
the server.xml.
Everything works as expected, no anomalies in displaying the filenames of
the uploaded files.

Situation 2 - client machine, Win Server 2003
Same webapp as in Situation 1, same Tomcat configuration in every respect.
But there is a problem.
After I upload files with funny names through the app, the filenames are
garbled.
I checked the location of the files in the file system, and the uploaded
filenames are garbled there too.

Obviously there is some other setting I need to check and synchronize, but it
eludes me so far.

Any help is very appreciated.

Tomislav
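
One way to hunt for the setting that differs is to print what each JVM actually resolved at startup and compare the output from both machines. This is only a minimal sketch and not something from the original setup: the class name EncodingReport is made up, and sun.jnu.encoding is an internal HotSpot property (commonly the one tied to file names) that may be absent on other JVMs.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingReport {

    /** Collects the JVM settings that influence how text and file names are encoded. */
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<String, String>();
        // Set with -Dfile.encoding=... or derived from the OS locale.
        report.put("file.encoding", System.getProperty("file.encoding"));
        // Internal HotSpot property (assumption: present on this JVM); may be null.
        report.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        // The charset Java falls back to when code does not name one.
        report.put("default charset", Charset.defaultCharset().name());
        report.put("user.language", System.getProperty("user.language"));
        report.put("os.name", System.getProperty("os.name"));
        return report;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : collect().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

Running it with the same JVM options Tomcat is started with on each machine should show whether the two environments really agree.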


-- 
View this message in context: http://old.nabble.com/--win-xp-and-win-server-2003---tomcat-utf8-encoding-tp31342723p31342723.html
Sent from the Tomcat - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


RE: localhost:8080

Posted by "Caldarale, Charles R" <Ch...@unisys.com>.
> From: ken dias [mailto:kendias@hotmail.com] 
> Subject: localhost:8080

> I have to type localhost:8080 to get to my webserver - tomcat 
> 6.0.032. Before, just localhost would work. I have XP
 
> Thanks,
 
> Ken
 
> > Date: Sat, 9 Apr 2011 18:26:11 +0100
> > Subject: Re: [ win xp and win server 2003 ] tomcat utf8 encoding
> > From: fadil.work@gmail.com
> > To: users@tomcat.apache.org
> > 
> > unsubscribe
> > 
> > 2011/4/7 Tomislav Brkljačić <to...@gmail.com>
> > 
> > >
> > > Hi to all,
> > >
> > > this is my scenario and problem.

Don't hijack threads.  Start a new one, rather than being lazy and replying to an unrelated thread.

 - Chuck




localhost:8080

Posted by ken dias <ke...@hotmail.com>.
I have to type localhost:8080 to get to my webserver - tomcat 6.0.032. Before, just localhost would work. I have XP
 
Thanks,
 
Ken
 

Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by André Warnier <aw...@ice-sa.com>.
Fadil wrote:
> unsubscribe 

----------
          |
          v
>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>> For additional commands, e-mail: users-help@tomcat.apache.org
>>




Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by Fadil <fa...@gmail.com>.
unsubscribe


Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Tom,

On 4/8/2011 11:42 AM, Tomislav Brkljačić wrote:
> The remote machine gives the wrong "result".

Okay. Could you post a LiveHttpHeaders dump of /that/ interaction, too?

> I wrote on the mailing list of the BPM software, the discussion is still
> alive.
> 
> Maybe i could try to force a CharacterEncodingFilter filter on tomcat.
> Something like 
> http://www.onthoo.com/blog/programming/2005/07/characterencodingfilter.html
> this .

Tomcat's examples come with a filter that does exactly this. It's called
SetCharacterEncodingFilter and it can be found in the "examples" webapp.

We always run with such a filter in place because it solves all kinds of
problems with POST requests. My initial reaction was that the headers
are not part of the request body, so the SetCharacterEncodingFilter
wouldn't have an effect, but then again, the request body contains the
multipart/form-data including the headers of each multipart part. This
may solve all your problems.

Let us know how it goes.

-chris


Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Tom,

On 4/12/2011 4:22 AM, Tomislav Brkljačić wrote:
> After that i added the whole spring distro, ran the test scenarios and
> didn't find any problems.

I guess if that works.... I just think it's unnecessary because you can
use a filter from somewhere else (Tomcat, for instance).

> Christopher Schultz-2 wrote:
>>
>> I don't think you want this: you only want to set the encoding when the
>> client has provided none. If the client provides an encoding and you
>> override it, you are probably making a big mistake.
> 
> I see. 
> The app i'm building will be accessible on intranet only.
> Guessing on what can a client send as attach is not wise, i know.
> 
> Could this issue be handled with a smarter custom filter in place of the
> generic one ?

Just set forceEncoding=false (or don't set it at all: the default
/should/ be not to force the encoding because it's a bad idea).
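
The rule behind that setting can be sketched as plain Java. This is an illustration under assumptions, not the code of either filter: the class and method names are hypothetical, "declared" stands for whatever request.getCharacterEncoding() returns (null when the client sent no charset), and the returned value is what a filter would pass to request.setCharacterEncoding().

```java
public class EncodingChoice {

    /**
     * Decides which encoding to apply to a request.
     *
     * @param declared encoding sent by the client, or null if none was sent
     * @param fallback encoding configured on the filter, e.g. "UTF-8"
     * @param force    true to override the client even when it declared one
     */
    static String choose(String declared, String fallback, boolean force) {
        boolean clientSilent = declared == null || declared.trim().isEmpty();
        if (force || clientSilent) {
            return fallback;
        }
        // The client said what it sent: trust it.
        return declared;
    }

    public static void main(String[] args) {
        System.out.println(choose(null, "UTF-8", false));         // prints UTF-8
        System.out.println(choose("ISO-8859-2", "UTF-8", false)); // prints ISO-8859-2
        System.out.println(choose("ISO-8859-2", "UTF-8", true));  // prints UTF-8
    }
}
```

The third call shows why forcing is risky: the client's ISO-8859-2 declaration is silently replaced.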

-chris


Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by Tomislav Brkljačić <to...@gmail.com>.
Chris,


Christopher Schultz-2 wrote:
> 
> Tom,
> 
> On 4/9/2011 12:53 PM, Tomislav Brkljačić wrote:
>> I gave the "add the filter and bunch of Spring jars" method a try and it
>> turned out to be a success! 
> 
> You don't need a "bunch of Spring jars"... just use the one that comes
> with Tomcat and be done with it: there's no reason to add unnecessary
> libraries to your webapp.
> 
> 

Well, there seemed to be more than 3 dependencies.

I tried with org.springframework.web-3.1.0.M1.jar only, but it didn't work.
Then I added "*-core.jar", but still had problems. I tried adding
"*-beans.jar", but there were still problems with loading.

After that I added the whole Spring distro, ran the test scenarios and
didn't find any problems.


Christopher Schultz-2 wrote:
> 
> I don't think you want this: you only want to set the encoding when the
> client has provided none. If the client provides an encoding and you
> override it, you are probably making a big mistake.
> 
> 

I see.
The app I'm building will be accessible on the intranet only.
Guessing what a client might send as an attachment is not wise, I know.

Could this issue be handled with a smarter custom filter in place of the
generic one?


Christopher Schultz-2 wrote:
> 
>> Andre & Cris, a beer in your name tonight.
> 
> You should send me a Belgian beer and Andre an American one. :)
> 
> (PS there actually are decent American beers)
> 
> - -chris
> 

I've heard of Corsendonk beer as a fine one. Don't know any American beers
(beside B) :)

-- 
View this message in context: http://old.nabble.com/--win-xp-and-win-server-2003---tomcat-utf8-encoding-tp31342723p31377067.html
Sent from the Tomcat - User mailing list archive at Nabble.com.




Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Tom,

On 4/9/2011 12:53 PM, Tomislav Brkljačić wrote:
> I gave the "add the filter and bunch of Spring jars" method a try and it
> turned out to be a success! 

You don't need a "bunch of Spring jars"... just use the one that comes
with Tomcat and be done with it: there's no reason to add unnecessary
libraries to your webapp.

> Fantastic!
> 
> Filter code :
> <filter>
> <filter-name>characterEncodingFilter</filter-name>
> <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
> <init-param>
> <param-name>forceEncoding</param-name>
> <param-value>true</param-value>

I don't think you want this: you only want to set the encoding when the
client has provided none. If the client provides an encoding and you
override it, you are probably making a big mistake.

> Andre & Cris, a beer in your name tonight.

You should send me a Belgian beer and Andre an American one. :)

(PS there actually are decent American beers)

-chris


Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by Tomislav Brkljačić <to...@gmail.com>.
I gave the "add the filter and bunch of Spring jars" method a try and it
turned out to be a success! 
Fantastic!

Filter code :
<filter>
  <filter-name>characterEncodingFilter</filter-name>
  <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
  <init-param>
    <param-name>forceEncoding</param-name>
    <param-value>true</param-value>
  </init-param>
  <init-param>
    <param-name>encoding</param-name>
    <param-value>UTF-8</param-value>
  </init-param>
</filter>

<filter-mapping>
  <filter-name>characterEncodingFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>

And all the jars from Spring 3.1.0 M distro were copied in the tomcat/lib
folder.

Andre & Chris, a beer in your name tonight.

Cheers! :)





-- 
View this message in context: http://old.nabble.com/--win-xp-and-win-server-2003---tomcat-utf8-encoding-tp31342723p31359969.html
Sent from the Tomcat - User mailing list archive at Nabble.com.




Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by Christopher Schultz <ch...@christopherschultz.net>.

André,

On 4/8/2011 11:50 AM, André Warnier wrote:
> Tomislav Brkljačić wrote:
>> The remote machine gives the wrong "result".
>>
>> I wrote on the mailing list of the BPM software, the discussion is still
>> alive.
>>
>> Maybe i could try to force a CharacterEncodingFilter filter on tomcat.
>> Something like
>> http://www.onthoo.com/blog/programming/2005/07/characterencodingfilter.html
>>
>> this .
> 
> Don't do that.  Your problem is with the file *name*, not with the file
> content.
> Filters work on the content.
> I think you could make a real mess of everything by adding a content
> filter.  I don't think that Tomcat would use it in this case, but if it
> does, it will filter the whole multi-part body (headers and contents),
> which is certainly not what you want here.

If the multipart form-handler uses an InputStream to read the request
body, it won't matter what the character encoding is, anyway. I suspect
an InputStream will be used, since that is entirely appropriate in this
case. On the other hand, setting the character encoding might trigger
the multipart parsing library to use the preferred encoding to translate
filename bytes into characters. One can dream.

-chris


Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by André Warnier <aw...@ice-sa.com>.
Tomislav Brkljačić wrote:
> The remote machine gives the wrong "result".
> 
> I wrote on the mailing list of the BPM software, the discussion is still
> alive.
> 
> Maybe i could try to force a CharacterEncodingFilter filter on tomcat.
> Something like 
> http://www.onthoo.com/blog/programming/2005/07/characterencodingfilter.html
> this .

Don't do that.  Your problem is with the file *name*, not with the file content.
Filters work on the content.
I think you could make a real mess of everything by adding a content filter.  I don't 
think that Tomcat would use it in this case, but if it does, it will filter the whole 
multi-part body (headers and contents), which is certainly not what you want here.

A question : how exactly is the file name retrieved and used by that BPM upload module ?
I mean, can you see if it gets it as a byte array or as a String ?

And what about that "locale=default" query parameter ?
What is it supposed to mean, in the BPM documentation ?






Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by Tomislav Brkljačić <to...@gmail.com>.
The remote machine gives the wrong "result".

I wrote to the mailing list of the BPM software; the discussion is still
alive.

Maybe I could try to force a CharacterEncodingFilter on Tomcat,
something like this:
http://www.onthoo.com/blog/programming/2005/07/characterencodingfilter.html

I will definitely try with Wireshark.

thx



-- 
View this message in context: http://old.nabble.com/--win-xp-and-win-server-2003---tomcat-utf8-encoding-tp31342723p31353009.html
Sent from the Tomcat - User mailing list archive at Nabble.com.




Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Tom,

On 4/8/2011 4:19 AM, Tomislav Brkljačić wrote:
> Ok, this is what i did.
> 
> 1. updated the java runtime so they match on both machines

Not a bad idea, but probably didn't affect anything.

> Tried to run the examples, but still the same result.
> 
> 2. installed livehttpheaders for firefox and ran the examples upload.
> This is the output from livehttp  from my local machine (the same is on the
> server machine) :

So... is the local machine the one that does or does not work? Comparing
the two that DO work would be a good idea.

> Content-Type: multipart/form-data;
>     boundary=---------------------------55652821543

Note the lack of a character encoding (in the main request header). This
is appropriate for multipart/form-data content.

> Content-Disposition: form-data; name="attach_file"; filename="pričuva.txt"
> Content-Type: text/plain
> 
> asdasdasd
> -----------------------------55652821543--

A couple of things:

1. I'm surprised that no Content-Length was sent along with the file.

2. Note that the filename has non-US-ASCII characters shown there.
   I wonder if that's LiveHttpHeaders's interpretation of the header
   (and in what encoding) or if that's what's on the wire.


I suspect that Firefox is just using UTF-8 to send the filename. Tomcat may
interpret it as US-ASCII and give you an odd result. Actually... for
multipart, Tomcat shouldn't be involved: this may be a problem with the
library you are using for file uploads. You should definitely ask on the
BPM mailing list.

Here's one thing you can do:

String brokenString = part.getFilename();  // or whatever

// Get back the raw bytes, then decode them as UTF-8
String fixedString
   = new String(brokenString.getBytes("US-ASCII"), "UTF-8");

That will reinterpret the bytes sent from the client as UTF-8. This will only
work if:

1. The client actually sent the data in UTF-8

2. Your multipart handler actually assumed that US-ASCII was correct

3. No alteration of the bytes has occurred by the interpretation
   as US-ASCII

If any of the above are NOT true, you are basically stuck.

It would be worth it to look at the bytes as they are traversing the
network -- say, with Wireshark -- to determine whether the filename is
actually encoded in UTF-8 or some other encoding.
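
Condition 3 above is the fragile one, and it can be tried out without a server. The sketch below is an illustration only: the class and method names are made up, and ISO-8859-1 is brought in purely as a contrast because it maps every byte to a character, whereas US-ASCII does not.

```java
import java.nio.charset.Charset;

public class FilenameRepair {

    static final Charset UTF8 = Charset.forName("UTF-8");

    /** Simulates a handler that decodes UTF-8 bytes with the wrong charset. */
    static String misdecode(String original, Charset wrong) {
        return new String(original.getBytes(UTF8), wrong);
    }

    /** Re-encodes with the wrongly assumed charset, then decodes as UTF-8. */
    static String repair(String broken, Charset wronglyAssumed) {
        return new String(broken.getBytes(wronglyAssumed), UTF8);
    }

    public static void main(String[] args) {
        // "pričuva.txt", written with an escape so the source file stays ASCII
        String name = "pri\u010Duva.txt";
        Charset latin1 = Charset.forName("ISO-8859-1");
        Charset ascii = Charset.forName("US-ASCII");

        // ISO-8859-1 keeps every byte, so the round trip restores the name.
        System.out.println(name.equals(repair(misdecode(name, latin1), latin1))); // prints true
        // US-ASCII cannot represent bytes above 0x7F, so they are lost for good.
        System.out.println(name.equals(repair(misdecode(name, ascii), ascii)));   // prints false
    }
}
```

So the trick only helps when the wrong decoding step happened to be lossless; once bytes have been replaced, nothing downstream can bring them back.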

Hope that helps,
-chris


Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by Tomislav Brkljačić <to...@gmail.com>.

awarnier wrote:
> 
> Tomislav Brkljačić wrote:
>> 
>> awarnier wrote:
>>> Hi.
>>> Can you provide the *exact* versions of Java, Tomcat, and whichever file
>>> uploading 
>>> mechanism you are using ?
>>> (meaning : to process the multi-part POST with the file upload, your
>>> webapp uses some 
>>> additional mechanism; which is it ?)
>>>
>> 
>> 1.Situation - local win xp machine
>> Java : java version "1.6.0_22"
>> Tomcat : 6.0.29
>> This is the scenario where everything works as expected.
>> 
>> 
>> 2. Situation - customer win server 2003 machine
>> Java : java version "1.6.0_20"
>> Tomcat : 6.0.29
>> 
>> The deployed web application is developed with Bonita open Solution (BPM
>> framework).
>> I'm not that fluent in the java world but looking at the downloaded
>> source
>> code, i guess it
>> would be a basic fileupload servlet. 
>> 
> 
> Right. But that may be the important part.
> Are you familiar with the format in which a browser sends a
> multipart/form-data POST ?
> (MIME multipart, similar to the basic .eml format of an email with
> attachments)
> Briefly : the data is sent by the browser in a format like :
> 
> request line (POST)
> header..
> header..
> header "Content-Type: multipart/form-data; boundary=----xyz--"
> ..
> (blank line)
> header of part 1
> (blank line)
> body of part 1
> ----xyz--
> header of part 2
> (blank line)
> body of part 2
> ----xyz--
> etc...
> 
> where each "part" is one of the inputs of the <form>.  One of these parts
> is your uploaded 
> file, and it has a special header which specifies the file type, encoding,
> file name etc..
> 
> The job of the "fileupload servlet" (actually, it is a library capable of
> reading such a 
> POST and separating it into parts), is to read these headers and bodies,
> and make sense 
> out of them.  One of these things that it reads is the filename, and of
> course it 
> interprets that according to some character set.
> For that, it uses some kind of java stream, and if it does it right, tells
> it the 
> character set to use to decode the input.
> And it is possible that it does /not/ do it right in some cases (maybe
> even depending on 
> which JVM version it runs under). For example, if it does not specify the
> character set to 
> use to decode the input, Java may use the platform default, which may be
> different on 
> these two systems.  And if that is the case, it may wrongly decode the
> filename header, 
> and produce garbage.
> 
> What I am saying is that, since you have the same Tomcat version on both
> systems, the code 
> which works differently is unlikely to be in Tomcat itself.  To my
> recollection (maybe 
> wrong), Tomcat 6.0 does not include any code that can deal with a
> multi-part POST.
> (I think that Tomcat 7.0 does).
> So the code which acts differently on your two servers above, is either
> the file-upload 
> library used by that servlet, or the JVM functions that it itself uses.
> 
> In other words again, my first stop for a solution would be whatever
> support list is 
> available for the "Bonita open Solution (BPM framework)".
> 
> (you may further narrow down the problem first by updating your 2003
> server to java 
> version "1.6.0_22").
> 
> Now for another more general comment :
> According to your explanation, you upload a file from a browser, and then
> try to write it 
> to the local filesystem using the name which it had on the original
> workstation.
> In my view, this is always a bad idea, in general.
> One reason is the one you already found.
> The other is that if 2 users upload a file with the same name, the second
> one will 
> overwrite the first.
> The third is that you are leaving yourself open for all kinds of nasty
> things, such as a 
> user uploading a file with spaces in the name (always a problem at some
> point), or with 
> characters in the name that may be very dangerous (think of a file named
> "> /etc/passwd" 
> or "some.file|rm *").
> So, if you have a chance to do that, give each uploaded file a name that
> you create, and 
> keep the original filename in some separate place if you need it, for
> display only.
> 

OK, this is what I did.

1. Updated the Java runtime so that the versions match on both machines.

Tried to run the examples again, but still the same result.

2. Installed LiveHTTPHeaders for Firefox and ran the example upload.
This is the output from LiveHTTPHeaders on my local machine (it is the same
on the server machine):

http://localhost:8080/Attachment--1.0/application/fileUpload

POST /Attachment--1.0/application/fileUpload HTTP/1.1
Host: localhost:8080
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; hr; rv:1.9.2.16) Gecko/20110319 Firefox/3.6.16 ( .NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: hr-hr,hr;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: http://localhost:8080/Attachment--1.0/application/BonitaApplication.html?locale=default&mode=form&task=Attachment--1.0--19--Step1--it1--mainActivityInstance--noLoop
Cookie: BOS_Locale=en; JSESSIONID=A94AA4DD024666A3B91C26869610AF8E
Content-Type: multipart/form-data; boundary=---------------------------55652821543
Content-Length: 202
-----------------------------55652821543
Content-Disposition: form-data; name="attach_file"; filename="pričuva.txt"
Content-Type: text/plain

asdasdasd
-----------------------------55652821543--

HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: text/html;charset=UTF-8
Content-Length: 88
Date: Fri, 08 Apr 2011 07:54:10 GMT


Regarding the general concerns about storing the uploaded files:
the system itself ensures that every logged-in user gets a separate working
directory, which is used for uploading files.

Thanks for the effort so far.



Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by Christopher Schultz <ch...@christopherschultz.net>.
André,

On 4/7/2011 5:15 PM, André Warnier wrote:
> Christopher Schultz wrote:
> ... (RFC references) ..
> 
> Thanks for that post (with the chain of applicable RFCs).  I will keep
> that email preciously as a resource for future file upload debugging
> references.

You could always update the Tomcat "Character Encoding" page. I have a
headache from reading those specs so I'm not going to do it right now :)

> Also, to add to the potential OP woes, there is also the fact that some
> browsers send the filename, and others send the full path of the file.

I love it when a standard gets followed. Do you happen to know which
browsers send the files in which format? Most OSs these days use either
\ or / as a path delimiter, so you can take everything after the last
one of those... but what if someone is using Firefox 4.0 on VMS (ha ha ha)?
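
A minimal Java sketch of the "take everything after the last \ or /" idea
described above. The class and method names (ClientFilenameSketch, baseName)
are hypothetical, and it deliberately ignores platforms with other path
delimiters, which is exactly the VMS caveat:

```java
final class ClientFilenameSketch {

    /**
     * Returns the part of a client-supplied filename after the last
     * '/' or '\'. If neither occurs, the input is returned unchanged.
     */
    static String baseName(String clientFilename) {
        int lastSlash = clientFilename.lastIndexOf('/');
        int lastBackslash = clientFilename.lastIndexOf('\\');
        int cut = Math.max(lastSlash, lastBackslash);
        // cut is -1 when no delimiter is present, so substring(0) keeps everything
        return clientFilename.substring(cut + 1);
    }

    public static void main(String[] args) {
        System.out.println(baseName("C:\\Documents\\report.txt"));
        System.out.println(baseName("/home/someone/report.txt"));
        System.out.println(baseName("report.txt"));
    }
}
```

The result is still client-controlled text, so it is better suited to
display than to building a path on the server.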

> But it /may/ still be a problem if, after uploading the file and duly
> writing it into a directory, that directory is then later scanned by
> some separate (non-Java) program or script (whatever language it may be
> written in, even, God forbid, perl) with the purpose of actually doing
> something with these files.

Good point. Always good to quote your filenames :)

> There may be a lot of potential there :
> 
> for ff in /mydir/* ; do
>   mv "$ff" "/otherdir/${ff}.new"
> done

:)

-chris


Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by André Warnier <aw...@ice-sa.com>.
Christopher Schultz wrote:
... (RFC references) ..

Thanks for that post (with the chain of applicable RFCs).  I will keep that email 
preciously as a resource for future file upload debugging references.
...

Also, to add to the potential OP woes, there is also the fact that some browsers send the 
filename, and others send the full path of the file.

> 
> I would hope that the OP was putting these files in some known root, so
> that uploading /etc/passwd wouldn't overwrite /etc/passwd,
(I wrote "> /etc/passwd" as the filename)

> and that file
> permissions wouldn't allow this, either. Also, unlike Perl, having a
> pipe in a filename isn't a problem in Java :)
> 
But it /may/ still be a problem if, after uploading the file and duly writing it into a 
directory, that directory is then later scanned by some separate (non-Java) program or 
script (whatever language it may be written in, even, God forbid, perl) with the purpose 
of actually doing something with these files.

There may be a lot of potential there :

for ff in /mydir/* ; do
   mv "$ff" "/otherdir/${ff}.new"
done




Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by Christopher Schultz <ch...@christopherschultz.net>.
André,

On 4/7/2011 12:26 PM, André Warnier wrote:
> What I am saying is that, since you have the same Tomcat version on both
> systems, the code which works differently is unlikely to be in Tomcat
> itself.  To my recollection (maybe wrong), Tomcat 6.0 does not include
> any code that can deal with a multi-part POST.
> (I think that Tomcat 7.0 does).

Correct: Tomcat 6 does not include any multipart-parsing code, while
Tomcat 7 does, since it implements the Servlet 3.0 Multipart upload
features.

Note that the URIEncoding setting on your <Connector> is not relevant,
since the filename is being read from the request /body/ and not from
the URI.

I would use Fiddler, LiveHTTPHeaders, Firebug, etc. to see if there
is a difference on the /client/ side between these two situations. If
the client is sending a different Content-Type, then things may go wrong.

Here's the problem: the ARPA Internet Text Messages standard (RFC 822, from
which HTTP et al. descend) doesn't say how to encode message headers that
include non-US-ASCII characters. This includes filenames with
non-US-ASCII characters that are embedded in the Content-Disposition header.

The W3C says this (http://www.w3.org/TR/html401/interact/forms.html) in
section 17.13.4:

"
The user agent should attempt to supply a file name for each submitted
file. The file name may be specified with the "filename" parameter of
the 'Content-Disposition: form-data' header, or, in the case of multiple
files, in a 'Content-Disposition: file' header of the subpart. If the
file name of the client's operating system is not in US-ASCII, the file
name might be approximated or encoded using the method of [RFC2045].
This is convenient for those cases where, for example, the uploaded
files might contain references to each other (e.g., a TeX file and its
".sty" auxiliary style description).
"

So, the user agent /might/ do something? Not very encouraging. RFC 2045
says virtually nothing, but there is an RFC specifically covering the
Content-Disposition header: http://www.ietf.org/rfc/rfc2183.txt

If you follow everything, you can piece together the following:

- From http://www.ietf.org/rfc/rfc822.txt:
     CHAR        =  <any ASCII character>        ; (  0-177,  0.-127.)

     quoted-string = <"> *(qtext/quoted-pair) <">; Regular qtext or
                                                 ;   quoted chars.

     qtext       =  <any CHAR excepting <">,     ; => may be folded
                     "\" & CR, and including
                     linear-white-space>

     quoted-pair =  "\" CHAR                     ; may quote any char

- From http://www.ietf.org/rfc/rfc2045.txt:

     value := token / quoted-string

- From http://www.ietf.org/rfc/rfc2183.txt:

     filename-parm := "filename" "=" value

So, the filename value can be a quoted string made up of any ASCII
value. Great. What about non-US-ASCII characters?

RFC 2183 says this in section 2:

"
   NOTE ON PARAMETER VALUE LENGHTS: A short (length <= 78 characters)
   parameter value containing only non-`tspecials' characters SHOULD be
   represented as a single `token'.  A short parameter value containing
   only ASCII characters, but including `tspecials' characters, SHOULD
   be represented as `quoted-string'.  Parameter values longer than 78
   characters, or which contain non-ASCII characters, MUST be encoded as
   specified in [RFC 2184].
"

Great: another RFC to read. At least this one deals with the proper way
to communicate the character encoding used for a parameter value.
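
For illustration, RFC 2184 (since replaced by RFC 2231) writes such a
parameter as charset'language'percent-encoded-octets, for example
filename*=UTF-8''pri%C4%8Duva.txt. The following is a minimal Java sketch of
decoding one such value; ExtendedParameterSketch and decode are hypothetical
names, and it does not handle continuations (filename*0*, filename*1*, ...)
or an empty charset field:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

final class ExtendedParameterSketch {

    /** Decodes a value of the form charset'language'percent-encoded-octets. */
    static String decode(String extendedValue) {
        int first = extendedValue.indexOf('\'');
        int second = extendedValue.indexOf('\'', first + 1);
        if (first < 0 || second < 0) {
            throw new IllegalArgumentException("not an extended value: " + extendedValue);
        }
        // The charset travels with the value, so no platform default is involved.
        Charset charset = Charset.forName(extendedValue.substring(0, first));
        String encoded = extendedValue.substring(second + 1);

        ByteArrayOutputStream octets = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '%') {
                if (i + 2 >= encoded.length()) {
                    throw new IllegalArgumentException("truncated escape in: " + extendedValue);
                }
                // Two hex digits follow the percent sign and describe one octet.
                octets.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                // Unescaped characters are expected to be plain US-ASCII.
                octets.write((byte) c);
            }
        }
        return new String(octets.toByteArray(), charset);
    }

    public static void main(String[] args) {
        String decoded = decode("UTF-8''pri%C4%8Duva.txt");
        System.out.println(decoded.equals("pri\u010Duva.txt"));
    }
}
```

Whether a given browser actually sends filenames in this form is a separate
question, which is why a header capture is needed.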

I think this all comes down to two things:

1. How standards-compliant is your user-agent (most aren't very good)

2. How standards-compliant is your file upload library (or servlet
container).

I've forgotten whether or not Tomcat includes RFC2184-style header
decoding logic... I'll have to check. But it doesn't matter if your
user-agent (=browser) sends the information in a non-standard way.

Can you provide some header captures so we can see what's going on?

> Now for another more general comment :
> According to your explanation, you upload a file from a browser, and
> then try to write it to the local filesystem using the name which it had
> on the original workstation.
> In my view, this is always a bad idea, in general.

+1

> One reason is the one you already found.
> The other is that if 2 users upload a file with the same name, the
> second one will overwrite the first.

+1

> The third is that you are leaving yourself open for all kinds of nasty
> things, such as a user uploading a file with spaces in the name (always
> a problem at some point), or with characters in the name that may be
> very dangerous (think of a file named "> /etc/passwd" or "some.file|rm *").

I would hope that the OP was putting these files in some known root, so
that uploading /etc/passwd wouldn't overwrite /etc/passwd, and that file
permissions wouldn't allow this, either. Also, unlike Perl, having a
pipe in a filename isn't a problem in Java :)

The user can cause some other problems, like uploading a file to a win32
server with a filename of "PRN", "LPT1", "COM1", etc. and causing
weirdness there.
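
A minimal Java sketch of screening for those device names before a
client-supplied name gets anywhere near a win32 file system.
ReservedNameSketch and isReserved are hypothetical names; the list is the
usual set of Windows reserved device names (CON, PRN, AUX, NUL, COM1-COM9,
LPT1-LPT9), which are also reserved when followed by an extension:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class ReservedNameSketch {

    private static final Set<String> RESERVED = new HashSet<String>(Arrays.asList(
            "CON", "PRN", "AUX", "NUL",
            "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
            "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"));

    /** True if the name, ignoring case and any extension, is a reserved device name. */
    static boolean isReserved(String filename) {
        int dot = filename.indexOf('.');
        String stem = dot < 0 ? filename : filename.substring(0, dot);
        return RESERVED.contains(stem.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isReserved("PRN"));
        System.out.println(isReserved("lpt1.txt"));
        System.out.println(isReserved("printer.txt"));
    }
}
```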

> So, if you have a chance to do that, give each uploaded file a name that
> you create, and keep the original filename in some separate place if you
> need it, for display only.

+1

-chris


Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by André Warnier <aw...@ice-sa.com>.
Tomislav Brkljačić wrote:
> 
> awarnier wrote:
>> Tomislav Brkljačić wrote:
>>> Hi to all,
>>>
>>> this is my scenario and problem. 
>>>
>>> Situation 1. - local machine, win xp
>>> I have a web app deployed to tomcat, and the app has a webform for
>>> uploading
>>> attachments.
>>> Attachments can have funny letters (š,ć,čćžđ ) in the filename.
>>> I have set file.encoding=UTF8 for the JVM and URIEncoding=UTF8 inside
>>> the server.xml.
>>> Everything works as expected, no anomalies in displaying the filenames of
>>> the uploaded files.
>>>
>>> Situation 2. - client machine, win server 2003
>>> Same webapp as in Situation 1, same tomcat configuration in all matters.
>>> But there is a problem.
>>> After I upload files with funny names through the app, the filenames
>>> are scrambled and garbled.
>>> I checked the location of the files in the file system, and of course
>>> the uploaded filenames are scrambled in the file system too.
>>>
>>> Obviously there is some other setting I need to check and synchronize,
>>> but it eludes me so far.
>>>
>>> Any help is much appreciated.
>>>
>> Hi.
>> Can you provide the *exact* versions of Java, Tomcat, and whichever file
>> uploading 
>> mechanism you are using ?
>> (meaning : to process the multi-part POST with the file upload, your
>> webapp uses some 
>> additional mechanism; which is it ?)
>>
> 
> 1.Situation - local win xp machine
> Java : java version "1.6.0_22"
> Tomcat : 6.0.29
> This is the scenario where everything works as expected.
> 
> 
> 2. Situation - customer win server 2003 machine
> Java : java version "1.6.0_20"
> Tomcat : 6.0.29
> 
> The deployed web application is developed with Bonita open Solution (BPM
> framework).
> I'm not that fluent in the java world but looking at the downloaded source
> code, i guess it
> would be a basic fileupload servlet. 
> 

Right. But that may be the important part.
Are you familiar with the format in which a browser sends a multipart/form-data POST ?
(MIME multipart, similar to the basic .eml format of an email with attachments)
Briefly : the data is sent by the browser in a format like :

request line (POST)
header..
header..
header "Content-Type: multipart/form-data; boundary=xyz"
..
(blank line)
--xyz
header of part 1
(blank line)
body of part 1
--xyz
header of part 2
(blank line)
body of part 2
--xyz
etc...
--xyz--   (closing delimiter after the last part)

where each "part" is one of the inputs of the <form>.  One of these parts is your uploaded 
file, and it has a special header which specifies the file type, encoding, file name etc..
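
To make that layout concrete, here is a minimal Java sketch that splits such
a body into its parts. It assumes the body is already available as a String
and uses a made-up boundary value "xyz"; MultipartSketch and splitParts are
hypothetical names. A real upload library works on bytes and handles many
more cases, so this only illustrates the framing:

```java
import java.util.ArrayList;
import java.util.List;

final class MultipartSketch {

    /** Returns each part (headers, blank line, body) found between delimiter lines. */
    static List<String> splitParts(String body, String boundary) {
        // On the wire, every delimiter is the boundary value prefixed with "--".
        String delimiter = "--" + boundary;
        List<String> parts = new ArrayList<String>();
        int pos = body.indexOf(delimiter);
        while (pos >= 0) {
            int start = pos + delimiter.length();
            if (body.startsWith("--", start)) {
                break; // "--boundary--" closes the whole body
            }
            int next = body.indexOf(delimiter, start);
            if (next < 0) {
                break; // malformed input: no further delimiter
            }
            String part = body.substring(start, next);
            // Drop the line break after the delimiter and the one before the next.
            if (part.startsWith("\r\n")) {
                part = part.substring(2);
            }
            if (part.endsWith("\r\n")) {
                part = part.substring(0, part.length() - 2);
            }
            parts.add(part);
            pos = next;
        }
        return parts;
    }

    public static void main(String[] args) {
        String body =
                "--xyz\r\n"
              + "Content-Disposition: form-data; name=\"comment\"\r\n"
              + "\r\n"
              + "hello\r\n"
              + "--xyz\r\n"
              + "Content-Disposition: form-data; name=\"attach_file\"; filename=\"a.txt\"\r\n"
              + "Content-Type: text/plain\r\n"
              + "\r\n"
              + "asdasdasd\r\n"
              + "--xyz--\r\n";
        for (String part : splitParts(body, "xyz")) {
            int blank = part.indexOf("\r\n\r\n");
            System.out.println("headers: " + part.substring(0, blank));
            System.out.println("body:    " + part.substring(blank + 4));
        }
    }
}
```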

The job of the "fileupload servlet" (actually, it is a library capable of reading such a 
POST and separating it into parts), is to read these headers and bodies, and make sense 
out of them.  One of these things that it reads is the filename, and of course it 
interprets that according to some character set.
For that, it uses some kind of java stream, and if it does it right, tells it the 
character set to use to decode the input.
And it is possible that it does /not/ do it right in some cases (maybe even depending on 
which JVM version it runs under). For example, if it does not specify the character set to 
use to decode the input, Java may use the platform default, which may be different on 
these two systems.  And if that is the case, it may wrongly decode the filename header, 
and produce garbage.
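
A minimal Java sketch of that failure mode. It assumes, purely for
illustration, that the browser put the filename on the wire as UTF-8 bytes;
FilenameDecodingSketch and decode are hypothetical names. Decoding the same
bytes with a different charset (here ISO-8859-1, standing in for whatever
the platform default happens to be) gives a different, garbled string:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class FilenameDecodingSketch {

    /** Decodes header bytes with an explicitly named charset, never the platform default. */
    static String decode(byte[] headerBytes, Charset charset) {
        return new String(headerBytes, charset);
    }

    public static void main(String[] args) {
        String original = "pri\u010Duva.txt"; // the filename from the capture, with c-caron
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        String right = decode(wire, StandardCharsets.UTF_8);
        String wrong = decode(wire, StandardCharsets.ISO_8859_1);

        System.out.println("decoded as UTF-8 matches:      " + right.equals(original));
        System.out.println("decoded as ISO-8859-1 matches: " + wrong.equals(original));
        // new String(bytes) without a charset argument would use this value,
        // which can differ between two otherwise identical installations:
        System.out.println("platform default here:         " + Charset.defaultCharset());
    }
}
```

Comparing the last line of output on the two machines is a quick way to see
whether the platform default differs between them.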

What I am saying is that, since you have the same Tomcat version on both systems, the code 
which works differently is unlikely to be in Tomcat itself.  To my recollection (maybe 
wrong), Tomcat 6.0 does not include any code that can deal with a multi-part POST.
(I think that Tomcat 7.0 does).
So the code which acts differently on your two servers above, is either the file-upload 
library used by that servlet, or the JVM functions that it itself uses.

In other words again, my first stop for a solution would be whatever support list is 
available for the "Bonita open Solution (BPM framework)".

(you may further narrow down the problem first by updating your 2003 server to java 
version "1.6.0_22").

Now for another more general comment :
According to your explanation, you upload a file from a browser, and then try to write it 
to the local filesystem using the name which it had on the original workstation.
In my view, this is always a bad idea, in general.
One reason is the one you already found.
The other is that if 2 users upload a file with the same name, the second one will 
overwrite the first.
The third is that you are leaving yourself open for all kinds of nasty things, such as a 
user uploading a file with spaces in the name (always a problem at some point), or with 
characters in the name that may be very dangerous (think of a file named "> /etc/passwd" 
or "some.file|rm *").
So, if you have a chance to do that, give each uploaded file a name that you create, and 
keep the original filename in some separate place if you need it, for display only.
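
A minimal Java sketch of that last suggestion. StoredUploadSketch and
forClientFilename are hypothetical names, and the ".upload" suffix is an
arbitrary choice; the point is only that the name used on disk is generated
by the server and the client's name is kept as data:

```java
import java.util.UUID;

final class StoredUploadSketch {

    final String storedName;  // generated by the server, used on disk
    final String displayName; // whatever the client sent, for display only

    private StoredUploadSketch(String storedName, String displayName) {
        this.storedName = storedName;
        this.displayName = displayName;
    }

    /** Pairs a freshly generated storage name with the client-supplied name. */
    static StoredUploadSketch forClientFilename(String clientFilename) {
        // A random UUID contains only hex digits and hyphens, so it is safe
        // on any file system and does not collide in practice.
        String storedName = UUID.randomUUID().toString() + ".upload";
        return new StoredUploadSketch(storedName, clientFilename);
    }

    public static void main(String[] args) {
        StoredUploadSketch first = forClientFilename("some.file|rm *");
        StoredUploadSketch second = forClientFilename("some.file|rm *");
        System.out.println(first.storedName + " <- " + first.displayName);
        System.out.println(second.storedName + " <- " + second.displayName);
    }
}
```

Two uploads with the same client filename get different names on disk, and
the original name never reaches the file system or a shell.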







Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by Tomislav Brkljačić <to...@gmail.com>.

awarnier wrote:
> 
> Tomislav Brkljačić wrote:
>> Hi to all,
>> 
>> this is my scenario and problem. 
>> 
>> Situation 1. - local machine, win xp
>> I have a web app deployed to tomcat, and the app has a webform for
>> uploading
>> attachments.
>> Attachments can have funny letters (š,ć,čćžđ ) in the filename.
>> I have set file.encoding=UTF8 for the JVM and URIEncoding=UTF8 inside
>> the server.xml.
>> Everything works as expected, no anomalies in displaying the filenames of
>> the uploaded files.
>> 
>> Situation 2. - client machine, win server 2003
>> Same webapp as in Situation 1, same tomcat configuration in all matters.
>> But there is a problem.
>> After I upload files with funny names through the app, the filenames
>> are scrambled and garbled.
>> I checked the location of the files in the file system, and of course
>> the uploaded filenames are scrambled in the file system too.
>>
>> Obviously there is some other setting I need to check and synchronize,
>> but it eludes me so far.
>>
>> Any help is much appreciated.
>> 
> Hi.
> Can you provide the *exact* versions of Java, Tomcat, and whichever file
> uploading 
> mechanism you are using ?
> (meaning : to process the multi-part POST with the file upload, your
> webapp uses some 
> additional mechanism; which is it ?)
> 

1.Situation - local win xp machine
Java : java version "1.6.0_22"
Tomcat : 6.0.29
This is the scenario where everything works as expected.


2. Situation - customer win server 2003 machine
Java : java version "1.6.0_20"
Tomcat : 6.0.29

The deployed web application is developed with Bonita open Solution (BPM
framework).
I'm not that fluent in the Java world, but looking at the downloaded source
code, I guess it would be a basic fileupload servlet.

thx 


Re: [ win xp and win server 2003 ] tomcat utf8 encoding

Posted by André Warnier <aw...@ice-sa.com>.
Tomislav Brkljačić wrote:
> Hi to all,
> 
> this is my scenario and problem. 
> 
> Situation 1. - local machine, win xp
> I have a web app deployed to tomcat, and the app has a webform for uploading
> attachments.
> Attachments can have funny letters (š,ć,čćžđ ) in the filename.
> I have set file.encoding=UTF8 for the JVM and URIEncoding=UTF8 inside
> the server.xml.
> Everything works as expected, no anomalies in displaying the filenames of
> the uploaded files.
> 
> Situation 2. - client machine, win server 2003
> Same webapp as in Situation 1, same tomcat configuration in all matters.
> But there is a problem.
> After I upload files with funny names through the app, the filenames are
> scrambled and garbled.
> I checked the location of the files in the file system, and of course
> the uploaded filenames are scrambled in the file system too.
>
> Obviously there is some other setting I need to check and synchronize, but it
> eludes me so far.
>
> Any help is much appreciated.
> 
Hi.
Can you provide the *exact* versions of Java, Tomcat, and whichever file uploading 
mechanism you are using ?
(meaning : to process the multi-part POST with the file upload, your webapp uses some 
additional mechanism; which is it ?)

