Posted to issues@commons.apache.org by "Jaime Hablutzel (JIRA)" <ji...@apache.org> on 2013/03/08 16:34:12 UTC

[jira] [Commented] (FILEUPLOAD-56) [FileUpload] uploading files with non-ASCII filenames

    [ https://issues.apache.org/jira/browse/FILEUPLOAD-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597224#comment-13597224 ] 

Jaime Hablutzel commented on FILEUPLOAD-56:
-------------------------------------------

Haruyi Kawabe, I have noticed this too, but keep in mind that Commons FileUpload does not claim to implement RFC 2388 (which references RFC 2231). It claims to implement RFC 1867, as the home page says:

bq. FileUpload parses HTTP requests which conform to RFC 1867, "Form-based File Upload in HTML". That is, if an HTTP request is submitted using the POST method, and with a content type of "multipart/form-data", then FileUpload can parse that request, and make the results available in a manner easily used by the caller.

That said, RFC 1867 references RFC 1522 specifically for encoding the 'filename' parameter (section 3.3):


bq. The original local file name may be supplied as well, either as a 'filename' parameter either of the 'content-disposition: form-data' header or in the case of multiple files in a 'content-disposition: file' header of the subpart. The client application should make best effort to supply the file name; if the file name of the client's operating system is not in US-ASCII, the file name might be approximated or encoded using the method of RFC 1522. This is a convenience for those cases where, for example, the uploaded files might contain references to each other, e.g., a TeX file and its .sty auxiliary style description.

Instead, Commons FileUpload just uses a headerEncoding taken from the HTTP request. I don't understand why it does this.
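
To make the difference concrete, here is a minimal sketch of the two representations of the file name from this issue. The class and method names are made up for illustration and are not part of Commons FileUpload. It assumes the client encodes the name as UTF-8 and that the receiver's header encoding is ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class FilenameEncodingDemo {

    // Builds an RFC 1522 "B" (Base64) encoded-word for the given file name,
    // using UTF-8 as the charset (an assumption made for this example).
    static String toEncodedWord(String fileName) {
        byte[] utf8 = fileName.getBytes(StandardCharsets.UTF_8);
        return "=?UTF-8?B?" + Base64.getEncoder().encodeToString(utf8) + "?=";
    }

    // Simulates a receiver that gets the raw UTF-8 bytes in the header
    // but interprets them as ISO-8859-1.
    static String misdecode(String fileName) {
        byte[] utf8 = fileName.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String name = "m\u00fcnchen.jpeg";
        // prints =?UTF-8?B?bcO8bmNoZW4uanBlZw==?=
        System.out.println(toEncodedWord(name));
        // prints a garbled name: the two UTF-8 bytes of the umlaut
        // each become a separate ISO-8859-1 character
        System.out.println(misdecode(name));
    }
}
```

The encoded-word is pure US-ASCII and carries its own charset, so the receiver does not have to guess. The raw-bytes form only round-trips if both sides happen to agree on the encoding.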

They are not the only ones doing this, though. Firefox, when submitting a multipart/form-data form, sends the 'filename' parameter encoded with the page encoding (or the form's accept-charset) and does not use RFC 1522 or RFC 2231 at all. I have asked on the Commons FileUpload user mailing list and the Firefox mailing list and am waiting for an explanation. I have also asked in a Firefox support forum:

https://support.mozilla.org/en-US/questions/952827

I think that fixing Commons FileUpload could cause interoperability problems, because browsers (at least Firefox) seem to be sending the wrong thing according to the specs. Another option might be to support the spec and, if no encoded-word is found, fall back to the charset of the HTTP request's content type. I am not sure.
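
A rough sketch of that fallback idea, to show it is at least feasible. This is not how Commons FileUpload works today, and the names decodeFilename and decodeQ are hypothetical. It only handles a value that consists of exactly one encoded-word, and it assumes the caller passes in the request charset as the fallback.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FilenameDecoder {

    // A single RFC 1522 encoded-word: =?charset?B|Q?encoded-text?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?([bBqQ])\\?([^?]*)\\?=");

    // Decodes the raw bytes of a 'filename' parameter value.
    // If the value is an encoded-word, its own charset wins;
    // otherwise the bytes are interpreted with the fallback charset.
    static String decodeFilename(byte[] rawValue, Charset fallback) {
        // ISO-8859-1 maps every byte to one char, so matching is lossless.
        String ascii = new String(rawValue, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODED_WORD.matcher(ascii);
        if (m.matches()) {
            try {
                Charset charset = Charset.forName(m.group(1));
                byte[] decoded = m.group(2).equalsIgnoreCase("B")
                        ? Base64.getDecoder().decode(m.group(3))
                        : decodeQ(m.group(3));
                return new String(decoded, charset);
            } catch (IllegalArgumentException e) {
                // Unknown charset or malformed text: use the fallback below.
            }
        }
        return new String(rawValue, fallback);
    }

    // Decodes the "Q" encoding: '_' is a space, "=XX" is a hex-encoded byte.
    static byte[] decodeQ(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                out.write(' ');
            } else if (c == '=') {
                if (i + 2 >= text.length()) {
                    throw new IllegalArgumentException("truncated escape");
                }
                out.write(Integer.parseInt(text.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }
}
```

With this approach a spec-conforming client is decoded correctly whatever the request charset is, while a browser that sends raw bytes in the page encoding keeps working as long as the fallback charset matches what the browser used.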
                
> [FileUpload] uploading files with non-ASCII filenames
> -----------------------------------------------------
>
>                 Key: FILEUPLOAD-56
>                 URL: https://issues.apache.org/jira/browse/FILEUPLOAD-56
>             Project: Commons FileUpload
>          Issue Type: Bug
>    Affects Versions: Nightly Builds
>         Environment: Operating System: All
> Platform: All
>            Reporter: joachim.gjesdal
>         Attachments: ASF.LICENSE.NOT.GRANTED--encoding.diff, ASF.LICENSE.NOT.GRANTED--fileupload_filenamencoding, ASF.LICENSE.NOT.GRANTED--FileUpload.java, ASF.LICENSE.NOT.GRANTED--fileupload.txt, ASF.LICENSE.NOT.GRANTED--MultipartStream.java, ASF.LICENSE.NOT.GRANTED--MultipartStream.java.patch
>
>
> FileUpload does not encode the filename. Uploading files like münchen.jpeg
> results in a bogus file names.
> (added this to Sandbox component since FileUpload not listed)
