Posted to dev@commons.apache.org by Raphaël Piéroni <ra...@yahoo.fr> on 2003/10/21 17:04:41 UTC

[file upload] Bug in the boundary decoding?

Hello,

I have found a bug in the implementation of the
Jakarta Commons FileUpload library.

RFC 2046 (http://www.pro-net.co.uk/site/mod_rfc/24/rfc/2046),
section 5.1.1 "Common Syntax", contains the following text:

"
   WARNING TO IMPLEMENTORS:  The grammar for parameters on the
   Content-type field is such that it is often necessary to enclose the
   boundary parameter values in quotes on the Content-type line.  This
   is not always necessary, but never hurts. Implementors should be sure
   to study the grammar carefully in order to avoid producing invalid
   Content-type fields.  Thus, a typical "multipart" Content-Type header
   field might look like this:

     Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08j34c0p

   But the following is not valid:

     Content-Type: multipart/mixed; boundary=gc0pJq0M:08jU534c0p

   (because of the colon) and must instead be represented as

     Content-Type: multipart/mixed; boundary="gc0pJq0M:08jU534c0p"

   This Content-Type value indicates that the content consists of one or
   more parts, each with a structure that is syntactically identical to
   an RFC 822 message, except that the header area is allowed to be
   completely empty, and that the parts are each preceded by the line

     --gc0pJq0M:08jU534c0p

   The boundary delimiter MUST occur at the beginning of a line, i.e.,
   following a CRLF, and the initial CRLF is considered to be attached
   to the boundary delimiter line rather than part of the preceding
   part.  The boundary may be followed by zero or more characters of
   linear whitespace. It is then terminated by either another CRLF and
   the header fields for the next part, or by two CRLFs, in which case
   there are no header fields for the next part.  If no Content-Type
   field is present it is assumed to be "message/rfc822" in a
   "multipart/digest" and "text/plain" otherwise.
"

However, the FileUploadBase class contains the following code:
"
324             int boundaryIndex = contentType.indexOf("boundary=");
325             if (boundaryIndex < 0)
326             {
327                 throw new FileUploadException(
328                         "the request was rejected because "
329                         + "no multipart boundary was found");
330             }
331             byte[] boundary = contentType.substring(
332                     boundaryIndex + 9).getBytes();
333 
334             InputStream input = req.getInputStream();
335 
336             MultipartStream multi = new MultipartStream(input, boundary);
"

Lines 331 and 332 in particular seem to ignore this point completely:
everything after "boundary=" is used as the boundary, including any
enclosing quotes.
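
To make the problem concrete, here is a minimal sketch that mirrors the
substring logic of lines 331/332 outside the library. The class and
method names (NaiveBoundaryDemo, naiveBoundary) are hypothetical and
exist only for this illustration; the header and delimiter values are
the ones from the RFC excerpt above.

```java
final class NaiveBoundaryDemo {

    // Mirrors lines 331/332: everything after "boundary=" is taken verbatim.
    static String naiveBoundary(String contentType) {
        int boundaryIndex = contentType.indexOf("boundary=");
        if (boundaryIndex < 0) {
            throw new IllegalArgumentException("no multipart boundary was found");
        }
        return contentType.substring(boundaryIndex + 9);
    }

    public static void main(String[] args) {
        // Quoted form that RFC 2046 requires because of the colon.
        String header = "multipart/mixed; boundary=\"gc0pJq0M:08jU534c0p\"";
        // Delimiter line as it appears in the message body.
        String delimiterInBody = "--gc0pJq0M:08jU534c0p";

        String searchedFor = "--" + naiveBoundary(header);
        System.out.println(searchedFor);                          // --"gc0pJq0M:08jU534c0p"
        System.out.println(searchedFor.equals(delimiterInBody));  // false
    }
}
```

The extracted value still carries both quotes, so a delimiter built from
it cannot match the delimiter line that the sender actually wrote.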

Should I file a bug in Bugzilla?
Should I submit a patch?

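The proposed patch: replace lines 331/332 with:
"
byte[] boundary;
// Like the original code, this assumes boundary is the last parameter.
// Strip the enclosing quotes when the boundary value is quoted.
if (contentType.length() > boundaryIndex + 10 &&
    contentType.charAt(boundaryIndex + 9) == '"' &&
    contentType.charAt(contentType.length() - 1) == '"')
{
    // The end index of substring() is exclusive, so length() - 1
    // drops only the closing quote.
    boundary = contentType.substring(
            boundaryIndex + 10, contentType.length() - 1).getBytes();
} else {
    boundary = contentType.substring(boundaryIndex + 9).getBytes();
}
"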
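
A slightly more defensive variant could also cope with parameters that
follow the boundary and with a differently cased parameter name. The
sketch below is only an illustration under those assumptions, not the
library's code: BoundaryParser, extractBoundary and boundaryBytes are
hypothetical names, and it does not handle backslash escapes inside
quoted strings or whitespace around the "=" sign.

```java
import java.nio.charset.StandardCharsets;

final class BoundaryParser {

    private static final String KEY = "boundary=";

    // Case-insensitive search for KEY; returns -1 when it is absent.
    private static int indexOfKey(String s) {
        for (int i = 0; i + KEY.length() <= s.length(); i++) {
            if (s.regionMatches(true, i, KEY, 0, KEY.length())) {
                return i;
            }
        }
        return -1;
    }

    // Returns the boundary without enclosing quotes, or null if it is
    // missing or the quoted value is not terminated.
    static String extractBoundary(String contentType) {
        int idx = indexOfKey(contentType);
        if (idx < 0) {
            return null;
        }
        int start = idx + KEY.length();
        if (start >= contentType.length()) {
            return null;
        }
        if (contentType.charAt(start) == '"') {
            // Quoted value: take everything up to the closing quote.
            int end = contentType.indexOf('"', start + 1);
            return end < 0 ? null : contentType.substring(start + 1, end);
        }
        // Unquoted value: it ends at the next ';' or at the end of the header.
        int end = contentType.indexOf(';', start);
        if (end < 0) {
            end = contentType.length();
        }
        return contentType.substring(start, end).trim();
    }

    // Convenience wrapper producing the bytes a multipart parser would need.
    static byte[] boundaryBytes(String contentType) {
        String boundary = extractBoundary(contentType);
        if (boundary == null || boundary.isEmpty()) {
            throw new IllegalArgumentException("no multipart boundary was found");
        }
        // RFC 2046 restricts boundary characters to a subset of US-ASCII.
        return boundary.getBytes(StandardCharsets.US_ASCII);
    }
}
```

With this sketch, both the quoted and the unquoted Content-Type examples
from the RFC excerpt yield a boundary without quotes, and a header such
as "multipart/form-data; boundary=abc; charset=UTF-8" yields "abc".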
