Posted to dev@struts.apache.org by Mark Takacs <ta...@coscend.com> on 2001/08/04 01:43:14 UTC
File Upload - \n inserts in DiskMultiPart
We ran into this bug when using Struts uploads on a Microsoft Excel
spreadsheet that we are trying to run a conversion utility (xlHtml) on.
The Excel conversion was failing when done via upload, but not when
done via command line. After some sleuthing, it looks like Struts is
clobbering the last char of each bufferSize-sized chunk (bufferSize is
set in web.xml) with a "\n" char.
I added some details to an existing bug covering this problem.
-tak
------
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=2503
Using struts to handle uploaded files, if the files contain lines > 4k (or the
file is binary), the file data gets \n characters inserted at the 4k boundaries
of the long lines.
/------- Additional Comments From Tak <ma...@pacbell.net>
2001-08-03 16:33 -------/
This seems very clear-cut. At least the error, in any case. I'm looking at the
1.0 final codebase.
The 4096 limit comes directly from the (default) value in web.xml
<init-param>
<param-name>bufferSize</param-name>
<param-value>4096</param-value>
</init-param>
Uploading a (binary) file and doing a cmp results in the following:
cmp -l /tmp/strts27203.tmp ~/myBinaryFileTest.xls
Byte# Oct Oct
4097 12 57
8194 12 144
12291 12 142
16388 12 145
20485 12 156
24582 12 147
28679 12 55
71584 12 0
This says that roughly every 4096 bytes, a linefeed (octal 12) is being inserted instead
of the original data (last column).
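For anyone without cmp handy, the same comparison can be sketched in Java. This is only an illustration: the class name ByteDiff and its diff method are made up for this example and are not part of Struts. It prints the 1-based offset and the two byte values in octal, in the same column order as cmp -l above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Struts: mimics the "cmp -l" listing.
class ByteDiff {

    // Returns one "offset first second" line per differing byte.
    // Offsets are 1-based and byte values are printed in octal, as cmp -l does.
    // Only the common prefix of the two arrays is compared.
    static List<String> diff(byte[] first, byte[] second) {
        List<String> lines = new ArrayList<>();
        int n = Math.min(first.length, second.length);
        for (int i = 0; i < n; i++) {
            if (first[i] != second[i]) {
                lines.add(String.format("%d %o %o",
                        i + 1, first[i] & 0xFF, second[i] & 0xFF));
            }
        }
        return lines;
    }

    // Usage: java ByteDiff <uploaded temp file> <original file>
    public static void main(String[] args) throws IOException {
        byte[] first = Files.readAllBytes(Paths.get(args[0]));
        byte[] second = Files.readAllBytes(Paths.get(args[1]));
        for (String line : diff(first, second)) {
            System.out.println(line);
        }
    }
}
```

A linefeed that replaced an original byte shows up as 12 in the middle column, as in the listing above.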
upload.DiskMultipartRequestHandler.java seems like a good place to start. It pulls the
value (4096) out of the config file and (?) cuts up the input into bufferSize'd
chunks which are stored in a Hashtable.
Whatever is writing that hashtable back to disk is replacing the last char of
each hashtable value with a \n and writing the file out. Or maybe the first char of the next block?
I'm hoping that bumping the config file value of bufferSize is an acceptable
workaround for now...
--------
Re: File Upload - \n inserts in DiskMultiPart
Posted by Mark Takacs <ta...@coscend.com>.
I found an (ugly) workaround for this problem.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=2503
Index: MultipartIterator.java
===================================================================
RCS file:
/home/cvspublic/jakarta-struts/src/share/org/apache/struts/upload/MultipartIterator.java,v
retrieving revision 1.13.2.1
diff -c -2 -r1.13.2.1 MultipartIterator.java
*** MultipartIterator.java 2001/06/14 01:11:28 1.13.2.1
--- MultipartIterator.java 2001/08/06 18:53:06
***************
*** 34,40 ****
/**
! * The maximum size in bytes of the buffer used to read lines [4K]
*/
! public static int MAX_LINE_SIZE = 4096;
/**
--- 34,40 ----
/**
! * The maximum size in bytes of the buffer used to read lines [64K]
*/
! public static int MAX_LINE_SIZE = 65536;
/**
This isn't really a fix, as it just avoids the buggy codepath in
MultipartIterator.createLocalFile(). Here's the comment I put in our
local (hacked) version.
+ // The code for this cutCarriage and cutNewline
+ // mangles binary files where the "line length" is greater
+ // than the MAX_LINE_SIZE. Note that not all binary files
+ // have lines longer than MAX_LINE_SIZE -- most jpgs (for
+ // instance) consist of many small 'lines' of binary data,
+ // which avoids the cutXX codepath. Microsoft Excel
+ // spreadsheets, on the other hand, contain HUGE single line
+ // blobs of data (29k), triggering the strange cuts.
+ //
+ // http://nagoya.apache.org/bugzilla/show_bug.cgi?id=2503
If someone a bit more parser-savvy could take a look at the block of
parser code that sniffs out newlines and tries to remove/add them, that
would be a real fix. Do you even need to do that for binary files?
Maybe cutCarriage/cutNewline should be completely skipped for binary files.
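To show what "skipping the line handling" could look like, here is a minimal sketch of a copy loop that treats the data as raw bytes and never looks for \r or \n. It is not the MultipartIterator code and it ignores multipart boundary detection entirely, which a real fix would still have to handle; the class name RawCopy and both method names are assumptions made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch, not Struts code: a chunked copy that never
// inspects or rewrites the bytes it moves.
class RawCopy {

    // Copies everything from in to out in chunks of at most bufferSize bytes.
    // Only the bytes actually read are written, so chunk edges are untouched.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // Convenience wrapper for in-memory data, handy for a quick check.
    static byte[] copyBytes(byte[] data, int bufferSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(data), out, bufferSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

With a 4096-byte buffer, a 29k single-"line" blob should come out byte-for-byte identical, since nothing is added or cut at the chunk edges.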
-tak