Posted to dev@poi.apache.org by bu...@apache.org on 2003/06/04 17:06:34 UTC
DO NOT REPLY [Bug 20060] -
[PATCH] HDF text extraction patch
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=20060
[PATCH] HDF text extraction patch
------- Additional Comments From thierry.guerin@prima-solutions.com 2003-06-04 15:06 -------
I've been working on the exact same thing and came up with different fixes
that lead to the same result, but without having to remove
"findFormatting" from the WordDocument class. I have now merged Serge's
patch with mine. The differences between Serge's modifications and mine are:
Utils.convertBytesToShort: patch to avoid an ArrayIndexOutOfBoundsException.
WordDocument.printTable: patch to avoid a NullPointerException.
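
For illustration only, here is a minimal sketch of the kind of bounds test meant for convertBytesToShort. It is not the code from the attached diff: the class name ByteUtilSketch, the (byte[], int) signature, and the choice to return 0 when fewer than two bytes remain are all assumptions made for this example. The little-endian byte order matches how Word files store 16-bit values.

```java
public final class ByteUtilSketch {
    private ByteUtilSketch() {}

    /**
     * Reads a little-endian 16-bit value starting at the given offset.
     * Hypothetical fallback: returns 0 instead of throwing when the array
     * is null or fewer than two bytes are available at that offset.
     */
    public static short convertBytesToShort(byte[] array, int offset) {
        // Guard against reads that would run past either end of the array.
        if (array == null || offset < 0 || offset > array.length - 2) {
            return 0;
        }
        int low = array[offset] & 0xFF;       // least significant byte first
        int high = array[offset + 1] & 0xFF;  // most significant byte second
        return (short) ((high << 8) | low);
    }

    public static void main(String[] args) {
        byte[] data = {0x34, 0x12};
        System.out.println(convertBytesToShort(data, 0)); // prints 4660 (0x1234)
        System.out.println(convertBytesToShort(data, 1)); // prints 0 (only one byte left)
    }
}
```

Returning a default value keeps extraction going on a damaged or unexpected file, but it can also hide real corruption, so a stricter variant might log or rethrow instead.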
As of now, the only Word documents that refuse to parse are the ones that
throw the "Invalid header signature" error (see bug 11506 for the files). I
may look into this in the future, but I have no time to do so for now.
Following this message you will find the resulting CVS Diff.
Please bear in mind that my modifications, though working, are based only on
fixes that seemed logical from a programming point of view (tests to avoid
ArrayIndexOutOfBoundsExceptions, etc.). I have _no_ knowledge of the Word file
format and may have done something stupid in the process.