Posted to dev@poi.apache.org by bu...@apache.org on 2008/04/23 09:19:05 UTC

DO NOT REPLY [Bug 44857] New: Problem parsing Esc

https://issues.apache.org/bugzilla/show_bug.cgi?id=44857

           Summary: Problem parsing Esc
           Product: POI
           Version: 3.0
          Platform: PC
        OS/Version: Windows Vista
            Status: NEW
          Severity: major
          Priority: P2
         Component: HSSF
        AssignedTo: dev@poi.apache.org
        ReportedBy: trejkaz@trypticon.org


There is a particular test Excel file which we have unit tests for, which isn't
working in POI 3.0.1 or 3.0.2 (although it works in our custom 3.0.1 branch,
and I can't figure out why).

The file itself is complicated, but I have managed to isolate the problem in a
single container record. Simple test exhibiting the problem:

    public void testEscher() throws Exception
    {
        byte[] data = FileUtils.readFileToByteArray(new File("D:\\temp\\container.dat"));
        EscherContainerRecord record = new EscherContainerRecord();
        record.fillFields(data, 0, new DefaultEscherRecordFactory());
    }

This throws:

java.lang.OutOfMemoryError: Java heap space
        at org.apache.poi.ddf.UnknownEscherRecord.fillFields(UnknownEscherRecord.java:76)
        at org.apache.poi.ddf.EscherContainerRecord.fillFields(EscherContainerRecord.java:56)
        at org.apache.poi.ddf.EscherContainerRecord.fillFields(EscherContainerRecord.java:56)


I tracked it down to an EscherMetafileBlip underneath EscherBSERecord. 
EscherBSERecord is assuming that getRecordSize() will be consistent with its
own bytesRemaining value and this is not the case -- there are supposed to be
1125 bytes after the header but field_5_cbSave is only 163.
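
A size mismatch like this can leave the parser misaligned, so that arbitrary
bytes are later read as a record length and used to size an allocation. The
sketch below shows one defensive check, assuming the usual 8-byte Escher record
header (2 bytes of options, 2 bytes of record id, 4-byte little-endian length).
The class EscherLengthCheck and its methods are hypothetical helpers written
for illustration; they are not part of POI and not the fix that was applied.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not part of POI.
class EscherLengthCheck {
    /** Common record header: options (2 bytes), record id (2 bytes), length (4 bytes). */
    static final int HEADER_SIZE = 8;

    /** Reads the little-endian 32-bit length field of the header starting at offset. */
    static int declaredLength(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset + 4, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
    }

    /** Returns the declared payload length, or throws if the buffer cannot hold it. */
    static int checkedLength(byte[] data, int offset) {
        if (offset < 0 || data.length - offset < HEADER_SIZE) {
            throw new IllegalArgumentException("Truncated header at offset " + offset);
        }
        int length = declaredLength(data, offset);
        int available = data.length - offset - HEADER_SIZE;
        if (length < 0 || length > available) {
            throw new IllegalArgumentException("Record at offset " + offset
                    + " declares " + length + " bytes but only " + available + " remain");
        }
        return length;
    }

    /** Builds a header with the given length field, followed by payloadSize zero bytes. */
    static byte[] sampleRecord(int declared, int payloadSize) {
        byte[] data = new byte[HEADER_SIZE + payloadSize];
        ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN)
                .putShort((short) 0)   // options (placeholder value)
                .putShort((short) 0)   // record id (placeholder value)
                .putInt(declared);     // payload length
        return data;
    }

    public static void main(String[] args) {
        // A consistent record passes the check.
        System.out.println(checkedLength(sampleRecord(1125, 1125), 0));
        // A garbage length is rejected before anything is allocated for it.
        try {
            checkedLength(sampleRecord(0x7FFFFFF0, 16), 0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast with a descriptive exception points at the inconsistent record
itself, rather than surfacing later as an OutOfMemoryError somewhere else.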

But from there I can't say whether it's a trivial fix or not.  The code in our
real unit test asserts an MD5 for the uncompressed metafile -- if I rewrite
EscherMetafileBlip to read the whole thing then it avoids the exception but the
MD5 still fails.  Problem is, I don't know whether the MD5 was wrong the whole
time, due to some other obscure bug.
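
For reference, a check of this kind can be reproduced outside POI. The sketch
below assumes the metafile payload is a zlib stream and that only the first
`length` bytes (the role played by field_5_cbSave) belong to it. BlipDigest and
its methods are hypothetical names, and the sample data is synthetic rather
than taken from container.dat.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper for illustration only; not part of POI.
class BlipDigest {
    /** Inflates the zlib stream stored in data[offset, offset + length). */
    static byte[] inflate(byte[] data, int offset, int length) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(data, offset, length);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("Compressed stream is incomplete");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Not a valid zlib stream", e);
        } finally {
            inflater.end();
        }
    }

    /** Returns the MD5 digest of bytes as lowercase hex. */
    static String md5Hex(byte[] bytes) {
        try {
            StringBuilder sb = new StringBuilder();
            for (byte b : MessageDigest.getInstance("MD5").digest(bytes)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Compresses raw with zlib; used only to build synthetic test input. */
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "synthetic metafile bytes".getBytes(StandardCharsets.US_ASCII);
        byte[] packed = deflate(original);
        // Simulate a record whose available space exceeds the stored size.
        byte[] padded = Arrays.copyOf(packed, packed.length + 100);
        byte[] restored = inflate(padded, 0, packed.length);
        System.out.println(md5Hex(restored).equals(md5Hex(original)));
    }
}
```

Comparing digests computed this way against the one asserted in a unit test
can help tell a parsing problem from a stale expected value, although it
cannot say which of the two is at fault.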

Someone who knows more about EscherMetafileBlip would probably be able to say
whether the simple and obvious fix is applicable here.


-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


DO NOT REPLY [Bug 44857] Problem parsing Escher records, OutOfMemoryError from UnknownEscherRecord.fillFields

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=44857


Trejkaz <tr...@trypticon.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Problem parsing Esc         |Problem parsing Escher
                   |                            |records, OutOfMemoryError
                   |                            |from
                   |                            |UnknownEscherRecord.fillFiel
                   |                            |ds






DO NOT REPLY [Bug 44857] Problem parsing Esc

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=44857





--- Comment #1 from Trejkaz <tr...@trypticon.org>  2008-04-23 00:19:53 PST ---
Created an attachment (id=21846)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=21846)
container.dat

Here's the container record by itself, should be better for testing as the real
thing contains quite a bit more rubbish...




DO NOT REPLY [Bug 44857] Problem parsing Escher records, OutOfMemoryError from UnknownEscherRecord.fillFields

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=44857


Nick Burch <ni...@torchbox.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




--- Comment #3 from Nick Burch <ni...@torchbox.com>  2008-04-27 10:59:24 PST ---
Thanks for the patch, file and testcase. The patch has been applied to trunk.

In terms of what has changed, "svn log" and "svn blame" are probably your
friends here :)




DO NOT REPLY [Bug 44857] Problem parsing Esc

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=44857





--- Comment #2 from Trejkaz <tr...@trypticon.org>  2008-04-23 00:27:17 PST ---
Created an attachment (id=21847)
 --> (https://issues.apache.org/bugzilla/attachment.cgi?id=21847)
proposed fix, but probably dodgy

Attaching proposed fix.  Results in consistency, but like I said my MD5 from
before is different. :-/

It could be that my previous test was wrong.

But why would it declare the stored size as only 10% of the available space? 
It's almost as if the thing isn't actually compressed and yet the size field is
still recording the compressed size, but that would be ludicrous.  Maybe all
the extra space is simply padding?
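
One way to probe the padding hypothesis, assuming the payload is a zlib
stream, is to hand the inflater all of the available bytes and ask how many it
actually consumed. The sketch below does that with java.util.zip.Inflater.
CompressedExtent and its methods are hypothetical names, and the input is
synthetic rather than the attached container.dat.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper for illustration only; not part of POI.
class CompressedExtent {
    /**
     * Returns how many input bytes the zlib stream starting at offset occupies,
     * or -1 if the bytes do not form a complete, valid stream.
     */
    static int compressedExtent(byte[] data, int offset, int available) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(data, offset, available);
            byte[] sink = new byte[4096];   // the output itself is discarded
            while (!inflater.finished()) {
                int n = inflater.inflate(sink);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    return -1;              // ran out of input before the stream ended
                }
            }
            return inflater.getTotalIn();   // bytes consumed, checksum included
        } catch (DataFormatException e) {
            return -1;
        } finally {
            inflater.end();
        }
    }

    /** Compresses raw with zlib; used only to build synthetic test input. */
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = new byte[1000];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = (byte) (i % 7);
        }
        byte[] packed = deflate(raw);
        // Stored stream followed by 900 bytes of zero padding.
        byte[] padded = Arrays.copyOf(packed, packed.length + 900);
        System.out.println(compressedExtent(padded, 0, padded.length) == packed.length);
    }
}
```

If the consumed count matches the stored size, the rest of the space is at
least not part of the compressed stream. Whether it is padding or something
else would still need to be checked against the format documentation.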

And the real mystery, how could our local branch of 3.0.1 work, when 3.0.1
itself does not, and yet this file has not changed?  Did the record reading
code previously use the bytesRemaining return value instead of getRecordSize()?

