Posted to dev@poi.apache.org by bu...@apache.org on 2017/07/13 19:31:51 UTC

[Bug 61295] New: Vector.read -- Java heap space on corrupt file

https://bz.apache.org/bugzilla/show_bug.cgi?id=61295

            Bug ID: 61295
           Summary: Vector.read -- Java heap space on corrupt file
           Product: POI
           Version: 3.16-FINAL
          Hardware: PC
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HPSF
          Assignee: dev@poi.apache.org
          Reporter: tallison@mitre.org
  Target Milestone: ---

Created attachment 35128
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35128&action=edit
triggering file

I started experimenting with randomly corrupting files based on feedback from
Luis Filipe Nassif [1].  The attached file triggers this:

java.lang.OutOfMemoryError: Java heap space

        at org.apache.poi.hpsf.Vector.read(Vector.java:43)
        at org.apache.poi.hpsf.TypedPropertyValue.readValue(TypedPropertyValue.java:219)
        at org.apache.poi.hpsf.VariantSupport.read(VariantSupport.java:174)
        at org.apache.poi.hpsf.Property.<init>(Property.java:179)
        at org.apache.poi.hpsf.MutableProperty.<init>(MutableProperty.java:53)
        at org.apache.poi.hpsf.Section.<init>(Section.java:237)
        at org.apache.poi.hpsf.MutableSection.<init>(MutableSection.java:41)
        at org.apache.poi.hpsf.PropertySet.init(PropertySet.java:494)
        at org.apache.poi.hpsf.PropertySet.<init>(PropertySet.java:196)
        at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:83)
        at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:74)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:155)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)


[1] https://issues.apache.org/jira/browse/TIKA-2428?focusedCommentId=16086045&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16086045

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 61295] Vector.read -- Java heap space on corrupt file

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61295

Dominik Stadler <do...@gmx.at> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All



[Bug 61295] Vector.read -- Java heap space on corrupt file

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61295

Tim Allison <ta...@mitre.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Tim Allison <ta...@mitre.org> ---
r1802879

I didn't add a test file because I didn't think the test was worth 65 KB.

I can look for a shorter triggering file if necessary.


[Bug 61295] Vector.read -- Java heap space on corrupt file

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=61295

--- Comment #1 from Tim Allison <ta...@mitre.org> ---
The actual Vector size that is causing an OOM in Tika is 1,358,954,497 on one
triggering file.  We could arbitrarily set a max_value << Integer.MAX_VALUE, or
we could use a list and then convert that to an array.  If we do the latter,
and there is a corrupt size value, the LittleEndianInputStream will throw an
exception when asked to read beyond what is available in the stream.

I somewhat prefer the second option.  Commit on the way...

Happy to go with the first or open to other options...
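
For illustration only, here is a minimal sketch of the second option (read
into a list, then convert to an array).  It is not the code committed for
this bug and it does not use POI classes: the class name LazyVectorRead and
the helper readInts are hypothetical, and java.io.DataInputStream stands in
for LittleEndianInputStream.  The point is that a corrupt length no longer
drives an up-front allocation; the read fails once the stream runs out.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class LazyVectorRead {

    // Hypothetical helper (not a POI API): reads declaredLength little-endian
    // 32-bit values without allocating an array of that size up front.
    public static int[] readInts(InputStream stream, long declaredLength) throws IOException {
        DataInputStream in = new DataInputStream(stream);
        List<Integer> values = new ArrayList<>();
        for (long i = 0; i < declaredLength; i++) {
            // readInt() throws EOFException if fewer than 4 bytes remain,
            // so a bogus length fails here instead of exhausting the heap.
            values.add(Integer.reverseBytes(in.readInt()));
        }
        // Convert to an array only after every element was actually read.
        int[] result = new int[values.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = values.get(i);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] twoInts = {1, 0, 0, 0, 2, 0, 0, 0};

        // Well-formed input: the declared length matches the data.
        int[] ok = readInts(new ByteArrayInputStream(twoInts), 2);
        System.out.println(ok[0] + ", " + ok[1]);

        // Corrupt input: the length from this report, but only 8 bytes of data.
        try {
            readInts(new ByteArrayInputStream(twoInts), 1_358_954_497L);
        } catch (EOFException e) {
            System.out.println("corrupt length rejected");
        }
    }
}
```

The trade-off against the first option (a fixed maximum) is that memory use
grows only with the data actually present in the stream, at the cost of
boxing and one extra copy.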
