You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@poi.apache.org by bu...@apache.org on 2018/03/20 21:51:47 UTC

[Bug 62201] New: Zip Bomb ratio: Fail fast and/or round the ratio before comparison

https://bz.apache.org/bugzilla/show_bug.cgi?id=62201

            Bug ID: 62201
           Summary: Zip Bomb ratio: Fail fast and/or round the ratio
                    before comparison
           Product: POI
           Version: unspecified
          Hardware: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XSSF
          Assignee: dev@poi.apache.org
          Reporter: grapeguy@gmail.com
  Target Milestone: ---

Created attachment 35793
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35793&action=edit
Stack trace from zip bomb failure

I updated from Poi 3.9 to 3.16. My customers are now reporting issues with
inability to read Excel file due to the zip bomb fix. In all instances, our
software is able to open the file (new XSSFReader(OPCPackage.open(file))) and
read the sheets
(XSSFReader.getSheetsData.asInstanceOf[XSSFReader.SheetIterator]) without
receiving an exception. 

The "Zip Bomb" exception appears to occur when we attempt to call
XSSFReader.getStylesTable(). 

The ultimate error message: 
>java.io.IOException: Zip bomb detected! The file would exceed the max. ratio of compressed file size to the size of the expanded data. This may indicate that the file is used to inflate memory usage and thus could pose a security risk. You can adjust this limit via ZipSecureFile.setMinInflateRatio() if you need to work with files which exceed this limit. Counter: 3277797, cis.counter: 32768, ratio: 0.009996958322922377 Limits: MIN_INFLATE_RATIO: 0.01

Options: 
1. Fail faster? 
It would be beneficial to fail with the Zip Bomb error when instantiating the
XSSFReader through a validation step rather than waiting until we try to access
some of the embedded data? This is certainly an unexpected issue since our
upgrade to 3.16.

2. Round the calculated compression ratio
In every customer-reported situation we've identified the following case:
ratio: 0.00999.... Limits: MIN_INFLATE_RATIO: 0.01 
Should we be rounding the calculated compression ratio (0.0099...) to 2 decimal
places? Would that reduce the security of this? 

Exception: attached

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 62201] Zip Bomb ratio: Fail fast and/or round the ratio before comparison

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62201

--- Comment #2 from Dominik Stadler <do...@gmx.at> ---
1. This would cause performance issues as we would need to read and parse every
embedded item, for some files this could be substantial. Probably best if you
use a small wrapper function where you do this if it is useful for you.

2. We continually compute the compression ratio while uncompressing the stream
and stop as soon as the ratio goes beyond the limit. So you always will see a
number close to the limit here. We do not read the whole stream and thus do not
compute the actual ratio of the whole file before failing as it is the whole
point of this to not read malicious files completely.

We can try to improve the error message to make this a bit clearer.

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 62201] Zip Bomb ratio: Fail fast and/or round the ratio before comparison

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=62201

Andreas Beeker <ki...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All

--- Comment #1 from Andreas Beeker <ki...@apache.org> ---
I wonder where the "duplicated" styles come from, i.e. there must be a reason
why the compression ratio is so high - are the styles in that excel file
generated? ... and maybe a new style is created for each cell?

at option 1.: I don't think that eagerly fetching shared strings and styles
table is a good idea - I actually would prefer the opposite ... something like
a lazy-loading mechanism inside of the content, e.g. a table which successively
fills when it's elements are iterated over

at option 2. I don't get the rounding advantage - how about setting the limit
to 0.005?

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org