Posted to dev@poi.apache.org by bu...@apache.org on 2023/08/03 14:15:48 UTC

[Bug 66840] New: zip attack

https://bz.apache.org/bugzilla/show_bug.cgi?id=66840

            Bug ID: 66840
           Summary: zip attack
           Product: POI
           Version: unspecified
          Hardware: All
            Status: NEW
          Severity: critical
          Priority: P2
         Component: XSSF
          Assignee: dev@poi.apache.org
          Reporter: biandeqiang@huawei.com
  Target Milestone: ---

I am using this API: WorkbookFactory.create(InputStream input).

We expose an interface that accepts a file input stream with a maximum size of
1 MB. An attacker can craft a file that is only 1 MB but expands to around 1 GB
or more. This caused an OOM in our service. See the dump capture below.


Object / Stack Frame
----------------------------------------------------------------------
at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)
at org.apache.commons.io.IOUtils.byteArray(I)[B (IOUtils.java:338)
at org.apache.commons.io.output.AbstractByteArrayOutputStream.toByteArrayImpl()[B (AbstractByteArrayOutputStream.java:365)
at org.apache.commons.io.output.UnsynchronizedByteArrayOutputStream.toByteArray()[B (UnsynchronizedByteArrayOutputStream.java:147)
at org.apache.poi.util.IOUtils.toByteArray(Ljava/io/InputStream;IIZZ)[B (IOUtils.java:256)
at org.apache.poi.util.IOUtils.toByteArray(Ljava/io/InputStream;II)[B (IOUtils.java:203)
at org.apache.poi.openxml4j.util.ZipArchiveFakeEntry.<init>(Lorg/apache/commons/compress/archivers/zip/ZipArchiveEntry;Ljava/io/InputStream;)V (ZipArchiveFakeEntry.java:82)
at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(Lorg/apache/poi/openxml4j/util/ZipArchiveThresholdInputStream;)V (ZipInputStreamZipEntrySource.java:98)
at org.apache.poi.openxml4j.opc.ZipPackage.<init>(Ljava/io/InputStream;Lorg/apache/poi/openxml4j/opc/PackageAccess;)V (ZipPackage.java:132)
at org.apache.poi.openxml4j.opc.OPCPackage.open(Ljava/io/InputStream;)Lorg/apache/poi/openxml4j/opc/OPCPackage; (OPCPackage.java:312)
at org.apache.poi.xssf.usermodel.XSSFWorkbookFactory.create(Ljava/io/InputStream;)Lorg/apache/poi/xssf/usermodel/XSSFWorkbook; (XSSFWorkbookFactory.java:97)
at org.apache.poi.xssf.usermodel.XSSFWorkbookFactory.create(Ljava/io/InputStream;)Lorg/apache/poi/ss/usermodel/Workbook; (XSSFWorkbookFactory.java:36)
at org.apache.poi.ss.usermodel.WorkbookFactory.lambda$create$2(Ljava/io/InputStream;Lorg/apache/poi/ss/usermodel/WorkbookProvider;)Lorg/apache/poi/ss/usermodel/Workbook; (WorkbookFactory.java:224)
at org.apache.poi.ss.usermodel.WorkbookFactory$$Lambda$1051.create(Lorg/apache/poi/ss/usermodel/WorkbookProvider;)Lorg/apache/poi/ss/usermodel/Workbook; (Unknown Source)
at org.apache.poi.ss.usermodel.WorkbookFactory.wp(Lorg/apache/poi/poifs/filesystem/FileMagic;Lorg/apache/poi/ss/usermodel/WorkbookFactory$ProviderMethod;)Lorg/apache/poi/ss/usermodel/Workbook; (WorkbookFactory.java:329)
at org.apache.poi.ss.usermodel.WorkbookFactory.create(Ljava/io/InputStream;Ljava/lang/String;)Lorg/apache/poi/ss/usermodel/Workbook; (WorkbookFactory.java:224)
at org.apache.poi.ss.usermodel.WorkbookFactory.create(Ljava/io/InputStream;)Lorg/apache/poi/ss/usermodel/Workbook; (WorkbookFactory.java:185)
----------------------------------------------------------------------

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 66840] zip attack

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66840

--- Comment #9 from cnj_0304 <cn...@qq.com> ---
(In reply to PJ Fanning from comment #8)
> You will need to provide a file that demos an issue.



Sorry, our company prohibits uploading files to external sites.
You can construct such an archive yourself with the following logic:
Every file has the same content, for example aaaaa..., and no file exceeds
100 KB. There are more than 10500 files. Their total size is about 1 GB, but
the compressed archive is only 2008 KB.
In our service environment, uploads are limited to 2 MB, but the service has
only 500 MB of memory. When the service decompresses the ZIP package, the
compression ratio check returns early for every entry, so the ratio is never
evaluated and the service runs out of memory.
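
The construction described above can be reproduced at a reduced scale for test
purposes. The following is only a sketch using java.util.zip; the class name,
entry names and the counts in main are made up for illustration and are not
taken from the reporter's file.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class SmallEntryArchiveDemo {

    /** Builds a zip in memory with entryCount entries, each entrySize bytes of 'a'. */
    static byte[] build(int entryCount, int entrySize) throws IOException {
        byte[] payload = new byte[entrySize];
        Arrays.fill(payload, (byte) 'a');
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            for (int i = 0; i < entryCount; i++) {
                // Hypothetical entry names; every entry stays below 100 KB.
                zos.putNextEntry(new ZipEntry("part" + i + ".txt"));
                zos.write(payload);
                zos.closeEntry();
            }
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        int entryCount = 100;        // reduced scale; the report mentions more than 10500
        int entrySize = 99 * 1024;   // just under 100 KB per entry
        byte[] zip = build(entryCount, entrySize);
        System.out.println("compressed bytes: " + zip.length);
        System.out.println("expanded bytes:   " + (long) entryCount * entrySize);
    }
}
```

Because every entry is highly repetitive, the archive stays small while the
expanded total grows linearly with the number of entries.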



[Bug 66840] zip attack

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66840

--- Comment #6 from PJ Fanning <fa...@yahoo.com> ---
If you have lots of cells with values then you will use lots of memory. We have
some protections against malicious inputs that might try to fool POI into
creating large arrays but at the end of the day, if you have enough data in the
xlsx, POI will run into issues holding it all in memory.

The stacktrace in the description indicates that POI probably didn't even get
as far as parsing the XML and creating the XMLBeans.

org.apache.poi.openxml4j.opc.ZipPackage.setUseTempFilePackageParts(true) might
help a little.



[Bug 66840] zip attack

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66840

PJ Fanning <fa...@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INFORMATIONPROVIDED

--- Comment #2 from PJ Fanning <fa...@yahoo.com> ---
My strong recommendation is to never accept xlsx/docx/pptx/etc (Microsoft OOXML
format) files from untrusted sources. The format is very very dangerous and
that is Microsoft's fault.



[Bug 66840] zip attack

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66840

--- Comment #3 from PJ Fanning <fa...@yahoo.com> ---
I wrote this blog about the dangers of parsing MS file formats a while ago.

https://medium.com/system-weakness/caveats-with-accepting-microsoft-office-file-formats-in-uploads-26be3673c330



[Bug 66840] zip attack

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66840

PJ Fanning <fa...@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|critical                    |normal



[Bug 66840] zip attack

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66840

cnj_0304 <cn...@qq.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cnj_0304@qq.com

--- Comment #10 from cnj_0304 <cn...@qq.com> ---
Created attachment 38722
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=38722&action=edit
example bomb

This compressed package contains four parts:

1. Bomb.zip: the test case that triggers the OOM, with 4000 files, a total
size of 390 MB, and a compressed size of 950 KB;

2. Idea_VM_Config.png: the IDEA VM memory configuration, set to only 300 MB to
simulate a service with limited memory;

3. Test.java: the test code that validates the behaviour. With limited memory,
limiting the compression ratio alone cannot protect against a zip bomb made of
a large number of files smaller than 100 KB;

4. Example_OOM.png: the validation result, an OOM in IOUtils.byteArray;

Our service limits uploaded archives to 2 MB and the service memory to 500 MB.
When the compression ratio limit ZipSecureFile.MIN_INFLATE_RATIO=0.01 takes
effect, the service can process files normally.

But in reality, it cannot protect against the scenario of the above test case.
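
One possible application-side guard for this scenario is to scan the upload
before handing it to POI and count the bytes that are actually inflated across
all entries. The sketch below uses only java.util.zip; the class name, method
name and both limits are assumptions chosen for illustration, and this is not
how POI itself checks archives.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ExpandedSizeBudget {

    /**
     * Streams through every entry and counts the bytes actually inflated.
     * Throws as soon as the running total or the entry count exceeds its limit.
     * Returns the total number of expanded bytes when the archive is within budget.
     */
    static long scan(InputStream in, long maxTotalBytes, int maxEntries) throws IOException {
        long total = 0;
        int entries = 0;
        byte[] buf = new byte[8192];
        try (ZipInputStream zis = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (++entries > maxEntries) {
                    throw new IOException("more than " + maxEntries + " entries");
                }
                int n;
                while ((n = zis.read(buf)) != -1) {
                    total += n;
                    if (total > maxTotalBytes) {
                        throw new IOException("expanded size exceeds " + maxTotalBytes
                                + " bytes at entry " + entry.getName());
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: ExpandedSizeBudget <file.zip>");
            return;
        }
        // Illustrative limits: 200 MB expanded in total, at most 2000 entries.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println("expanded bytes: " + scan(in, 200L * 1024 * 1024, 2_000));
        }
    }
}
```

Since uploads are already limited to about 2 MB, the upload can be buffered
once, scanned this way with only a small read buffer in memory, and then passed
on for parsing if it stays within budget.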



[Bug 66840] zip attack

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66840

--- Comment #4 from PJ Fanning <fa...@yahoo.com> ---
There are APIs that can read xlsx in a streaming way and will use less memory,
but they can still end up using a lot of memory. If you need to parse large
xlsx files, you need an architecture that can handle crashes and automatically
restart. Kubernetes has good support for this (one option of many).

Streaming samples can be found at
https://poi.apache.org/components/spreadsheet/examples.html

Some libs that stream are listed at
https://poi.apache.org/related-projects.html



[Bug 66840] zip attack

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66840

cnj_0304 <cn...@qq.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|1                           |0
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INFORMATIONPROVIDED         |---

--- Comment #7 from cnj_0304 <cn...@qq.com> ---
(In reply to PJ Fanning from comment #1)
> We know about this and there isn't much we can do. Anyone who parses
> xlsx/docx/pptx/etc. files will need to be aware that the data is compressed
> and that the resulting decompressed data can be huge.
> 
> We do check for zip bombs - ie data that has a very high ratio when
> decompressing.
> 
> See the setMinInflateRatio documentation in
> https://poi.apache.org/components/configuration.html


However, I found that if the payload size does not exceed 100 KB, the
compression ratio comparison is never executed. I think this code has a bug.

In that case, the settings you mentioned above no longer give users any
effective control over the limits. For example:
org.apache.poi.openxml4j.util.ZipSecureFile.setMinInflateRatio(double ratio)
org.apache.poi.openxml4j.util.ZipSecureFile.setMaxEntrySize(long maxEntrySize)


Problem code:
checkThreshold():
        // check the file size first, in case we are working on uncompressed streams
        if(payloadSize > MAX_ENTRY_SIZE) {
            throw new IOException(String.format(Locale.ROOT, MAX_ENTRY_SIZE_MSG, payloadSize, rawSize, MAX_ENTRY_SIZE, entryName));
        }

        // don't alert for small expanded size
        if (payloadSize <= GRACE_ENTRY_SIZE) {
            return;    // returns here, so the ratio check below is skipped
        }

        double ratio = rawSize / (double)payloadSize;
        if (ratio >= MIN_INFLATE_RATIO) {
            return;
        }
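
To make the effect of the early return concrete, here is a small standalone
model of the per-entry decision next to a hypothetical archive-wide variant.
It is a sketch only: the class and method names are invented, the 100 KB grace
size and the 0.01 ratio are the values discussed in this thread, and the figure
of about 200 compressed bytes per entry is an assumption derived from 2008 KB
spread over 10500 files.

```java
public class GraceSizeModel {
    // Values as discussed in the thread; treated here as fixed constants.
    static final long GRACE_ENTRY_SIZE = 100 * 1024L;
    static final double MIN_INFLATE_RATIO = 0.01d;

    /** Models the per-entry decision: true when the entry would be accepted. */
    static boolean entryAccepted(long rawSize, long payloadSize) {
        if (payloadSize <= GRACE_ENTRY_SIZE) {
            return true; // grace path: the ratio is never evaluated
        }
        return rawSize / (double) payloadSize >= MIN_INFLATE_RATIO;
    }

    /** Hypothetical archive-wide variant: the same ratio test on running totals. */
    static boolean archiveAccepted(long totalRaw, long totalPayload) {
        if (totalPayload <= GRACE_ENTRY_SIZE) {
            return true;
        }
        return totalRaw / (double) totalPayload >= MIN_INFLATE_RATIO;
    }

    public static void main(String[] args) {
        long entries = 10_500;
        long raw = 200;               // assumed compressed bytes per entry
        long payload = 100 * 1024L;   // expanded bytes per entry, at the grace limit
        System.out.println("each entry accepted:  " + entryAccepted(raw, payload));
        System.out.println("archive accepted:     " + archiveAccepted(entries * raw, entries * payload));
        System.out.println("total expanded bytes: " + entries * payload);
    }
}
```

With these numbers every single entry passes, while the totals give a ratio of
roughly 0.002, well below 0.01, for about 1 GB of expanded data.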



[Bug 66840] zip attack

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66840

PJ Fanning <fa...@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All

--- Comment #1 from PJ Fanning <fa...@yahoo.com> ---
We know about this and there isn't much we can do. Anyone who parses
xlsx/docx/pptx/etc. files will need to be aware that the data is compressed and
that the resulting decompressed data can be huge.

We do check for zip bombs - ie data that has a very high ratio when
decompressing.

See the setMinInflateRatio documentation in
https://poi.apache.org/components/configuration.html



[Bug 66840] zip attack

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66840

--- Comment #5 from Dominik Stadler <do...@gmx.at> ---
Apache POI already tries hard to not allocate too much main memory irrespective
of the actual document.

Maybe the default limits are simply higher than the amount of memory that you
provide to the JVM. 

So you can try to lower some of the settings listed at
https://poi.apache.org/components/configuration.html in your application before
you invoke Apache POI.

If you cannot avoid the OOM this way, a sample document and a minimal
reproducing test-case would be interesting so we can try to improve the support
if there are cases left where these safeguards are not kicking in properly.
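
As an illustration of lowering the settings before invoking POI, a startup
hook might look like the sketch below. It needs Apache POI on the classpath.
The class name and all numeric values are assumptions picked for the example,
not recommended defaults; setMinInflateRatio, setMaxEntrySize and
setUseTempFilePackageParts are the methods named elsewhere in this thread, and
the last one is assumed to be available in the POI version in use.

```java
import org.apache.poi.openxml4j.opc.ZipPackage;
import org.apache.poi.openxml4j.util.ZipSecureFile;

public class PoiLimitsConfig {

    /** Call once at startup, before any workbook is opened. Values are illustrative only. */
    public static void apply() {
        // Reject entries whose compressed/expanded ratio falls below 5%
        // (the thread mentions 0.01 as the value in effect).
        ZipSecureFile.setMinInflateRatio(0.05d);

        // Cap the expanded size of any single zip entry at 50 MB.
        ZipSecureFile.setMaxEntrySize(50L * 1024 * 1024);

        // Ask POI to keep package parts in temp files rather than on the heap.
        ZipPackage.setUseTempFilePackageParts(true);
    }
}
```

These are per-entry and storage settings, so on their own they do not bound the
total expanded size of an archive made of many small entries.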



[Bug 66840] zip attack

Posted by bu...@apache.org.
https://bz.apache.org/bugzilla/show_bug.cgi?id=66840

PJ Fanning <fa...@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEEDINFO
     Ever confirmed|0                           |1

--- Comment #8 from PJ Fanning <fa...@yahoo.com> ---
You will need to provide a file that demos an issue.
