Posted to issues@commons.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/08/04 12:02:00 UTC

[jira] [Work logged] (COMPRESS-542) Corrupt 7z allocates huge amount of SevenZEntries

     [ https://issues.apache.org/jira/browse/COMPRESS-542?focusedWorklogId=466190&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-466190 ]

ASF GitHub Bot logged work on COMPRESS-542:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Aug/20 12:01
            Start Date: 04/Aug/20 12:01
    Worklog Time Spent: 10m 
      Work Description: PeterAlfredLee commented on pull request #120:
URL: https://github.com/apache/commons-compress/pull/120#issuecomment-668553810


   Hi @theobisproject 
   I'm curious about the number of `SevenZArchiveEntry` instances. How many entries does your 7z archive contain that it takes so much memory?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 466190)
    Remaining Estimate: 0h
            Time Spent: 10m

> Corrupt 7z allocates huge amount of SevenZEntries
> -------------------------------------------------
>
>                 Key: COMPRESS-542
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-542
>             Project: Commons Compress
>          Issue Type: Bug
>    Affects Versions: 1.20
>            Reporter: Robin Schimpf
>            Priority: Major
>         Attachments: Reduced_memory_allocation_for_corrupted_7z_archives.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We ran into a problem where a corrupt 1.43 GB 7z file tried to allocate about 138 million SevenZArchiveEntry instances, which would use about 12 GB of memory. Sadly, I'm unable to share the file. If enough memory is available, the following exception is thrown.
> {code:java}
> java.io.IOException: Start header corrupt and unable to guess end Header
> 	at org.apache.commons.compress.archivers.sevenz.SevenZFile.tryToLocateEndHeader(SevenZFile.java:511)
> 	at org.apache.commons.compress.archivers.sevenz.SevenZFile.readHeaders(SevenZFile.java:470)
> 	at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:336)
> 	at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:128)
> 	at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:369)
> {code}
> 7z itself aborts really quickly when I try to list the contents of the file.
> {code:java}
> 7z l "corrupt.7z"
> 7-Zip 18.01 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2018-01-28
> Scanning the drive for archives:
> 1 file, 1537752212 bytes (1467 MiB)
> Listing archive: corrupt.7z
> ERROR: corrupt.7z : corrupt.7z
> Open ERROR: Can not open the file as [7z] archive
> ERRORS:
> Is not archive
> Errors: 1
> {code}
> I hacked together the attached patch, which reduces the memory allocation to about 1 GB, so lazy instantiation of the entries could be a good solution to the problem. Optimally, the entries would only be created once the headers have been parsed correctly.
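
The idea described above can be sketched as a plausibility check on the entry count read from a (possibly corrupt) header, performed before any SevenZArchiveEntry objects are allocated. The method name and the bound below are hypothetical illustrations, not the actual Commons Compress code: since each entry needs at least one byte of header metadata, an entry count larger than the remaining header bytes cannot be valid.

{code:java}
import java.io.IOException;

public class EntryCountSanityCheck {

    // Hypothetical sketch: reject an entry count that is negative or larger
    // than the bytes still available for header data, instead of trusting
    // the header value blindly and allocating millions of entry objects.
    static void checkEntryCount(long numEntries, long remainingBytes) throws IOException {
        if (numEntries < 0 || numEntries > remainingBytes) {
            throw new IOException("Corrupt header: implausible entry count " + numEntries);
        }
    }

    public static void main(String[] args) {
        try {
            checkEntryCount(42, 1024);          // plausible: 42 entries in 1 KiB of header
            System.out.println("plausible count accepted");
        } catch (IOException e) {
            System.out.println("unexpected rejection");
        }
        try {
            checkEntryCount(138_000_000L, 4096); // implausible: rejected before allocation
            System.out.println("implausible count accepted");
        } catch (IOException e) {
            System.out.println("implausible count rejected");
        }
    }
}
{code}

A check like this fails fast on a corrupt count, complementing the lazy-instantiation approach in the patch, which defers creating entries until the headers parse successfully.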



--
This message was sent by Atlassian Jira
(v8.3.4#803005)