Posted to issues@commons.apache.org by "A Kelday (Jira)" <ji...@apache.org> on 2020/06/01 21:54:00 UTC

[jira] [Commented] (COMPRESS-514) SevenZFile fails with encoded header over 2GiB

    [ https://issues.apache.org/jira/browse/COMPRESS-514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17121361#comment-17121361 ] 

A Kelday commented on COMPRESS-514:
-----------------------------------

Thanks [~bodewig] and [~peterlee] for the input. Happy to work on it more at some point if you choose an option (I'll keep thinking about it anyway and do some more checks).

Peter explained my concern exactly: given corrupt data, in most cases we can expect some exception other than the one triggered by the CRC check to be thrown _before_ the end of the stream is ever reached (because we aren't just transferring data, we're branching based on it). And that's really the best case; the worse outcome is that some garbage filename list gets created. I'm also very conscious of the risk of making the code for the common use case worse.
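
To illustrate the concern, here is a minimal sketch. It is not the Commons Compress code: the one-length-byte header layout and the names `DeferredCrcSketch`, `parseName` and `readVerified` are made up for the example. A parser branches on the data while a `java.util.zip.CheckedInputStream` accumulates the CRC, so the checksum can only be compared once the end of the stream has been reached.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

final class DeferredCrcSketch {

    /** Hypothetical format: one length byte followed by that many name bytes. */
    static String parseName(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readUnsignedByte();
        byte[] name = new byte[length];
        // Branching on the data: a garbage length makes this throw
        // EOFException long before any checksum comparison happens.
        data.readFully(name);
        return new String(name, StandardCharsets.US_ASCII);
    }

    /** Parses while streaming; the CRC is only comparable after the last byte. */
    static String readVerified(byte[] header, long expectedCrc) throws IOException {
        CheckedInputStream in =
                new CheckedInputStream(new ByteArrayInputStream(header), new CRC32());
        String name = parseName(in);
        // Drain whatever is left so the checksum covers the whole header.
        while (in.read() != -1) {
            // nothing to do, the stream updates the checksum as it reads
        }
        if (in.getChecksum().getValue() != expectedCrc) {
            throw new IOException("CRC mismatch");
        }
        return name;
    }

    public static void main(String[] args) throws IOException {
        byte[] good = {3, 'a', 'b', 'c'};
        CRC32 crc = new CRC32();
        crc.update(good, 0, good.length);
        System.out.println(readVerified(good, crc.getValue()));
    }
}
```

In this sketch a corrupted length byte surfaces as an `EOFException` from the parser and the CRC is never compared, while a corrupted name byte lets the parser build a garbage name that only the final comparison rejects.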

> SevenZFile fails with encoded header over 2GiB
> ----------------------------------------------
>
>                 Key: COMPRESS-514
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-514
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Archivers
>    Affects Versions: 1.20
>            Reporter: A Kelday
>            Priority: Minor
>         Attachments: HeaderChannelBuffer.java
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When reading what some may call a large encrypted 7zip file (1.2TB with 22 million files), the read fails at the header stage with the trace below. Is this within the spec? I've written some code to handle it, because I did actually need to extract the file in Java. If that's of any use I can provide it (it's a naive wrapper that just pages in a buffer at a time).
>  
> {code:java}
> Exception in thread "main" java.io.IOException: Cannot handle unpackSize2416988886
> at org.apache.commons.compress.archivers.sevenz.SevenZFile.assertFitsIntoInt(SevenZFile.java:1523)
> at org.apache.commons.compress.archivers.sevenz.SevenZFile.readEncodedHeader(SevenZFile.java:622)
> at org.apache.commons.compress.archivers.sevenz.SevenZFile.initializeArchive(SevenZFile.java:532)
> at org.apache.commons.compress.archivers.sevenz.SevenZFile.readHeaders(SevenZFile.java:468)
> at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:337)
> at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:129)
> at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:116)
> {code}
> 7zip itself can also open it (and display, extract, etc.). Here are the stats:
>  
>  
> {code:java}
> Size: 2 489 903 580 875
> Packed Size: 1 349 110 308 832
> Folders: 40 005
> Files: 22 073 957
> CRC: E26F6A96
> {code}
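
For context on the trace in the quoted description: the reported unpack size, 2416988886, is larger than `Integer.MAX_VALUE` (2147483647), so a header of that size cannot be held in a single `byte[]` or `ByteBuffer`. The following is a rough sketch of what "paging in a buffer at a time" could look like. It is not the attached HeaderChannelBuffer.java; the class name `PagedHeaderReader`, its methods and the page size are all hypothetical, and it assumes a blocking channel and little-endian fixed-width integers.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

/** Hypothetical sketch: reads a header of up to Long.MAX_VALUE bytes one page at a time. */
final class PagedHeaderReader {
    private final ReadableByteChannel channel; // assumed to be a blocking channel
    private final ByteBuffer page;
    private long remaining; // header bytes not yet paged in (a long, not an int)

    PagedHeaderReader(ReadableByteChannel channel, long headerSize, int pageSize) {
        if (pageSize < Long.BYTES) {
            throw new IllegalArgumentException("page must hold at least one long");
        }
        this.channel = channel;
        this.remaining = headerSize;
        this.page = ByteBuffer.allocate(pageSize).order(ByteOrder.LITTLE_ENDIAN);
        this.page.limit(0); // start empty so the first read triggers a fill
    }

    /** Makes sure at least {@code needed} unread bytes are in the current page. */
    private void fill(int needed) throws IOException {
        if (page.remaining() >= needed) {
            return;
        }
        page.compact(); // keep unread bytes, switch the buffer to write mode
        int want = (int) Math.min(page.remaining(), remaining);
        page.limit(page.position() + want); // never read past the declared header size
        while (page.hasRemaining()) {
            int n = channel.read(page);
            if (n < 0) {
                break; // underlying channel ended early
            }
            remaining -= n;
        }
        page.flip(); // back to read mode
        if (page.remaining() < needed) {
            throw new EOFException("header ended unexpectedly");
        }
    }

    int readUnsignedByte() throws IOException {
        fill(1);
        return page.get() & 0xFF;
    }

    long readLong() throws IOException {
        fill(Long.BYTES);
        return page.getLong();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = ByteBuffer.allocate(9).order(ByteOrder.LITTLE_ENDIAN)
                .put((byte) 7).putLong(2416988886L).array();
        PagedHeaderReader reader = new PagedHeaderReader(
                Channels.newChannel(new ByteArrayInputStream(data)), data.length, 8);
        System.out.println(reader.readUnsignedByte());
        System.out.println(reader.readLong());
    }
}
```

The only state that has to exceed the `int` range here is the `remaining` counter; each page stays small, at the cost of extra reads when a value straddles a page boundary.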


