Posted to issues@commons.apache.org by GitBox <gi...@apache.org> on 2020/09/11 02:29:07 UTC

[GitHub] [commons-compress] PeterAlfredLee commented on pull request #137: Compress-555 allow reading of stored entries in zips by default

PeterAlfredLee commented on pull request #137:
URL: https://github.com/apache/commons-compress/pull/137#issuecomment-690835644


   Some explanation of `allowStoredEntriesWithDataDescriptor`:
   
   An entry that uses the STORED method is not compressed at all; its data is simply copied into the archive. Most of the time, the size and compressed size (they are equal for STORED entries) are stored in the Local File Header, which is located before the raw file data. So we can read exactly the right number of bytes, because we already know the amount of data before we start extracting the file.
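
To make the Local File Header case concrete, here is a minimal sketch in plain Java. It is not Commons Compress code: the class and method names are made up for illustration, the field offsets are taken from the ZIP format's Local File Header layout, and ZIP64 sizes are ignored for brevity.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative helper, not part of Commons Compress.
final class LocalHeaderPeek {
    static final int LFH_SIGNATURE = 0x04034b50;    // "PK\3\4" read as a little-endian int
    static final int FLAG_DATA_DESCRIPTOR = 1 << 3; // general purpose bit 3
    static final int METHOD_STORED = 0;

    /**
     * Returns the number of raw data bytes of a STORED entry if the header
     * states it, or -1 if the sizes are deferred to a data descriptor.
     * Expects at least the 30 fixed bytes of a Local File Header.
     */
    static long knownStoredSize(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt(0) != LFH_SIGNATURE) {
            throw new IllegalArgumentException("not a local file header");
        }
        int flags = buf.getShort(6) & 0xFFFF;
        int method = buf.getShort(8) & 0xFFFF;
        if (method != METHOD_STORED) {
            throw new IllegalArgumentException("entry is not STORED");
        }
        if ((flags & FLAG_DATA_DESCRIPTOR) != 0) {
            return -1; // sizes follow the data, so they are unknown at this point
        }
        // For STORED, compressed size and size are equal; read the compressed size.
        return buf.getInt(18) & 0xFFFFFFFFL;
    }
}
```

When the result is non-negative, a reader can simply copy that many bytes after the header, file name and extra field, with no scanning involved.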
   
   This changes if the entry uses STORED and a data descriptor at the same time. The size and compressed size are then stored in the data descriptor, which is located after the raw file data. With a data descriptor, we do not know the size of the data before we start extracting the file, so we need to check byte by byte whether a signature (that of a Local File Header, Central Directory Record or Data Descriptor) has been reached.
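
The byte-by-byte check could be sketched roughly as follows. This is a simplified illustration and not the actual Commons Compress implementation; the class and method names are hypothetical. It slides a four-byte window over the stream and stops as soon as the window equals one of the three signatures.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Illustrative only; the real implementation in Commons Compress differs.
final class SignatureScan {
    // Signatures as little-endian ints.
    static final int LFH = 0x04034b50; // Local File Header
    static final int CFH = 0x02014b50; // Central Directory record
    static final int DD  = 0x08074b50; // Data Descriptor

    /** Reads from in until a signature appears and returns the bytes before it. */
    static byte[] readUntilSignature(InputStream in) throws IOException {
        ByteArrayOutputStream data = new ByteArrayOutputStream();
        int window = 0; // the last four bytes read, interpreted little-endian
        int seen = 0;
        int b;
        while ((b = in.read()) != -1) {
            window = (window >>> 8) | (b << 24); // shift the newest byte in
            data.write(b);
            seen++;
            if (seen >= 4 && (window == LFH || window == CFH || window == DD)) {
                byte[] all = data.toByteArray();
                // Drop the four signature bytes; the rest is taken as entry data.
                return Arrays.copyOf(all, all.length - 4);
            }
        }
        throw new EOFException("no signature found");
    }
}
```

Every byte has to pass through the comparison, which is where the extra cost comes from compared with copying a known number of bytes.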
   
   Obviously, extraction is slower if a data descriptor is used with STORED. What's worse, some cases lead to an extraction error: some bytes in the raw file data may be equal to a signature, and Compress will stop reading at the wrong location. This mostly happens with a zip archive that contains another zip archive. And we do not have any workaround for this.
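
The zip-in-zip problem can be reproduced with nothing but the JDK. The sketch below (hypothetical class and method names, not Commons Compress code) builds a small inner archive with `java.util.zip` and then looks for a Local File Header signature in its bytes, which is what the raw data of the outer STORED entry would contain.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Illustrative only: shows why signature scanning misfires on nested archives.
final class NestedZipFalsePositive {
    /** Builds a small zip archive in memory; it stands in for the inner archive. */
    static byte[] buildInnerZip() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("inner.txt"));
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Index of the first Local File Header signature ("PK\3\4") in data, or -1. */
    static int firstLocalHeaderSignature(byte[] data) {
        for (int i = 0; i + 3 < data.length; i++) {
            if (data[i] == 0x50 && data[i + 1] == 0x4b
                    && data[i + 2] == 0x03 && data[i + 3] == 0x04) {
                return i;
            }
        }
        return -1;
    }
}
```

Here `firstLocalHeaderSignature(buildInnerZip())` is 0, because the inner archive begins with its own Local File Header. A scanner that trusts the first signature it sees would therefore conclude that the outer entry holds no data at all.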
   
   In short, setting `allowStoredEntriesWithDataDescriptor` to true makes extraction of STORED entries slower, and sometimes makes it fail (e.g. a zip archive inside a zip).
   
   That's why we have `allowStoredEntriesWithDataDescriptor`, and why its default value is false. This may be a little complicated, but I hope it helps.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org