Posted to issues@commons.apache.org by "Stefan Bodewig (JIRA)" <ji...@apache.org> on 2010/03/15 14:54:27 UTC

[jira] Commented: (COMPRESS-103) allow data descriptors to follow STORED entries in ZIP archives being read

    [ https://issues.apache.org/jira/browse/COMPRESS-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845310#action_12845310 ] 

Stefan Bodewig commented on COMPRESS-103:
-----------------------------------------

This is what the InfoZIP appnote.iz has to say:

{quote}
          Bit 3: If this bit is set, the fields crc-32, compressed
                 size and uncompressed size are set to zero in the
                 local header.  The correct values are put in the
                 data descriptor immediately following the compressed
                 data.  (Note: PKZIP version 2.04g for DOS only
                 recognizes this bit for method 8 compression, newer
                 versions of PKZIP recognize this bit for any
                 compression method.)
                [Info-ZIP note: This bit was introduced by PKZIP 2.04 for
                 DOS. In general, this feature can only be reliably used
                 together with compression methods that allow intrinsic
                 detection of the "end-of-compressed-data" condition. From
                 the set of compression methods described in this Zip archive
                 specification, only "deflate" and "bzip2" fulfill this
                 requirement.
                 Especially, the method STORED does not work!
                 The Info-ZIP tools recognize this bit regardless of the
                 compression method; but, they rely on correctly set
                 "compressed size" information in the central directory entry.]
{quote}

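So ZipFile uses the same approach as the InfoZIP tools: it relies on the compressed size recorded in the central directory. The central directory sits at the end of the archive, so if we were to take the same approach in ZipArchiveInputStream, we'd have to consume the whole stream and store its content for future use once we hit a STORED entry that uses the data descriptor.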
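
For illustration, the case in question (a STORED entry whose local file header has bit 3 set) can be recognized from the first ten bytes of the header. This is only a minimal sketch based on the header layout in the appnote; the class and method names are made up for this example and are not part of Commons Compress.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not an existing Commons Compress class.
final class LocalHeaderFlags {
    static final int LFH_SIGNATURE = 0x04034b50;    // "PK\003\004", read little-endian
    static final int DATA_DESCRIPTOR_FLAG = 1 << 3; // general purpose bit 3
    static final int METHOD_STORED = 0;

    /**
     * Returns true if the given bytes start a local file header for a
     * STORED entry that defers CRC and sizes to a data descriptor.
     */
    static boolean isStoredWithDataDescriptor(byte[] header) {
        if (header.length < 10) {
            return false; // too short to hold signature, version, flags, method
        }
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt(0) != LFH_SIGNATURE) {
            return false;
        }
        int flags = buf.getShort(6) & 0xFFFF;  // general purpose bit flag
        int method = buf.getShort(8) & 0xFFFF; // compression method
        return (flags & DATA_DESCRIPTOR_FLAG) != 0 && method == METHOD_STORED;
    }
}
```

For such an entry the sizes in the local header are zero, which is why a purely streaming reader cannot tell where the entry's data ends.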

> allow data descriptors to follow STORED entries in ZIP archives being read
> --------------------------------------------------------------------------
>
>                 Key: COMPRESS-103
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-103
>             Project: Commons Compress
>          Issue Type: New Feature
>    Affects Versions: 1.0
>            Reporter: Stefan Bodewig
>            Priority: Minor
>
> The document named "Word XPS.xps" found under http://www.wssdemo.com/XPS/Forms/AllItems.aspx contains at least one STORED entry that uses a data descriptor after the entry's data to hold size and CRC information.
> The ZipFile class uses information from the central directory and thus knows the size of the entry and can deal with the archive.  ZipArchiveInputStream currently can't.
> One solution would be to read the entry until we hit the signature of a data descriptor, local file header or the start of the central directory.  If we hit another LFH or the CD then the data descriptor didn't use the signature (see COMPRESS-101) and the last 12 bytes read were the data descriptor.  This will certainly not be very efficient.
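
The scan described above could look roughly like the following. It is a sketch over an in-memory buffer rather than a real stream, and the class and method names are invented for this example. Note that it assumes the signature bytes never occur inside the STORED data itself, which nothing guarantees, so a real implementation would need to verify a candidate match (for example against the CRC and sizes in the descriptor).

```java
// Hypothetical helper, not an existing Commons Compress class.
final class StoredEntryScanner {
    // Signatures as they appear on disk (little-endian).
    static final byte[] DD_SIG  = {0x50, 0x4b, 0x07, 0x08}; // data descriptor
    static final byte[] LFH_SIG = {0x50, 0x4b, 0x03, 0x04}; // local file header
    static final byte[] CFH_SIG = {0x50, 0x4b, 0x01, 0x02}; // central directory

    /**
     * Returns the offset just past the entry data that begins at start,
     * or -1 if no plausible end is found.
     */
    static int findEndOfStoredData(byte[] buf, int start) {
        for (int i = start; i + 4 <= buf.length; i++) {
            if (matches(buf, i, DD_SIG)) {
                return i; // descriptor with signature begins here
            }
            if (matches(buf, i, LFH_SIG) || matches(buf, i, CFH_SIG)) {
                // Descriptor without signature: the 12 bytes before this
                // point are CRC, compressed size and uncompressed size.
                int end = i - 12;
                return end >= start ? end : -1;
            }
        }
        return -1;
    }

    private static boolean matches(byte[] buf, int pos, byte[] sig) {
        for (int j = 0; j < sig.length; j++) {
            if (buf[pos + j] != sig[j]) {
                return false;
            }
        }
        return true;
    }
}
```

The byte-by-byte comparison at every offset is the inefficiency mentioned in the issue description.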
