Posted to issues@commons.apache.org by "Daniel Lowe (JIRA)" <ji...@apache.org> on 2012/06/26 23:30:44 UTC

[jira] [Created] (COMPRESS-189) ZipArchiveInputStream may read 0 bytes when reading from a nested Zip file

Daniel Lowe created COMPRESS-189:
------------------------------------

             Summary: ZipArchiveInputStream may read 0 bytes when reading from a nested Zip file
                 Key: COMPRESS-189
                 URL: https://issues.apache.org/jira/browse/COMPRESS-189
             Project: Commons Compress
          Issue Type: Bug
          Components: Archivers
    Affects Versions: 1.4.1
         Environment: JDK 1.6 64-bit, Windows 7
            Reporter: Daniel Lowe


When the following code is run, the error "Underlying input stream returned zero bytes" is produced. If the commented line is uncommented, it can be seen that the ZipArchiveInputStream returned 0 bytes. This only happens the first time read is called; subsequent calls work as expected, i.e. the following code actually works correctly with that line uncommented!

The zip file used to produce this behaviour is available at http://wwmm.ch.cam.ac.uk/~dl387/test.ZIP

If this is not the correct way of processing a zip file of zip files, please let me know. Also, I believe ZipFile can iterate over entries quickly because it can look at the master table (the central directory), whereas ZipArchiveInputStream cannot. Is there any way of instantiating a ZipFile from a zip file inside another zip file without first extracting the nested zip file?

    ZipFile zipFile = new ZipFile("C:/test.ZIP");
    for (Enumeration<ZipArchiveEntry> iterator = zipFile.getEntries(); iterator.hasMoreElements();) {
      ZipArchiveEntry entry = iterator.nextElement();
      InputStream is = new BufferedInputStream(zipFile.getInputStream(entry));
      ZipArchiveInputStream zipInput = new ZipArchiveInputStream(is);
      ZipArchiveEntry innerEntry;
      while ((innerEntry = zipInput.getNextZipEntry()) != null){
        if (innerEntry.getName().endsWith("XML")){
          //zipInput.read();
          System.out.println(IOUtils.toString(zipInput));
        }
      }
    }
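
For comparison, the same nested traversal can be sketched with the JDK's own java.util.zip streams. This is only an illustration under assumptions: it does not use Commons Compress, it does not reproduce the zero-byte read, and the class name NestedZipSketch and its helper methods are made up for this example. The archives are built in memory so the sketch is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of Commons Compress or the JDK.
final class NestedZipSketch {

    /** Builds a zip archive in memory that contains a single entry. */
    static byte[] zipOf(String name, byte[] content) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry(name));
            zos.write(content);
            zos.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Reads a stream to its end; a 0-byte read is simply retried. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) >= 0) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    /** Treats every outer entry as a zip and returns the text of inner entries ending in "XML". */
    static List<String> readInnerXml(byte[] outerZip) throws IOException {
        List<String> result = new ArrayList<>();
        try (ZipInputStream outer = new ZipInputStream(new ByteArrayInputStream(outerZip))) {
            while (outer.getNextEntry() != null) {
                // Deliberately not closed: closing the inner stream would close the outer one too.
                ZipInputStream inner = new ZipInputStream(outer);
                ZipEntry innerEntry;
                while ((innerEntry = inner.getNextEntry()) != null) {
                    if (innerEntry.getName().endsWith("XML")) {
                        result.add(new String(readAll(inner), StandardCharsets.UTF_8));
                    }
                }
            }
        }
        return result;
    }
}
```

A stream-based reader never sees the central directory, so it has to walk the entries sequentially; that is the trade-off against ZipFile raised in the question above.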

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COMPRESS-189) ZipArchiveInputStream may read 0 bytes when reading from a nested Zip file

Posted by "Daniel Lowe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COMPRESS-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467058#comment-13467058 ] 

Daniel Lowe commented on COMPRESS-189:
--------------------------------------

This issue can be worked around by continuing to read for as long as 0 or more bytes are returned, i.e. until read returns -1. Unfortunately, Java's built-in InputStreamReader throws an exception if 0 bytes are read, which is the underlying reason why Commons IO throws an exception in the example above.

Marking and resetting seems to be a reasonable way to work around the problem:

    BufferedInputStream archiveBis = new BufferedInputStream(archiveInput);
    archiveBis.mark(1);
    archiveBis.read(); // if the bug is encountered, nothing will be read
    archiveBis.reset();

Obviously a proper fix would be better.
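
The first workaround mentioned above (keep reading while 0 or more bytes are returned) could look roughly like the following. It is a minimal sketch using only the JDK: the class ZeroTolerantReads and the zeroFirst stream are hypothetical, and zeroFirst merely imitates a stream whose first bulk read returns 0; it is not the Commons Compress code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class for illustration only.
final class ZeroTolerantReads {

    /** Reads until read() reports -1; a return value of 0 is treated as "try again". */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) >= 0) {
            out.write(buffer, 0, n); // writes nothing when n == 0
        }
        return out.toByteArray();
    }

    /** Stand-in for the misbehaving stream: the first bulk read returns 0, later reads behave normally. */
    static InputStream zeroFirst(final byte[] data) {
        return new InputStream() {
            private int pos = 0;
            private boolean first = true;

            @Override
            public int read() throws IOException {
                byte[] one = new byte[1];
                int n;
                // Retry until a byte or end of stream is seen.
                while ((n = read(one, 0, 1)) == 0) {
                    // nothing to do, just retry
                }
                return n < 0 ? -1 : one[0] & 0xFF;
            }

            @Override
            public int read(byte[] b, int off, int len) {
                if (first) {
                    first = false;
                    return 0; // imitates the zero-byte first read
                }
                if (pos >= data.length) {
                    return -1;
                }
                int n = Math.min(len, data.length - pos);
                System.arraycopy(data, pos, b, off, n);
                pos += n;
                return n;
            }
        };
    }
}
```

Note that such a loop would spin forever on a stream that keeps returning 0, so it is a stopgap rather than a fix.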
                

[jira] [Commented] (COMPRESS-189) ZipArchiveInputStream may read 0 bytes when reading from a nested Zip file

Posted by "Daniel Lowe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COMPRESS-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473473#comment-13473473 ] 

Daniel Lowe commented on COMPRESS-189:
--------------------------------------

The problem is caused by a bug in the implementation of ZipArchiveInputStream's readDeflated:

    int read = 0;
    try {
        read = inf.inflate(buffer, start, length);
    } catch (DataFormatException e) {
        throw new ZipException(e.getMessage());
    }
    if (read == 0) {
        if (inf.finished()) {
            return -1;
        } else if (buf.lengthOfLastRead == -1) {
            throw new IOException("Truncated ZIP file");
        }
    }

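Here inf.inflate can return 0 even though the stream is neither finished nor truncated, so a read of 0 bytes is reported to the caller. The Inflater Javadoc states:
"A return value of 0 indicates that needsInput() or needsDictionary() should be called in order to determine if more input data or a preset dictionary is required."

The presumably correct implementation in InflaterInputStream: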

    int n;
    while ((n = inf.inflate(b, off, len)) == 0) {
        if (inf.finished() || inf.needsDictionary()) {
            reachEOF = true;
            return -1;
        }
        if (inf.needsInput()) {
            fill();
        }
    }

calls the method in a while loop to avoid this problem.
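
To make the point concrete, here is a small self-contained sketch of the same looping pattern applied directly to java.util.zip.Inflater. The class InflateLoopSketch, its method names, and the idea of feeding the compressed data in small chunks are assumptions made for illustration; this is not the actual Commons Compress fix.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demonstration class, not taken from Commons Compress.
final class InflateLoopSketch {

    /** Compresses the given bytes with the default Deflater settings. */
    static byte[] deflate(byte[] raw) {
        Deflater def = new Deflater();
        def.setInput(raw);
        def.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!def.finished()) {
            out.write(buf, 0, def.deflate(buf));
        }
        def.end();
        return out.toByteArray();
    }

    /**
     * Inflates the data while supplying input at most chunk bytes at a time.
     * zeroReturns[0] counts how often inflate() returned 0.
     */
    static byte[] inflateInChunks(byte[] compressed, int chunk, int[] zeroReturns)
            throws DataFormatException {
        Inflater inf = new Inflater();
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            int fed = 0;
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n > 0) {
                    out.write(buf, 0, n);
                    continue;
                }
                // n == 0 is not an error: ask the Inflater what it is waiting for.
                zeroReturns[0]++;
                if (inf.finished() || inf.needsDictionary()) {
                    break;
                }
                if (inf.needsInput()) {
                    if (fed >= compressed.length) {
                        throw new DataFormatException("Truncated input");
                    }
                    int len = Math.min(chunk, compressed.length - fed);
                    inf.setInput(compressed, fed, len);
                    fed += len;
                }
            }
            return out.toByteArray();
        } finally {
            inf.end();
        }
    }
}
```

The very first inflate() call in this sketch returns 0 because no input has been supplied yet. A caller that passed that 0 straight through would show the behaviour described in this issue; looping until data, end of stream, or an error is seen avoids it.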
                

[jira] [Updated] (COMPRESS-189) ZipArchiveInputStream may read 0 bytes when reading from a nested Zip file

Posted by "Daniel Lowe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COMPRESS-189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Lowe updated COMPRESS-189:
---------------------------------

    Priority: Blocker  (was: Major)
    

[jira] [Commented] (COMPRESS-189) ZipArchiveInputStream may read 0 bytes when reading from a nested Zip file

Posted by "Dmitry Katsubo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COMPRESS-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498834#comment-13498834 ] 

Dmitry Katsubo commented on COMPRESS-189:
-----------------------------------------

I confirm the problem. I agree that {{java.util.zip.Inflater}} is not being used as its Javadoc prescribes, and that this causes the problem in some cases.
                

[jira] [Updated] (COMPRESS-189) ZipArchiveInputStream may read 0 bytes when reading from a nested Zip file

Posted by "Daniel Lowe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COMPRESS-189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Lowe updated COMPRESS-189:
---------------------------------

    Description: 
When the following code is run, the error "Underlying input stream returned zero bytes" is produced. If the commented line is uncommented, it can be seen that the ZipArchiveInputStream returned 0 bytes. This only happens the first time read is called; subsequent calls work as expected, i.e. the following code actually works correctly with that line uncommented!

The zip file used to produce this behaviour is available at http://wwmm.ch.cam.ac.uk/~dl387/test.ZIP

If this is not the correct way of processing a zip file of zip files, please let me know. Also, I believe ZipFile can iterate over entries quickly because it can look at the master table (the central directory), whereas ZipArchiveInputStream cannot. Is there any way of instantiating a ZipFile from a zip file inside another zip file without first extracting the nested zip file?

    ZipFile zipFile = new ZipFile("C:/test.ZIP");
    for (Enumeration<ZipArchiveEntry> iterator = zipFile.getEntries(); iterator.hasMoreElements(); ) {
      ZipArchiveEntry entry = iterator.nextElement();
      InputStream is = new BufferedInputStream(zipFile.getInputStream(entry));
      ZipArchiveInputStream zipInput = new ZipArchiveInputStream(is);
      ZipArchiveEntry innerEntry;
      while ((innerEntry = zipInput.getNextZipEntry()) != null){
        if (innerEntry.getName().endsWith("XML")){
          //zipInput.read();
          System.out.println(IOUtils.toString(zipInput));
        }
      }
    }

  was:
When the following code is run an error "Underlying input stream returned zero bytes" is produced. If the commented line is uncommented it can be seen that the ZipArchiveInputStream returned 0 bytes. This only happens to the first time read is called, subsequent calls work as expected i.e. the following code actually works correctly with that line uncommented!

The zip file used to produce this behavious is available at http://wwmm.ch.cam.ac.uk/~dl387/test.ZIP

If this is not the correct way of processing a zip file of zip files please let me know. Also I believe whilst ZipFile can iterate over entries fast due to being able to look at the master table whilst ZipArchiveInputStream cannot. Is there anyway of instantiating a ZipFile from a zip file inside another zip file without first extracting the nested zip file?

    ZipFile zipFile = new ZipFile("C:/test.ZIP");
    for (Enumeration<ZipArchiveEntry> iterator = zipFile.getEntries(); iterator.hasMoreElements();) {
      ZipArchiveEntry entry = iterator.nextElement();
      InputStream is = new BufferedInputStream(zipFile.getInputStream(entry));
      ZipArchiveInputStream zipInput = new ZipArchiveInputStream(is);
      ZipArchiveEntry innerEntry;
      while ((innerEntry = zipInput.getNextZipEntry()) != null){
        if (innerEntry.getName().endsWith("XML")){
          //zipInput.read();
          System.out.println(IOUtils.toString(zipInput));
        }
      }
    }

    