You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by "Matthijs Laan (Jira)" <ji...@apache.org> on 2021/08/04 09:50:00 UTC
[jira] [Comment Edited] (COMPRESS-585) ZipFile fails to read a
zipfile with a comment or extra data longer than 8024 bytes
[ https://issues.apache.org/jira/browse/COMPRESS-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17392942#comment-17392942 ]
Matthijs Laan edited comment on COMPRESS-585 at 8/4/21, 9:49 AM:
-----------------------------------------------------------------
Yes, see the following test case:
{code:java}
@Test
public void testCommentLongerThanCopyBufSize() throws Exception {
final Field COPY_BUF_SIZE = IOUtils.class.getDeclaredField("COPY_BUF_SIZE");
COPY_BUF_SIZE.setAccessible(true);
final int copyBufSize = (int)COPY_BUF_SIZE.get(null);
String repeatingString = "...a comment that is longer than the copy buf size";
// Repeat the comment string until we are sure to exceed the COPY_BUF_SIZE
int numberOfRepeats = (copyBufSize / repeatingString.length()) + 1;
String comment = String.join("", Collections.nCopies(numberOfRepeats, repeatingString));
final File f = File.createTempFile("commons-compress-zipfiletest-longcomment", ".zip");
f.deleteOnExit();
try (final OutputStream o = Files.newOutputStream(f.toPath());
final ZipArchiveOutputStream zo = new ZipArchiveOutputStream(o)) {
ZipArchiveEntry ze = new ZipArchiveEntry("foo");
ze.setComment(comment);
zo.putArchiveEntry(ze);
zo.write("bar".getBytes(UTF_8));
zo.closeArchiveEntry();
}
try(final ZipFile zip = new ZipFile(FileChannel.open(f.toPath(), StandardOpenOption.READ))) {
ZipArchiveEntry entry = zip.getEntry("foo");
assertNotNull(entry);
assertEquals(comment, entry.getComment());
}
}{code}
My understanding of {{IOUtils.readRange()}} was incorrect, it reads more instead of less.
This testcase creates a zip file with a 8050 length comment. In my case {{IOUtils.readRange()}} returns 8072 bytes instead, although this may depend on the OS.
The cause is the same as COMPRESS-584, see [https://github.com/apache/commons-compress/pull/214]. With this {{IOUtils.readRange()}} fix it works:
{code:java}
public static byte[] readRange(final ReadableByteChannel input, final int len) throws IOException {
final ByteArrayOutputStream output = new ByteArrayOutputStream();
final ByteBuffer b = ByteBuffer.allocate(Math.min(len, COPY_BUF_SIZE));
int read = 0;
while (read < len) {
// Make sure we never read more than len bytes
b.limit(Math.min(len - read, b.remaining()));
final int readNow = input.read(b);
if (readNow <= 0) {
break;
}
output.write(b.array(), 0, readNow);
b.rewind();
read += readNow;
}{code}
was (Author: matthijsln):
Yes, see the following test case:
{code:java}
@Test
public void testCommentLongerThanCopyBufSize() throws Exception {
final Field COPY_BUF_SIZE = IOUtils.class.getDeclaredField("COPY_BUF_SIZE");
COPY_BUF_SIZE.setAccessible(true);
final int copyBufSize = (int)COPY_BUF_SIZE.get(null);
String repeatingString = "...a comment that is longer than the copy buf size";
// Repeat the comment string until we are sure to exceed the COPY_BUF_SIZE
int numberOfRepeats = (copyBufSize / repeatingString.length()) + 1;
String comment = String.join("", Collections.nCopies(numberOfRepeats, repeatingString));
final File f = File.createTempFile("commons-compress-zipfiletest-longcomment", ".zip");
f.deleteOnExit();
try (final OutputStream o = Files.newOutputStream(f.toPath());
final ZipArchiveOutputStream zo = new ZipArchiveOutputStream(o)) {
ZipArchiveEntry ze = new ZipArchiveEntry("foo");
ze.setComment(comment);
zo.putArchiveEntry(ze);
zo.write("bar".getBytes(UTF_8));
zo.closeArchiveEntry();
}
try(final ZipFile zip = new ZipFile(FileChannel.open(f.toPath(), StandardOpenOption.READ))) {
ZipArchiveEntry entry = zip.getEntry("foo");
assertNotNull(entry);
assertEquals(comment, entry.getComment());
}
}{code}
My understanding of {{IOUtils.readRange()}} was incorrect, it reads more instead of less.
This testcase creates a zip file with a 8050 length comment. In my case `IOUtils.readRange()` returns 8072 bytes instead, although this may depend on the `FileChannel` implementation.
The cause is the same as COMPRESS-584, see [https://github.com/apache/commons-compress/pull/214]. With this {{IOUtils.readRange()}} fix it works:
{code:java}
public static byte[] readRange(final ReadableByteChannel input, final int len) throws IOException {
final ByteArrayOutputStream output = new ByteArrayOutputStream();
final ByteBuffer b = ByteBuffer.allocate(Math.min(len, COPY_BUF_SIZE));
int read = 0;
while (read < len) {
// Make sure we never read more than len bytes
b.limit(Math.min(len - read, b.remaining()));
final int readNow = input.read(b);
if (readNow <= 0) {
break;
}
output.write(b.array(), 0, readNow);
b.rewind();
read += readNow;
}{code}
> ZipFile fails to read a zipfile with a comment or extra data longer than 8024 bytes
> -----------------------------------------------------------------------------------
>
> Key: COMPRESS-585
> URL: https://issues.apache.org/jira/browse/COMPRESS-585
> Project: Commons Compress
> Issue Type: Bug
> Components: Archivers
> Affects Versions: 1.21
> Reporter: Matthijs Laan
> Priority: Minor
> Attachments: COMPRESS-585-test.patch
>
>
> See the attached patch for a unit test demonstrating the issue with a long comment.
> The cause is that {{ZipFile.readCentralDirectoryEntry()}} calls {{IOUtils.readRange()}} and assumes if it returns less than the length it asked for that the EOF is reached, however this is not how that method works: it returns max COPY_BUF_SIZE bytes (8024), even if EOF has not been reached.
> Besides comments and extra data (in the central directory or local file header) longer than 8024 bytes the only other place {{readRange()}} is called is reading filenames, but that seems like a remote edge case and an EOF exception is fine.
> The IOUtils.readRange() JavaDoc does not specify how it communicates EOF. With a blocking channel this would be when it returns a zero length array. It could throw an exception when {{Channel.read()}} returns 0 bytes, because that only happens on non-blocking channels.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)