Posted to issues@commons.apache.org by "Robert Cooper (Jira)" <ji...@apache.org> on 2021/02/03 12:34:00 UTC
[jira] [Created] (IO-718) FileUtils.checksumCRC32 and
FileUtils.checksum are not thread safe
Robert Cooper created IO-718:
--------------------------------
Summary: FileUtils.checksumCRC32 and FileUtils.checksum are not thread safe
Key: IO-718
URL: https://issues.apache.org/jira/browse/IO-718
Project: Commons IO
Issue Type: Bug
Components: Utilities
Affects Versions: 2.8.0
Reporter: Robert Cooper
Calling {{FileUtils.checksumCRC32}} from multiple threads (to improve throughput when calculating CRCs for a large folder) produces incorrect CRC output, because the method is not thread-safe.
The following simple test demonstrates the issue:
{code:java}
@Test
public void should() throws ExecutionException, InterruptedException {
    File testFile = new File("C:\\Temp\\large-file.txt");
    // With a pool size of 1 the test passes:
    // ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
    ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(5);
    try {
        // Compute the CRC of the same file 20 times across the pool.
        List<Future<Long>> futures = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            futures.add(scheduler.submit(() -> FileUtils.checksumCRC32(testFile)));
        }
        List<Long> crcs = new ArrayList<>();
        for (Future<Long> future : futures) {
            crcs.add(future.get());
        }
        // Every CRC should equal the first one.
        Assertions.assertThat(crcs).allMatch(c -> crcs.get(0).equals(c));
    } finally {
        scheduler.shutdown();
    }
}
{code}
In the above code, with a thread-pool size of 1, all calculated CRCs for the file are the same. With a thread-pool size greater than 1, the CRCs differ.
The issue appears to be related to the shared {{SKIP_BYTE_BUFFER}} used by {{IOUtils.consume}}. All threads read into the same buffer while the data is being "discarded". However, {{FileUtils.checksum}} wraps the stream in a {{CheckedInputStream}}, which updates the checksum from the bytes read into that shared buffer. With multiple threads writing to the buffer concurrently, the CRC calculation breaks down.
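For illustration only, here is a minimal sketch of one way to avoid the race described above: each call drains the {{CheckedInputStream}} through its own buffer, so no other thread can overwrite the bytes before the checksum is updated from them. The class and method names ({{PerCallBufferChecksum}}, {{crc32}}, {{crc32Concurrently}}) are hypothetical, the input is an in-memory byte array rather than a file, and this is not necessarily how Commons IO itself should be fixed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

// Hypothetical helper class, not part of Commons IO.
final class PerCallBufferChecksum {

    // Drains the stream through a CheckedInputStream using a buffer that is
    // private to this call. CheckedInputStream updates the checksum from the
    // bytes that land in the buffer, so the buffer must not be shared.
    static long crc32(InputStream in) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192]; // allocated per call, never shared
        try (CheckedInputStream checked = new CheckedInputStream(in, crc)) {
            while (checked.read(buffer, 0, buffer.length) != -1) {
                // The data itself is discarded; only the checksum is kept.
            }
        }
        return crc.getValue();
    }

    // Computes the CRC of the same data `tasks` times on a pool of `threads`
    // threads, mirroring the structure of the test in the report.
    static List<Long> crc32Concurrently(byte[] data, int threads, int tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(() -> crc32(new ByteArrayInputStream(data))));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

The per-call allocation costs a small amount of memory per invocation; a {{ThreadLocal}} buffer would be another option under the same assumption that the buffer contents must stay private to the reading thread.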