Posted to jira@kafka.apache.org by "Jason Gustafson (Jira)" <ji...@apache.org> on 2020/04/08 18:44:00 UTC

[jira] [Updated] (KAFKA-9835) Race condition with concurrent write allows reads above high watermark

     [ https://issues.apache.org/jira/browse/KAFKA-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Gustafson updated KAFKA-9835:
-----------------------------------
    Affects Version/s: 2.2.2
                       2.3.1
                       2.4.1

> Race condition with concurrent write allows reads above high watermark
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-9835
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9835
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.2.2, 2.3.1, 2.4.1
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Major
>
> Kafka's log implementation serializes all writes using a lock, but allows multiple concurrent reads while that lock is held. The `FileRecords` class contains the core implementation. Reads from the log create logical slices of `FileRecords`, which are then passed to the network layer for sending. An abridged version of the implementation of `slice` is provided below:
> {code}
>     public FileRecords slice(int position, int size) throws IOException {
>         int end = this.start + position + size;
>         // handle integer overflow or if end is beyond the end of the file
>         if (end < 0 || end >= start + sizeInBytes())
>             end = start + sizeInBytes();
>         return new FileRecords(file, channel, this.start + position, end, true);
>     }
> {code}
> The `size` parameter here is typically derived from the fetch size, but is upper-bounded with respect to the high watermark. The two calls to `sizeInBytes` here are problematic because the size of the file may change in between them. Specifically, a concurrent write may increase the size of the file after the first call to `sizeInBytes` but before the second one, so that `end` is reset to a position beyond the one originally requested. In the worst case, when `size` defines the limit of the high watermark, this can lead to a slice containing uncommitted data.
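
One way to avoid the double read described above is to take a single snapshot of the size and use it for both the comparison and the assignment. The following is only a minimal sketch of that idea, not the fix applied in Kafka: the class `SliceBoundsSketch`, its `append` and `sliceEnd` methods, and the `AtomicInteger` standing in for the file size are all hypothetical names introduced for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for FileRecords; only the end-bound computation is modeled.
public class SliceBoundsSketch {
    private final int start;
    // Assumption: an AtomicInteger models a file size that a concurrent writer may grow.
    private final AtomicInteger size;

    public SliceBoundsSketch(int start, int initialSize) {
        this.start = start;
        this.size = new AtomicInteger(initialSize);
    }

    public int sizeInBytes() {
        return size.get();
    }

    // Simulates a write that appends bytes to the file.
    public void append(int bytes) {
        size.addAndGet(bytes);
    }

    // Computes the end position of a slice, reading the size exactly once.
    public int sliceEnd(int position, int requestedSize) {
        int limit = start + sizeInBytes(); // single snapshot, reused below
        int end = start + position + requestedSize;
        // handle integer overflow or if end is beyond the snapshot of the file end
        if (end < 0 || end > limit)
            end = limit;
        return end;
    }

    public static void main(String[] args) {
        SliceBoundsSketch log = new SliceBoundsSketch(0, 100);
        System.out.println(log.sliceEnd(0, 50));  // prints 50
        log.append(100);
        System.out.println(log.sliceEnd(0, 500)); // prints 200
    }
}
```

Because the comparison and the assignment use the same `limit`, the computed end is never larger than `start + position + requestedSize`, regardless of writes that land while `sliceEnd` is running.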



--
This message was sent by Atlassian Jira
(v8.3.4#803005)