Posted to jira@kafka.apache.org by "Tommy Becker (Jira)" <ji...@apache.org> on 2020/07/29 17:47:00 UTC

[jira] [Comment Edited] (KAFKA-10324) Pre-0.11 consumers can get stuck when messages are downconverted from V2 format

    [ https://issues.apache.org/jira/browse/KAFKA-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167400#comment-17167400 ] 

Tommy Becker edited comment on KAFKA-10324 at 7/29/20, 5:46 PM:
----------------------------------------------------------------

[~hachikuji] Regarding why the broker does not seem to send subsequent batches, I'm not sure. But I can tell you I see this behavior even with max.partition.fetch.bytes set to Integer.MAX_VALUE. Maybe this has something to do with down-conversion? Anyway, here's an excerpt from a dump of the segment containing the problematic offset, which is 13920987:

baseOffset: 13920966 lastOffset: 13920987 count: 6 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 49 isTransactional: false isControl: false position: 98516844 CreateTime: 1595224747691 size: 4407 magic: 2 compresscodec: NONE crc: 1598305187 isvalid: true
| offset: 13920978 CreateTime: 1595224747691 keysize: 36 valuesize: 681 sequence: -1 headerKeys: []
| offset: 13920979 CreateTime: 1595224747691 keysize: 36 valuesize: 677 sequence: -1 headerKeys: []
| offset: 13920980 CreateTime: 1595224747691 keysize: 36 valuesize: 680 sequence: -1 headerKeys: []
| offset: 13920984 CreateTime: 1595224747691 keysize: 36 valuesize: 681 sequence: -1 headerKeys: []
| offset: 13920985 CreateTime: 1595224747691 keysize: 36 valuesize: 677 sequence: -1 headerKeys: []
| offset: 13920986 CreateTime: 1595224747691 keysize: 36 valuesize: 680 sequence: -1 headerKeys: []
End of segment is here
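
For context, here is a rough sketch of the consumer setup described above (illustrative only; the broker address, topic name, and group id are placeholders, and you would build it against a 0.10.x kafka-clients dependency to see the pre-0.11 behavior). Even with max.partition.fetch.bytes raised to Integer.MAX_VALUE, a fetch positioned at the compacted-away offset 13920987 comes back empty after down-conversion.

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class StuckFetchSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");   // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "repro-group");            // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // The setting mentioned above; it does not help, because the fetched
        // batch is empty after down-conversion rather than too large.
        props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, Integer.MAX_VALUE);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("some-topic", 0);         // placeholder
            consumer.assign(Collections.singletonList(tp));
            consumer.seek(tp, 13920987L); // the problematic offset from the dump above
            ConsumerRecords<String, String> records = consumer.poll(5000L);
            System.out.println("records returned: " + records.count());      // 0 when stuck
        }
    }
}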

> Pre-0.11 consumers can get stuck when messages are downconverted from V2 format
> -------------------------------------------------------------------------------
>
>                 Key: KAFKA-10324
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10324
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Tommy Becker
>            Priority: Major
>
> As noted in KAFKA-5443, the V2 message format preserves a batch's lastOffset even if the record at that offset is removed by log compaction. If a pre-0.11 consumer seeks to such an offset and issues a fetch, it gets an empty batch, since offsets prior to the requested one are filtered out during down-conversion. KAFKA-5443 added consumer-side logic to advance the fetch offset in this case, but older consumers have no such logic and remain unable to consume these topics.
> The exact behavior varies by consumer version. The 0.10.0.0 consumer throws RecordTooLargeException and dies, assuming the record was not returned because it was too large. The 0.10.1.0 consumer simply spins, fetching the same empty batch over and over.
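
For what it's worth, the consumer-side logic referenced in the description boils down to the following idea. This is a minimal, self-contained sketch, not the actual client code; the FetchResult class and nextPosition method are purely illustrative. When a fetch returns no records but the batch's lastOffset is known, the position is advanced past that offset instead of being retried.

import java.util.OptionalLong;

public class EmptyBatchSkipSketch {

    // Stand-in for the outcome of one fetch: how many records came back and
    // the lastOffset reported for the batch, if any.
    static final class FetchResult {
        final int recordCount;
        final OptionalLong lastOffset;
        FetchResult(int recordCount, OptionalLong lastOffset) {
            this.recordCount = recordCount;
            this.lastOffset = lastOffset;
        }
    }

    static long nextPosition(long currentPosition, FetchResult result) {
        if (result.recordCount == 0
                && result.lastOffset.isPresent()
                && result.lastOffset.getAsLong() >= currentPosition) {
            // Everything up to lastOffset was compacted away; move past it.
            return result.lastOffset.getAsLong() + 1;
        }
        return currentPosition; // otherwise leave the position where it is
    }

    public static void main(String[] args) {
        // The situation from the dump in the comment above: the consumer is
        // stuck at 13920987, the batch's lastOffset, whose record no longer
        // exists after compaction.
        FetchResult emptyBatch = new FetchResult(0, OptionalLong.of(13920987L));
        System.out.println(nextPosition(13920987L, emptyBatch)); // prints 13920988
    }
}

A pre-0.11 consumer has no equivalent of this step, which is why it either dies with RecordTooLargeException or spins on the same empty fetch.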



--
This message was sent by Atlassian Jira
(v8.3.4#803005)