Posted to jira@kafka.apache.org by "Wenbing Shen (Jira)" <ji...@apache.org> on 2021/04/30 14:49:00 UTC

[jira] [Commented] (KAFKA-12734) LazyTimeIndex & LazyOffsetIndex may cause niobufferoverflow when skip activeSegment sanityCheck

    [ https://issues.apache.org/jira/browse/KAFKA-12734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17337435#comment-17337435 ] 

Wenbing Shen commented on KAFKA-12734:
--------------------------------------

This is a problem we often encountered before LazyIndex was introduced; back then the sanity check would rebuild the index for us. After applying LazyIndex, the check is skipped for all index files, so an abnormal index file can remain in the active segment. Appending data to the log then throws a BufferOverflowException.
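The failure mechanism can be shown with a plain `ByteBuffer` standing in for the memory-mapped index file; this is a minimal sketch of the NIO behavior, not Kafka's actual index code (the 8-byte entry size matches Kafka's offset index, but the buffer and values here are illustrative):

```java
import java.nio.ByteBuffer;
import java.nio.BufferOverflowException;

public class IndexOverflowDemo {
    public static void main(String[] args) {
        // An offset-index entry is 8 bytes (4-byte relative offset + 4-byte file position).
        // This heap buffer stands in for the mapped index file, preallocated for 3 entries.
        ByteBuffer mmap = ByteBuffer.allocate(8 * 3);

        // Simulate the broken startup state described above: the map's position
        // is left at its limit instead of at the end of the entries actually written.
        mmap.position(mmap.limit());

        try {
            mmap.putInt(42); // append one field of an index entry
            System.out.println("append succeeded");
        } catch (BufferOverflowException e) {
            // With position == limit there is no remaining space, so any put() throws.
            System.out.println("BufferOverflowException on append");
        }
    }
}
```

Any relative `put` on a buffer whose position equals its limit throws `BufferOverflowException`, which is exactly what the append path hits once the active segment's index is mapped in this state.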

!image-2021-04-30-22-49-24-202.png!
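The startup scenario from the report below (a preallocated index file whose mapping limit is the file's maximum size, with the position left at that limit) can be sketched with a real `MappedByteBuffer`; file names, sizes, and the position-reset step are illustrative assumptions, not Kafka's actual recovery logic:

```java
import java.io.IOException;
import java.nio.BufferOverflowException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ActiveIndexCheckSketch {
    static final int ENTRY_SIZE = 8; // offset-index entry: 4-byte offset + 4-byte position

    public static void main(String[] args) throws IOException {
        Path index = Files.createTempFile("segment", ".index");
        try (FileChannel ch = FileChannel.open(index,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The index file is preallocated for 10 entries, but suppose only
            // 2 entries were written before the broker was killed mid-startup.
            MappedByteBuffer mmap = ch.map(FileChannel.MapMode.READ_WRITE, 0,
                    ENTRY_SIZE * 10);

            // Broken state: position set to limit, as described in the report.
            mmap.position(mmap.limit());
            try {
                mmap.putLong(123L); // append fails immediately
            } catch (BufferOverflowException e) {
                System.out.println("append failed: " + e.getClass().getSimpleName());
            }

            // What a sanity check on the active segment would effectively do:
            // reset the position to the end of the entries actually written.
            mmap.position(2 * ENTRY_SIZE);
            mmap.putLong(123L);
            System.out.println("append ok after position reset");
        } finally {
            Files.deleteIfExists(index);
        }
    }
}
```

The point of the sketch is that the mapped region itself is fine; only the position is wrong, which is why running the sanity check on the active segment (and only there) is enough to avoid the overflow.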

> LazyTimeIndex & LazyOffsetIndex may cause niobufferoverflow when skip activeSegment  sanityCheck
> ------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-12734
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12734
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 2.4.0, 2.5.0, 2.6.0, 2.7.0, 2.8.0
>            Reporter: Wenbing Shen
>            Assignee: Wenbing Shen
>            Priority: Blocker
>         Attachments: LoadIndex.png, image-2021-04-30-22-49-24-202.png, niobufferoverflow.png
>
>
> This issue is similar to KAFKA-9156.
> The lazy index (LazyTimeIndex/LazyOffsetIndex) lets Kafka skip sanity-checking the index files of all log segments at startup, which has greatly improved our broker startup speed.
> Unfortunately, it also skips the index check for the active segment, even though the active segment still receives write requests from clients and from replica fetcher threads.
> There is a situation where we skip the index check for all segments, do not need to recover any unflushed log segments, and the index file of the active segment happens to be damaged. Appending data to the active segment then makes the broker report an error.
> Below is the problem I encountered in a production environment:
> While Kafka was loading log segments, the broker log showed that the memory-map position of both the time index and the offset index was near the end of the file, even though far fewer index entries had actually been written. My guess is that this happens when the broker is killed partway through startup and then started again: the index file had been preallocated to its maximum size and not yet trimmed to the size actually used, so the memory map's limit was the file's maximum size, and on the next startup the map's position was set to that limit.
> At that point, appending data to the active segment causes a BufferOverflowException.
> I agree with skipping the index check for all inactive segments, since they will no longer receive writes, but for the active segment we still need to check the index files.
> Another situation: even after a clean shutdown, some factor can leave the active segment's index file with its memory-map position set to the limit, causing a BufferOverflowException on write.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)