Posted to jira@kafka.apache.org by "Jiangtao Liu (Jira)" <ji...@apache.org> on 2020/04/15 21:19:00 UTC

[jira] [Updated] (KAFKA-8270) Kafka timestamp-based retention policy will not work when Kafka client has out of sync system clock issue.

     [ https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiangtao Liu updated KAFKA-8270:
--------------------------------
    Summary: Kafka timestamp-based retention policy will not work when Kafka client has out of sync system clock issue.  (was: Kafka timestamp-based retention policy will not work when Kafka client has system clock issue.)

> Kafka timestamp-based retention policy will not work when Kafka client has out of sync system clock issue.
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-8270
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8270
>             Project: Kafka
>          Issue Type: Bug
>          Components: log, log cleaner, logging
>    Affects Versions: 1.1.1
>            Reporter: Jiangtao Liu
>            Priority: Major
>              Labels: storage
>         Attachments: space issue.png
>
>
> What's the issue?
> {quote} # There are log segments that cannot be deleted even after the configured retention hours have passed.{quote}
> What are the impacts?
> {quote} # Log space keeps increasing and eventually causes a space shortage.
>  # Many log segments are rolled at a smaller size than configured, e.g. a segment may be only 50 MB instead of the expected 1 GB.
>  # Kafka Streams applications or clients may experience missing data.
>  # It can be used as a vector to attack a Kafka server, since a client with a skewed clock can exhaust broker disk space.{quote}
> What workaround can be adopted to resolve this issue?
> {quote} # If it has already happened on your Kafka cluster, you will need to run a very tricky sequence of steps to resolve it.
>  # If it has not happened on your Kafka cluster yet, evaluate whether you can switch log.message.timestamp.type to LogAppendTime. {quote}
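The LogAppendTime mitigation above is a broker-level setting; a minimal sketch of the relevant server.properties entry (the topic-level equivalent, message.timestamp.type, can also be set per topic):

```properties
# server.properties: stamp records with the broker's clock on append,
# instead of trusting the producer-supplied CreateTime
log.message.timestamp.type=LogAppendTime
```

With LogAppendTime, a client with a skewed clock can no longer poison segment timestamps, at the cost of losing the producer's original event time.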
> What are the reproduce steps?
> {quote} # Make sure the Kafka client and server are not hosted on the same machine.
>  # Configure log.message.timestamp.type with *CreateTime* (the default), not LogAppendTime.
>  # Set the Kafka client's system clock to a *future time*, e.g. 03/04/*2025*, 3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752].
>  # Send messages from the Kafka client to the server.{quote}
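Per the epoch converter link above, the future instant in step 3 corresponds to epoch milliseconds 1741130752000; a minimal sketch computing it (this value could also be passed explicitly as the timestamp argument of ProducerRecord instead of skewing the whole system clock):

```java
import java.time.ZonedDateTime;
import java.time.ZoneOffset;

public class FutureTimestamp {
    // 03/04/2025, 3:25:52 PM GMT-08:00 -- the "future time" from the reproduce steps
    public static long futureEpochMs() {
        return ZonedDateTime.of(2025, 3, 4, 15, 25, 52, 0, ZoneOffset.ofHours(-8))
                .toInstant()
                .toEpochMilli();
    }

    public static void main(String[] args) {
        System.out.println(futureEpochMs()); // 1741130752000
    }
}
```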
> What should you check after the messages are handled by the Kafka server?
> {quote} # Check the timestamp values in the log segment's *.timeindex* file. The timestamps will be a future time at or after 03/04/2025, 3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]. _(Let's say 00000000035957300794.log is the log segment that first receives the test client's messages; it is referenced in #3.)_
>  # After testing for a couple of hours, many log segments will be rolled at a smaller size (e.g. 50 MB) than the configured segment size (e.g. 1 GB).
>  # None of the log segments, including 00000000035957300794.* and newer ones, will be deleted after the retention hours pass.{quote}
> What is the particular logic that causes this issue?
> {quote} # No deletable log segments are returned from the following method, because each segment's largest timestamp is in the future relative to the broker's clock:
>  private def deletableSegments(predicate: (LogSegment, Option[LogSegment]) => Boolean) ([https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]).{quote}
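The retention check referenced above can be sketched as a simplified model (an illustration, not Kafka's exact code): a segment is time-deletable only when the broker's clock minus the segment's largest record timestamp exceeds retention.ms, which can never happen while that timestamp sits years in the future.

```java
public class RetentionCheck {

    // Simplified model of the time-based predicate handed to deletableSegments:
    // delete only if the segment's largest record timestamp is older than retention.ms.
    static boolean isDeletable(long nowMs, long segmentLargestTimestampMs, long retentionMs) {
        return nowMs - segmentLargestTimestampMs > retentionMs;
    }

    public static void main(String[] args) {
        long retentionMs = 7L * 24 * 60 * 60 * 1000;   // 7-day retention
        long nowMs = 1_555_000_000_000L;               // broker clock: April 2019
        long futureTs = 1_741_130_752_000L;            // skewed client timestamp: March 2025

        // now - futureTs is negative, so the predicate stays false until the
        // broker's clock catches up to 2025 -- the segment is never deleted.
        System.out.println(isDeletable(nowMs, futureTs, retentionMs));   // false

        // An honest segment older than retention is deletable as expected.
        long oldTs = nowMs - 8L * 24 * 60 * 60 * 1000; // 8 days old
        System.out.println(isDeletable(nowMs, oldTs, retentionMs));      // true
    }
}
```

Because the broker also picks the segment with the largest timestamp when deciding what is "newest," one future-stamped record keeps every later segment alive as well, matching observation #3 above.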



--
This message was sent by Atlassian Jira
(v8.3.4#803005)