Posted to jira@kafka.apache.org by "Nurlan Turdaliev (Jira)" <ji...@apache.org> on 2021/06/24 13:04:00 UTC

[jira] [Comment Edited] (KAFKA-12378) If a broker is down for more than `delete.retention.ms` deleted records in a compacted topic can come back.

    [ https://issues.apache.org/jira/browse/KAFKA-12378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368821#comment-17368821 ] 

Nurlan Turdaliev edited comment on KAFKA-12378 at 6/24/21, 1:03 PM:
--------------------------------------------------------------------

Voting for this; at least a warning somewhere in the docs would be good. In our case, it wasn't even the leader node that shut down. It was a follower that became leader immediately after startup (which is also suspicious: why would it do so?).


was (Author: entea):
Voting here, at least a warning somewhere in the docs would be good. In our case, it wasn't even the leader node shutting down.

> If a broker is down for more than `delete.retention.ms` deleted records in a compacted topic can come back.
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-12378
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12378
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Shane
>            Priority: Major
>
> If the leader of a compacted topic goes offline, or has replication lag longer than the `delete.retention.ms` of the topic, records that were tombstoned can come back once that broker catches up and becomes the leader again.
>  
> Example of this happening:
>  Topic config:
>     name: compacted-topic
>     settings: delete.retention.ms=0
>     Leader: broker 1
>     ISR: broker 1, broker 2, broker 3
>  
> Producer 1 writes a record `1:foo`
> Producer 1 writes a record `2:bar`
> broker 1 goes offline
> broker 2 takes over leadership
> Producer 1 writes a tombstone `1:NULL`
> broker 2 compacts the topic, which leaves `1:NULL` and `2:bar` in it
> broker 2 removes the tombstone, leaving just `2:bar` in the topic
> broker 1 comes back online, catches up with replication, and takes back leadership
> broker 1 now has `1:foo` and `2:bar` as its data, since the tombstone was deleted before broker 1 could replicate it
> At this point the topic is in a strange state, as the brokers have conflicting data.
>  
>  
> Suggestion:
>  I believe this to be quite a hard problem to solve, so I'm not going to suggest any large changes to the codebase, but I think a warning in the docs about `delete.retention.ms` is warranted.
>  I think adding a note here: [https://docs.confluent.io/platform/current/installation/configuration/topic-configs.html#topicconfigs_delete.retention.ms] that calls out that brokers are also consumers would be helpful, and further documentation about what happens when a broker is offline for more than `delete.retention.ms` would be nice to see. If it helps, I'm happy to write a first draft of the docs update as well.
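
The example sequence in the issue description can be replayed with a small toy model. This is only a sketch under stated assumptions, not Kafka's actual replication or log-cleaner code: `Replica`, `fetch_from`, `compact`, `drop_tombstones`, and `run_scenario` are hypothetical names, the topic has a single partition, and `delete.retention.ms=0` is modeled as "tombstones may be removed right after compaction". Only brokers 1 and 2 from the example are modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Record = Tuple[str, Optional[str]]  # (key, value); a value of None is a tombstone


@dataclass
class Replica:
    """Toy model of one broker's copy of a single-partition compacted topic."""
    log: Dict[int, Record] = field(default_factory=dict)  # offset -> record
    log_end_offset: int = 0  # next offset to be written

    def append(self, key: str, value: Optional[str]) -> None:
        self.log[self.log_end_offset] = (key, value)
        self.log_end_offset += 1

    def compact(self) -> None:
        """Keep only the newest record per key (tombstones survive this step)."""
        latest: Dict[str, int] = {}
        for offset, (key, _) in self.log.items():
            latest[key] = max(offset, latest.get(key, -1))
        self.log = {o: r for o, r in self.log.items() if latest[r[0]] == o}

    def drop_tombstones(self) -> None:
        """Assumption: with delete.retention.ms=0 tombstones are removable at once."""
        self.log = {o: r for o, r in self.log.items() if r[1] is not None}

    def fetch_from(self, leader: "Replica") -> None:
        """Follower catch-up: copy only offsets at or beyond our own log end."""
        for offset in sorted(leader.log):
            if offset >= self.log_end_offset:
                self.log[offset] = leader.log[offset]
        self.log_end_offset = leader.log_end_offset

    def view(self) -> Dict[str, str]:
        """State a consumer would build by reading this replica from the start."""
        state: Dict[str, str] = {}
        for offset in sorted(self.log):
            key, value = self.log[offset]
            if value is None:
                state.pop(key, None)
            else:
                state[key] = value
        return state


def run_scenario() -> Tuple[Dict[str, str], Dict[str, str]]:
    broker1, broker2 = Replica(), Replica()
    # Producer writes 1:foo and 2:bar while broker 1 is the leader.
    broker1.append("1", "foo")
    broker1.append("2", "bar")
    broker2.fetch_from(broker1)
    # Broker 1 goes offline; broker 2 leads and receives the tombstone 1:NULL.
    broker2.append("1", None)
    broker2.compact()          # leaves 1:NULL and 2:bar
    broker2.drop_tombstones()  # leaves only 2:bar
    # Broker 1 returns and fetches from its old log end; the tombstone is gone.
    broker1.fetch_from(broker2)
    return broker1.view(), broker2.view()


if __name__ == "__main__":
    print(run_scenario())
```

In this model broker 1 never sees the tombstone because it was removed from broker 2's log before broker 1 resumed fetching, so broker 1 still holds `1:foo` while broker 2 holds only `2:bar`.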


