Posted to jira@kafka.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/01/02 04:37:00 UTC

[jira] [Commented] (KAFKA-8522) Tombstones can survive forever

    [ https://issues.apache.org/jira/browse/KAFKA-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006582#comment-17006582 ] 

ASF GitHub Bot commented on KAFKA-8522:
---------------------------------------

ConcurrencyPractitioner commented on pull request #7884: [KAFKA-8522] Streamline tombstone and transaction marker removal
URL: https://github.com/apache/kafka/pull/7884
 
 
   The objective of this PR is to prevent tombstones from persisting in logs under low-throughput conditions.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Tombstones can survive forever
> ------------------------------
>
>                 Key: KAFKA-8522
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8522
>             Project: Kafka
>          Issue Type: Improvement
>          Components: log cleaner
>            Reporter: Evelyn Bayes
>            Priority: Minor
>
> This is a bit of a grey area as to whether it's a "bug", but it is certainly unintended behaviour.
>  
> Under specific conditions, tombstones effectively survive forever:
>  * Small amount of throughput;
>  * min.cleanable.dirty.ratio near or at 0; and
>  * Other parameters at default.
> What happens is that all the data continuously gets cycled into the oldest segment. Old records get compacted away, but the new records continuously update the timestamp of the oldest segment, resetting the countdown for deleting tombstones.
> So tombstones build up in the oldest segment forever.
>  
> While you could "fix" this by reducing the segment size, this can be undesirable, as a sudden increase in throughput could cause a dangerous number of segments to be created.
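
For reference, the topic-level configs involved in the quoted report (cleanup.policy, min.cleanable.dirty.ratio, delete.retention.ms, segment.bytes) can be set programmatically with Kafka's AdminClient. Below is a minimal sketch, assuming Kafka 2.3+ (for incrementalAlterConfigs), a broker at localhost:9092, and a hypothetical topic named "example-compacted-topic"; the values shown simply reproduce the conditions and workaround described above.

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class CompactedTopicConfigExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic =
                new ConfigResource(ConfigResource.Type.TOPIC, "example-compacted-topic");

            Collection<AlterConfigOp> ops = Arrays.asList(
                // Compacted topic, the case the report is about.
                new AlterConfigOp(new ConfigEntry("cleanup.policy", "compact"),
                                  AlterConfigOp.OpType.SET),
                // Dirty ratio near 0: one of the conditions under which tombstones persist.
                new AlterConfigOp(new ConfigEntry("min.cleanable.dirty.ratio", "0.01"),
                                  AlterConfigOp.OpType.SET),
                // Tombstone retention window (24h, the default) whose countdown keeps being
                // reset when new records update the oldest segment's timestamp.
                new AlterConfigOp(new ConfigEntry("delete.retention.ms", "86400000"),
                                  AlterConfigOp.OpType.SET),
                // The "reduce the segment size" workaround: 100 MB instead of the 1 GB default.
                new AlterConfigOp(new ConfigEntry("segment.bytes", "104857600"),
                                  AlterConfigOp.OpType.SET)
            );

            Map<ConfigResource, Collection<AlterConfigOp>> request =
                Collections.singletonMap(topic, ops);
            admin.incrementalAlterConfigs(request).all().get();
        }
    }
}

This only mirrors the report's own suggestion; as noted above, shrinking segment.bytes trades stuck tombstones for the risk of a large number of small segments if throughput suddenly rises.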



--
This message was sent by Atlassian Jira
(v8.3.4#803005)