Posted to jira@kafka.apache.org by "Jason Gustafson (JIRA)" <ji...@apache.org> on 2017/06/21 22:36:00 UTC

[jira] [Commented] (KAFKA-5490) Deletion of tombstones during cleaning should consider idempotent message retention

    [ https://issues.apache.org/jira/browse/KAFKA-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16058381#comment-16058381 ] 

Jason Gustafson commented on KAFKA-5490:
----------------------------------------

After thinking about this, the cleanest option seems to be to do what we proposed in the original KIP. We remove the record as per the normal cleaning logic, but we retain the batch even if it becomes empty. On later cleaning passes, we would remove the empty batches once the producerId has either expired or appended an entry with a new epoch or sequence. To support this, we would need a slight alteration to the message format semantics to allow empty batches. One possibility is to use offsetDelta=-1.
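To make the two-pass behavior concrete, here is a toy model of the proposal (the data structures, the `clean` function, and the expiry check are illustrative sketches, not Kafka's actual LogCleaner code):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Record:
    offset: int
    key: str
    value: Optional[str]  # None models a tombstone

@dataclass
class Batch:
    producer_id: int
    epoch: int
    last_seq: int
    records: list

def clean(batches, expired_pids):
    """One cleaning pass (illustrative, not Kafka's LogCleaner).

    Records superseded by a later record with the same key are dropped,
    but a batch that becomes empty is retained so the producer's
    (epoch, sequence) state survives. An empty batch is removed only
    once its producerId has expired or has appended a newer batch
    (higher epoch or sequence) elsewhere in the log."""
    latest = {}
    for b in batches:
        for r in b.records:
            latest[r.key] = r.offset
    newest_state = {}
    for b in batches:
        newest_state[b.producer_id] = max(
            newest_state.get(b.producer_id, (-1, -1)), (b.epoch, b.last_seq))
    cleaned = []
    for b in batches:
        kept = [r for r in b.records if latest[r.key] == r.offset]
        if kept:
            cleaned.append(Batch(b.producer_id, b.epoch, b.last_seq, kept))
        else:
            retired = (b.producer_id in expired_pids
                       or newest_state[b.producer_id] > (b.epoch, b.last_seq))
            if not retired:
                # Empty batch (e.g. encoded with offsetDelta=-1) is kept
                # to preserve the producer's last epoch/sequence.
                cleaned.append(Batch(b.producer_id, b.epoch, b.last_seq, []))
    return cleaned

log = [
    Batch(1, 0, 0, [Record(0, "A", "1")]),
    Batch(2, 0, 0, [Record(1, "A", None)]),  # tombstone for key A
]
first = clean(log, expired_pids=set())
# pid 1's batch survives as an empty batch; the tombstone is untouched
later = clean(first, expired_pids={1})
# once pid 1 has expired, the empty batch can finally be removed
```

The point of the sketch is the ordering guarantee: the duplicate key disappears on the first pass (preserving the no-duplicate-keys-before-the-dirty-point invariant), while the producer state lingers only as long as it is actually needed.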

A couple of other options we have considered:

1. We can retain the record, but change the value to null (i.e. make it a tombstone). 
2. We can retain the record, but add a record-level attribute indicating that the entry is pending deletion.

Option 1 is probably off the table since it likely violates existing semantics: a consumer materializing the records into a cache, for example, would see a state which never actually existed if it read up to the modified record. Option 2 seems viable, but like the empty-batch option, it requires a change to the message format semantics, which would likely mean postponing a fix until 0.11.1 (or whatever major release comes next). The advantage of the empty-batch approach is that it preserves the existing log cleaner invariant that there are no duplicate keys before the dirty point. It also does not require the use of any record-level attributes.
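The semantic violation in option 1 can be demonstrated with a small example (hypothetical key/value data; `materialize` is just the usual tombstone-aware cache a compacted-topic consumer builds):

```python
def materialize(records):
    """Build a key-value cache from a compacted topic: a None value
    (tombstone) deletes the key, any other value upserts it."""
    cache = {}
    for key, value in records:
        if value is None:
            cache.pop(key, None)
        else:
            cache[key] = value
    return cache

# Original writes. The intermediate states a consumer could observe are
# {}, {A:1}, {A:1, B:2}, {A:3, B:2}.
original = [("A", "1"), ("B", "2"), ("A", "3")]

# Option 1: the cleaner must retain the first record for its producer
# state, so it rewrites ("A", "1") as a tombstone instead of deleting it.
option1_log = [("A", None), ("B", "2"), ("A", "3")]

# A consumer that reads the rewritten log up to the second record now
# materializes {B: 2} with A absent -- a state that never existed in
# the original sequence of writes.
never_existed = materialize(option1_log[:2])
```

This is exactly the "state which never actually existed" problem: the modified record introduces a transition (A deleted while B is live) that no producer ever wrote.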

> Deletion of tombstones during cleaning should consider idempotent message retention
> -----------------------------------------------------------------------------------
>
>                 Key: KAFKA-5490
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5490
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: clients, core, producer 
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Critical
>             Fix For: 0.11.0.1
>
>
> The LogCleaner always preserves the message containing last sequence from a given ProducerId when doing a round of cleaning. This is necessary to ensure that the producer is not prematurely evicted which would cause an OutOfOrderSequenceException. The problem with this approach is that the preserved message won't be considered again for cleaning until a new message with the same key is written to the topic. Generally this could result in accumulation of stale entries in the log, but the bigger problem is that the newer entry with the same key could be a tombstone. If we end up deleting this tombstone before a new record with the same key is written, then the old entry will resurface. For example, suppose the following sequence of writes:
> 1. ProducerId=1, Key=A, Value=1
> 2. ProducerId=2, Key=A, Value=null (tombstone)
> We will preserve the first entry indefinitely until a new record with Key=A is written AND either ProducerId 1 has written a newer record with a larger sequence number or ProducerId 1 becomes expired. As long as the tombstone is preserved, there is no correctness violation: a consumer reading from the beginning will ignore the first entry after reading the tombstone. But it is possible that the tombstone entry will be removed from the log before a new record with Key=A is written. If that happens, then a consumer reading from the beginning would incorrectly observe the overwritten value.
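The resurfacing scenario in the description above can be sketched with a toy model of the current cleaner behavior (the `compact` function and its parameters are illustrative, not Kafka's actual cleaner API):

```python
def compact(log, pids_to_preserve_last, drop_tombstones=False):
    """Illustrative model: keep the latest record per key, plus the last
    record of each producerId in pids_to_preserve_last (the current
    behavior that pins potentially-stale entries). When drop_tombstones
    is set, tombstones past delete.retention.ms are removed."""
    latest = {}
    last_for_pid = {}
    for off, (pid, key, value) in enumerate(log):
        latest[key] = off
        last_for_pid[pid] = off
    out = []
    for off, (pid, key, value) in enumerate(log):
        is_latest = latest[key] == off
        must_keep = pid in pids_to_preserve_last and last_for_pid[pid] == off
        if value is None and drop_tombstones and not must_keep:
            continue  # tombstone past its retention window
        if is_latest or must_keep:
            out.append((pid, key, value))
    return out

log = [(1, "A", "1"),   # 1. ProducerId=1, Key=A, Value=1
       (2, "A", None)]  # 2. ProducerId=2, Key=A, tombstone

# First cleaning: the pid-1 record is preserved despite being superseded,
# because removing it could prematurely evict the producer's state.
cleaned = compact(log, pids_to_preserve_last={1, 2})

# Later, the tombstone passes delete.retention.ms and is removed while
# the stale pid-1 record is still pinned: the overwritten value
# resurfaces for any consumer reading from the beginning.
resurfaced = compact(cleaned, pids_to_preserve_last={1},
                     drop_tombstones=True)
```

After the second pass, a consumer reading from the beginning materializes Key=A with the old value, even though it had been deleted, which is the correctness violation motivating this issue.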



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)