Posted to dev@kafka.apache.org by "Guozhang Wang (JIRA)" <ji...@apache.org> on 2013/12/20 20:18:09 UTC

[jira] [Commented] (KAFKA-1006) Consumer loses messages of a new topic with auto.offset.reset = largest

    [ https://issues.apache.org/jira/browse/KAFKA-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13854450#comment-13854450 ] 

Guozhang Wang commented on KAFKA-1006:
--------------------------------------

I propose the following fix:

1. Add one more property to ConsumerConfig alongside auto.offset.reset, named new.topic.offset.reset, which can be either largest or smallest and defaults to smallest.

2. In handleTopicEvent, when a new topic is added, record it in a list of new topics.

3. In handleOffsetOutOfRange, if the topic is recorded as a new topic, use the new config; otherwise use the global config.

4. The list will be checked and cleared when offsets are committed.
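
The four steps above could be sketched roughly as follows. This is only an illustration in Java, not the actual consumer code: the class NewTopicOffsetReset, the Reset enum, and the method signatures are hypothetical, new.topic.offset.reset is the proposed (not yet existing) property, and clearing the whole set on commit is one possible reading of step 4.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposal; names and signatures are assumptions,
// not the real Kafka consumer API.
class NewTopicOffsetReset {
    enum Reset { SMALLEST, LARGEST }

    private final Reset autoOffsetReset;     // existing auto.offset.reset
    private final Reset newTopicOffsetReset; // proposed new.topic.offset.reset (step 1)

    // Topics seen as newly created since the last offset commit (step 2).
    private final Set<String> newTopics = ConcurrentHashMap.newKeySet();

    NewTopicOffsetReset(Reset autoOffsetReset, Reset newTopicOffsetReset) {
        this.autoOffsetReset = autoOffsetReset;
        this.newTopicOffsetReset = newTopicOffsetReset;
    }

    // Step 2: called by the topic watcher when a topic is added.
    void handleTopicEvent(String addedTopic) {
        newTopics.add(addedTopic);
    }

    // Step 3: new topics use the new config, all others use the global one.
    Reset resetPolicyFor(String topic) {
        return newTopics.contains(topic) ? newTopicOffsetReset : autoOffsetReset;
    }

    // Step 3, continued: choose the offset to resume from after an
    // out-of-range error, given the partition's earliest and latest offsets.
    long handleOffsetOutOfRange(String topic, long earliestOffset, long latestOffset) {
        return resetPolicyFor(topic) == Reset.SMALLEST ? earliestOffset : latestOffset;
    }

    // Step 4: once offsets are committed, the topics are no longer "new".
    void commitOffsets() {
        newTopics.clear();
    }
}
```

With auto.offset.reset = largest and new.topic.offset.reset = smallest, a topic reported through handleTopicEvent would be consumed from its earliest offset until the next commit, after which it falls back to the global setting.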

> Consumer loses messages of a new topic with auto.offset.reset = largest
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-1006
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1006
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Swapnil Ghike
>            Assignee: Guozhang Wang
>
> The consumer currently uses auto.offset.reset = largest by default. If a new topic is created, the consumer's topic watcher fires. The consumer first finishes partition reassignment as part of the rebalance and then starts consuming from the tail of each partition. The server may have appended messages to the new topic before the partition reassignment completes, and the consumer will not consume these messages. Thus, multiple batches of messages may be lost when a topic is newly created.
> The fix is to start consuming from the earliest offset for newly created topics.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)