Posted to jira@kafka.apache.org by "Greg Harris (Jira)" <ji...@apache.org> on 2023/02/22 23:34:00 UTC

[jira] [Commented] (KAFKA-5827) Allow configuring Kafka sink connectors to start processing records from the end of topics

    [ https://issues.apache.org/jira/browse/KAFKA-5827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692413#comment-17692413 ] 

Greg Harris commented on KAFKA-5827:
------------------------------------

This is controllable via the Client Override feature, KIP-458 [https://cwiki.apache.org/confluence/display/KAFKA/KIP-458%3A+Connector+Client+Config+Override+Policy].
You can set the `consumer.override.auto.offset.reset` configuration property to `latest` in a connector configuration so that, when no committed offsets exist, the consumer begins reading from the end of each partition. The worker's `connector.client.config.override.policy` must permit this override. After the connector commits offsets, restarts resume from the last committed offset, so no records are lost and previously committed records are not re-read.
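
As a rough illustration of the override described above, the following Python sketch builds a sink connector configuration with `consumer.override.auto.offset.reset` set to `latest`. The connector name, topic, and output file are hypothetical placeholders, and the helper `start_from_end` is introduced only for this example; the sketch also assumes the worker's override policy allows the property.

```python
import json

# Hypothetical sink connector settings; the name, topic, and file path are
# placeholders for illustration, not values taken from this issue.
CONNECTOR_NAME = "example-sink"
BASE_CONFIG = {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "topics": "example-topic",
    "file": "/tmp/example-sink.txt",
}


def start_from_end(config):
    """Return a copy of config whose consumer starts at the end of each
    partition when no committed offset exists (KIP-458 client override)."""
    updated = dict(config)  # leave the caller's dict untouched
    updated["consumer.override.auto.offset.reset"] = "latest"
    return updated


connector_config = start_from_end(BASE_CONFIG)

# Shape of a request body for creating a connector: a name plus a config map.
request_body = json.dumps(
    {"name": CONNECTOR_NAME, "config": connector_config}, indent=2
)
print(request_body)
```

The printed JSON could then be submitted to the Connect REST API's `/connectors` endpoint. Once the connector has committed offsets, the `latest` setting no longer applies and the consumer resumes from the committed position.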

> Allow configuring Kafka sink connectors to start processing records from the end of topics
> ------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-5827
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5827
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>            Reporter: Behrang Saeedzadeh
>            Priority: Major
>
> As far as I can see, Kafka connectors start exporting a topic's data from the beginning of its partitions. We have a topic that contains a few million old records that we don't need; we would like to export only the new records that are added to the topic.
> Basically:
> * When the connector is started for the first time and it does not have a current offset stored, it should start consuming data from the end of topic partitions
> * When the connector is restarted and has a current offset for partitions stored somewhere, it should start from those offsets


