You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@flink.apache.org by "Tzu-Li (Gordon) Tai (Jira)" <ji...@apache.org> on 2022/06/29 17:45:00 UTC

[jira] [Comment Edited] (FLINK-28303) Kafka SQL Connector loses data when restoring from a savepoint with a topic with empty partitions

    [ https://issues.apache.org/jira/browse/FLINK-28303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17560595#comment-17560595 ] 

Tzu-Li (Gordon) Tai edited comment on FLINK-28303 at 6/29/22 5:44 PM:
----------------------------------------------------------------------

[~rmetzger] have you tried setting the "auto.offset.reset" property to "earliest"? The default of that value is "latest".
That config dictates the position to start from for a partition when 1) no records have been consumed yet from a partition, and 2) when an attempted offset to read from is out of bounds.

If the records are consistently present after changing that config to "earliest", then it should prove your theory.
In that case, we should fix the Kafka connector so that as soon as a partition is discovered, we should already write its offset into state, probably with a special "earliest" value/marker of some sort.


was (Author: tzulitai):
[~rmetzger] have you tried setting the "auto.offset.reset" property to "earliest"? The default of that value is "latest".
That config dictates the position to start from for a partition when 1) no records have been consumed yet from a partition, and 2) when an attempted offset to read from is out of bounds.

If the records are present after changing that config to "earliest", then it should prove your theory.
In that case, we should fix the Kafka connector so that as soon as a partition is discovered, we should already write its offset into state, probably with a special "earliest" value/marker of some sort.

> Kafka SQL Connector loses data when restoring from a savepoint with a topic with empty partitions
> -------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-28303
>                 URL: https://issues.apache.org/jira/browse/FLINK-28303
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Kafka
>    Affects Versions: 1.14.4
>            Reporter: Robert Metzger
>            Priority: Major
>
> Steps to reproduce:
> - Set up a Kafka topic with 10 partitions
> - produce records 0-9 into the topic
> - take a savepoint and stop the job
> - produce records 10-19 into the topic
> - restore the job from the savepoint.
> The job will be missing usually 2-4 records from 10-19.
> My assumption is that if a partition never had data (which is likely with 10 partitions and 10 records), the savepoint will only contain offsets for partitions with data. 
> While the job was offline (and we've written record 10-19 into the topic), all partitions got filled. Now, when Kafka comes online again, it will use the "latest" offset for those partitions, skipping some data.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)