Posted to user@flink.apache.org by Piotr Domagalski <pi...@domagalski.com> on 2022/05/12 09:28:08 UTC

At-least once sinks and their behaviour in a non-failure scenario

Hi,

I'm planning to build a pipeline that uses a Kafka source, some stateful
transformation and a RabbitMQ sink. What I don't yet fully understand is
how often I should expect the "at-least-once" scenario (i.e. seeing
duplicates) on the sink side. The case where things start failing is
clear to me, but what happens when I want to gracefully stop the Flink
job?
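
For context, here is a minimal sketch of the topology I have in mind.
The broker addresses, topic and queue names are placeholders, and the
map function is just a stand-in for my real stateful transformation:

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.rabbitmq.RMQSink;
    import org.apache.flink.streaming.connectors.rabbitmq.common.RMQConnectionConfig;

    public class KafkaToRabbitJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // Offsets are committed back to Kafka when a checkpoint completes.
            KafkaSource<String> source = KafkaSource.<String>builder()
                    .setBootstrapServers("kafka:9092")         // placeholder
                    .setTopics("input-topic")                  // placeholder
                    .setGroupId("my-pipeline")                 // placeholder
                    .setValueOnlyDeserializer(new SimpleStringSchema())
                    .build();

            // Stand-in for the stateful transformation.
            DataStream<String> transformed = env
                    .fromSource(source, WatermarkStrategy.noWatermarks(), "kafka")
                    .keyBy(value -> value)
                    .map(String::toUpperCase);

            // RMQSink is an at-least-once sink: it does not write
            // to RabbitMQ transactionally.
            RMQConnectionConfig rmq = new RMQConnectionConfig.Builder()
                    .setHost("rabbitmq")                       // placeholder
                    .setPort(5672)
                    .setUserName("guest")
                    .setPassword("guest")
                    .setVirtualHost("/")
                    .build();
            transformed.addSink(
                    new RMQSink<>(rmq, "output-queue", new SimpleStringSchema()));

            env.execute("kafka-to-rabbitmq");
        }
    }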

Am I right in thinking that when I gracefully stop a job with a final
savepoint [1], the Kafka source stops consuming, a checkpoint barrier is
sent through the pipeline, and this flushes the sink completely? So my
understanding is that if nothing fails and the Kafka offset is
committed, then when the job is started again from that savepoint, no
duplicates will be sent to RabbitMQ. Is that correct?
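
Concretely, by "gracefully stop" I mean stop-with-savepoint via the CLI
as described in [1] (the savepoint directory and job ID below are
placeholders):

    $ ./bin/flink stop --savepointPath /tmp/flink-savepoints <jobId>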

Thanks!

[1]
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/cli/#stopping-a-job-gracefully-creating-a-final-savepoint

-- 
Piotr Domagalski

Re: At-least once sinks and their behaviour in a non-failure scenario

Posted by Alexander Preuß <al...@ververica.com>.
Hi Piotr,

You are correct regarding the savepoint: stop-with-savepoint first stops
the source, then sends a barrier through the pipeline that flushes all
in-flight records to the sink before the job terminates. If nothing
fails and the job is later resumed from that savepoint, there should be
no duplicates sent to RabbitMQ.
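
One caveat: duplicates are still possible after a failure, because the
RabbitMQ sink does not write transactionally. With checkpointing
enabled, any records delivered to RabbitMQ after the last completed
checkpoint will be replayed when the job restores from it. A minimal
sketch of that configuration (the interval is an arbitrary example):

    import org.apache.flink.streaming.api.CheckpointingMode;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class CheckpointSetup {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();
            // Exactly-once applies to Flink state; delivery to RabbitMQ is
            // still at-least-once because the sink is not transactional.
            env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);
            // ... build the pipeline and call env.execute() as usual ...
        }
    }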

Best regards,
Alexander


-- 

Alexander Preuß | Engineer - Data Intensive Systems

alexanderpreuss@ververica.com

--

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
