Posted to user-zh@flink.apache.org by Jin Yi <el...@gmail.com> on 2020/03/01 19:23:06 UTC

[Question] enable end2end Kafka exactly once processing

Hi experts,

My application uses Apache Beam with Flink as the runner. My
source and sink are Kafka topics, and I am using the KafkaIO connector
provided by Apache Beam to consume and publish.

I am reading through Beam's Javadoc:
https://beam.apache.org/releases/javadoc/2.16.0/org/apache/beam/sdk/io/kafka/KafkaIO.WriteRecords.html#withEOS-int-java.lang.String-

It looks like Beam does not support EOS with the Flink runner. Can someone
please shed some light on how to enable exactly-once processing with
Apache Beam?
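
For reference, the write side of my pipeline looks roughly like the sketch
below (broker address, topic name, key/value types and the sink group id are
placeholders for illustration, not my real config):

    import org.apache.beam.sdk.io.kafka.KafkaIO;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.kafka.common.serialization.StringSerializer;

    // 'output' is the keyed PCollection produced earlier in the pipeline
    void addKafkaSink(PCollection<KV<String, String>> output) {
      output.apply(KafkaIO.<String, String>write()
          .withBootstrapServers("broker:9092")          // placeholder broker
          .withTopic("output-topic")                    // placeholder topic
          .withKeySerializer(StringSerializer.class)
          .withValueSerializer(StringSerializer.class)
          .withEOS(1, "eos-sink-group"));               // the EOS mode in question
    }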

Thanks a lot!
Eleanore

Re: [Question] enable end2end Kafka exactly once processing

Posted by Arvid Heise <ar...@ververica.com>.
Hi Eleanore,

the Flink runner is maintained by the Beam developers, so it's best to ask
on their user list.

The documentation is, however, very clear: "Flink runner is one of the
runners whose checkpoint semantics are not compatible with current
implementation (hope to provide a solution in near future)."
So Beam uses a different approach to EOS than Flink, and there is currently
no way around it. Maybe you could use Flink's exactly-once Kafka sink
directly and use that in Beam somehow.
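
For the second option, the Flink sink would be configured roughly like this
(a minimal sketch against the universal Kafka connector of Flink 1.9/1.10;
broker, topic, and timeout values are placeholders, and checkpointing must
be enabled in the job for exactly-once to take effect):

    import java.nio.charset.StandardCharsets;
    import java.util.Properties;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
    import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
    import org.apache.kafka.clients.producer.ProducerRecord;

    void addExactlyOnceKafkaSink(DataStream<String> stream) {
      Properties props = new Properties();
      props.setProperty("bootstrap.servers", "broker:9092");  // placeholder broker
      // must not exceed the broker's transaction.max.timeout.ms
      props.setProperty("transaction.timeout.ms", "900000");

      FlinkKafkaProducer<String> sink = new FlinkKafkaProducer<>(
          "output-topic",                                      // placeholder default topic
          (KafkaSerializationSchema<String>) (element, timestamp) ->
              new ProducerRecord<byte[], byte[]>("output-topic",
                  element.getBytes(StandardCharsets.UTF_8)),
          props,
          FlinkKafkaProducer.Semantic.EXACTLY_ONCE);           // transactional writes

      stream.addSink(sink);
    }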

I'm not aware of any work with the Beam devs to actually make it work.
Independently, we have started to improve our interfaces for two-phase
commit sinks (which is our approach); that might coincidentally help Beam.
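
To give an idea of the shape of that interface: a transactional sink in
Flink today extends TwoPhaseCommitSinkFunction and fills in the hooks below
(a bare skeleton; MyTransaction is a hypothetical handle for one open
transaction, not a real class):

    import org.apache.flink.api.common.ExecutionConfig;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.common.typeutils.base.VoidSerializer;
    import org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction;

    public class MyTransactionalSink
        extends TwoPhaseCommitSinkFunction<String, MyTransactionalSink.MyTransaction, Void> {

      // Hypothetical handle for one open transaction (e.g. a wrapped producer).
      public static class MyTransaction implements java.io.Serializable {}

      public MyTransactionalSink() {
        super(TypeInformation.of(MyTransaction.class).createSerializer(new ExecutionConfig()),
            VoidSerializer.INSTANCE);
      }

      @Override
      protected MyTransaction beginTransaction() {
        // open a fresh transaction before the first record of a checkpoint interval
        return new MyTransaction();
      }

      @Override
      protected void invoke(MyTransaction txn, String value, Context context) {
        // write the record into the open transaction (not yet visible downstream)
      }

      @Override
      protected void preCommit(MyTransaction txn) {
        // flush pending writes when the checkpoint barrier arrives
      }

      @Override
      protected void commit(MyTransaction txn) {
        // make the data visible once the checkpoint has completed
      }

      @Override
      protected void abort(MyTransaction txn) {
        // roll the transaction back on failure or recovery
      }
    }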

Best,

Arvid

On Sun, Mar 1, 2020 at 8:23 PM Jin Yi <el...@gmail.com> wrote:

> Hi experts,
>
> My application uses Apache Beam with Flink as the runner. My
> source and sink are Kafka topics, and I am using the KafkaIO connector
> provided by Apache Beam to consume and publish.
>
> I am reading through Beam's Javadoc:
> https://beam.apache.org/releases/javadoc/2.16.0/org/apache/beam/sdk/io/kafka/KafkaIO.WriteRecords.html#withEOS-int-java.lang.String-
>
> It looks like Beam does not support EOS with the Flink runner. Can someone
> please shed some light on how to enable exactly-once processing with
> Apache Beam?
>
> Thanks a lot!
> Eleanore
>
