Posted to dev@beam.apache.org by John Casey via dev <de...@beam.apache.org> on 2022/07/15 18:10:06 UTC

Disabling Kafka IO SDF implementation

Hi All,

There is currently an issue where Kafka IO's SDF implementation doesn't
resume properly when the pipeline restarts
(https://github.com/apache/beam/issues/21730).

In addition, there was an issue where Kafka SDF wasn't committing properly
when 'commit in finalize' was specified, and I believe there may also be an
issue with restriction tracking, though I haven't confirmed that.

Because of these issues, I don't have much trust in Kafka SDF. Since these
aren't edge cases, I would expect far more reports if Kafka SDF were widely
used, so I don't believe there are many Kafka SDF users at the moment.

As such, I've raised https://github.com/apache/beam/pull/22261 to disable
Kafka SDF temporarily. (Thanks to @Chamikara Jayalath <ch...@google.com> for
setting up the new experiment that will allow Kafka Unbounded to continue
working wrapped in SDF for runners that require SDF.)
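
For readers who want a picture of the resulting behavior, here is a rough
sketch of the selection logic as described above. It is not the actual
KafkaIO code: the class, the enum, and the experiment name
"example_wrap_unbounded_in_sdf" are all hypothetical, and the real flag and
conditions live in the PR. The only thing taken from this thread is the idea
that the native SDF read is off, and that the UnboundedSource read is wrapped
in SDF when a runner requires SDF.

```java
import java.util.Set;

public class KafkaReadSelection {
    // Hypothetical experiment name, for illustration only; see the PR for the real flag.
    static final String WRAP_EXPERIMENT = "example_wrap_unbounded_in_sdf";

    // The two read paths that remain once the native Kafka SDF read is disabled.
    enum ReadPath { UNBOUNDED_SOURCE, UNBOUNDED_SOURCE_WRAPPED_IN_SDF }

    // Assumption: wrapping is chosen when the runner requires SDF or the experiment is set.
    static ReadPath choose(boolean runnerRequiresSdf, Set<String> experiments) {
        if (runnerRequiresSdf || experiments.contains(WRAP_EXPERIMENT)) {
            return ReadPath.UNBOUNDED_SOURCE_WRAPPED_IN_SDF;
        }
        // Otherwise fall back to the plain UnboundedSource-based read.
        return ReadPath.UNBOUNDED_SOURCE;
    }

    public static void main(String[] args) {
        System.out.println(choose(false, Set.of()));
        System.out.println(choose(true, Set.of()));
        System.out.println(choose(false, Set.of(WRAP_EXPERIMENT)));
    }
}
```

In this sketch neither path is the native Kafka SDF implementation, which is
the point of the change: pipelines keep reading from Kafka either way.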

In addition, https://github.com/apache/beam/issues/22303 tracks the work
required to test, fix, and ensure that Kafka SDF is stable.

Thanks,
John

Re: Disabling Kafka IO SDF implementation

Posted by Chamikara Jayalath via dev <de...@beam.apache.org>.
Thanks John.

+1 for disabling SDF Kafka until we are confident in it. As John
mentioned, we already have the ability to wrap the UnboundedSource-based
Kafka read with SDF, so this should not break Kafka support for existing
runners that require SDF.

- Cham
