Posted to dev@beam.apache.org by Hai Lu <lh...@apache.org> on 2019/01/17 19:27:51 UTC

flink portable runner usage

Hi Thomas,

This is Hai; I work on the portable runner for Samza. I have a few minor
questions that I would like your clarification on.

We chatted briefly at the last Beam meetup, and you mentioned that your Flink
portable runner (Python) is going into production. Are you currently running
Beam Python on Flink in streaming mode or batch mode? And what are your input
sources (Kafka? Kinesis?)

We also talked about how bundling would improve performance considerably. But
it seems that the Flink runner today only does bundling in batch mode, not in
streaming mode. Am I missing something?

BTW, looking forward to the Beam @Lyft meetup in February!

Thanks,
Hai (LinkedIn)

Re: flink portable runner usage

Posted by Thomas Weise <th...@apache.org>.
Hello Hai,

Yes, we are working on a use case for Python/Flink that should go to
production soon. It uses the Flink runner in *streaming* mode. The
source is Kinesis, but we have implemented support for Kafka as well. You can
find that in our Beam fork [1].

The Flink runner supports multi-element bundles in streaming mode. By
default, a bundle is closed after 1000 ms or 1000 elements, whichever comes
first [2].
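
As a rough illustration of the "time limit or element limit" rule described
above, here is a minimal Python sketch. It is not the Flink runner's actual
code: the class BundleLimiter and its methods are hypothetical names made up
for this example, and only the two default values (1000 elements, 1000 ms)
come from the message above.

```python
import time


class BundleLimiter:
    """Hypothetical helper that decides when a bundle should be finished.

    Illustration only; the names here are not part of Beam or Flink.
    """

    def __init__(self, max_elements=1000, max_millis=1000, clock=time.monotonic):
        self.max_elements = max_elements
        self.max_millis = max_millis
        self._clock = clock      # injectable so the logic can be tested
        self._count = 0
        self._started = None     # time at which the first element arrived

    def add(self):
        """Record one element; return True if the bundle should be finished."""
        if self._started is None:
            self._started = self._clock()
        self._count += 1
        return self.should_finish()

    def should_finish(self):
        """True once either the element limit or the time limit is reached."""
        if self._started is None:
            return False
        elapsed_ms = (self._clock() - self._started) * 1000.0
        return self._count >= self.max_elements or elapsed_ms >= self.max_millis

    def reset(self):
        """Start a fresh bundle."""
        self._count = 0
        self._started = None
```

A caller would invoke add() for each incoming element, flush the bundle and
call reset() whenever add() returns True, and also check should_finish() on a
timer so that a slow stream still gets flushed once the time limit passes.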

See you at the meetup!

Thomas

[1]
https://github.com/lyft/beam/blob/release-2.10.0-lyft/runners/flink/src/main/java/org/apache/beam/runners/flink/LyftFlinkStreamingPortableTranslations.java

[2]
https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java#L176

