Posted to user@flink.apache.org by Alexis Sarda-Espinosa <sa...@gmail.com> on 2022/10/11 20:10:28 UTC

Partial broadcast/keyed connected streams

Hi everyone,

I am currently thinking about a use case for a streaming job and, while I'm
fairly certain it cannot be done with the APIs that Flink currently
provides, I figured I'd put it out there in case other users think
something like this would be useful to a wider audience.

The current broadcasting mechanisms offered by Flink mention use cases
where "control events" are needed. In my case I would also have control
events, and I would need to broadcast them to *all parallel instances* of
any downstream operators that consume the events. However, some of those
operators have to be keyed because they are stateful. From the API's point
of view, I'd imagine something like

controlStream.connect(mainStream).broadcastFirstKeySecondBy(keySelector).process(PartiallyKeyedCoProcessFunction)

The function would also have something like processElement1 and
processElement2, but one of those methods wouldn't have access to
partitioned state (or could it have access to state for all key groups
handled by that instance?).
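
To make the intended semantics concrete, here is a minimal plain-Java model of the routing described above: control events are delivered to every parallel instance, while main events go to exactly one instance chosen from the key. This is not Flink code. The class name PartialBroadcastModel, the method names, and the hash-modulo routing are assumptions made for illustration only; Flink's real key-group assignment works differently.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of "broadcast the first input, key the second input".
// Hypothetical names throughout; this does not use the Flink API.
class PartialBroadcastModel {
    // One list per simulated parallel instance, recording what it received.
    final List<List<String>> received = new ArrayList<>();

    PartialBroadcastModel(int parallelism) {
        for (int i = 0; i < parallelism; i++) {
            received.add(new ArrayList<>());
        }
    }

    // Control events are copied to every instance.
    void sendControl(String event) {
        for (List<String> instance : received) {
            instance.add("control:" + event);
        }
    }

    // Main events go to a single instance picked from the key.
    // Math.floorMod keeps the index non-negative for negative hash codes.
    int sendMain(String key, String value) {
        int target = Math.floorMod(key.hashCode(), received.size());
        received.get(target).add("main:" + key + "=" + value);
        return target;
    }
}
```

With a parallelism of 3, one call to sendControl adds an entry to all three instances, whereas two calls to sendMain with the same key both land on the same instance.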

Since I'm not familiar with all of Flink's internals, I don't know if this
would be even remotely feasible, but I'd like to know if others have
opinions on this.

Regards,
Alexis.

Re: Partial broadcast/keyed connected streams

Posted by Alexis Sarda-Espinosa <sa...@gmail.com>.
Oh wow, I had read that documentation so many times and was sure the API
also expected the broadcast side to be keyed like the other side. That's not
the case, so it already does what I was thinking of. Thanks.

Regards,
Alexis.

On Wed, 12 Oct 2022, 03:42 仙路尽头谁为峰, <xl...@163.com> wrote:

> Hi Alexis:
>
>        The broadcast state pattern is implemented by calling connect() on
> the non-broadcast stream, with the BroadcastStream as an argument.
>
>        If the main stream is a KeyedStream, the processElement function
> has access to keyed state, just as with a normal KeyedStream.
>
>
>
> Best Regards!

Re: Partial broadcast/keyed connected streams

Posted by 仙路尽头谁为峰 <xl...@163.com>.
Hi Alexis:
	The broadcast state pattern is implemented by calling connect() on the non-broadcast stream, with the BroadcastStream as an argument.
	If the main stream is a KeyedStream, the processElement function has access to keyed state, just as with a normal KeyedStream.
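
As a rough illustration of this pattern, the plain-Java sketch below mimics the shape of a function with a keyed side and a broadcast side. In Flink the corresponding pieces are KeyedStream.connect(BroadcastStream) and KeyedBroadcastProcessFunction, with its processElement and processBroadcastElement methods. The class here is only a stand-in: KeyedBroadcastSketch, threshold, and counts are hypothetical names, the method signatures are simplified, and nothing runs on the Flink runtime.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for one parallel instance of a keyed + broadcast function.
// Hypothetical names; this does not use the Flink API.
class KeyedBroadcastSketch {
    // Models broadcast state: the same on every instance, written only by the broadcast side.
    private int threshold = Integer.MAX_VALUE;
    // Models keyed state: one counter per key owned by this instance.
    private final Map<String, Integer> counts = new HashMap<>();

    // Broadcast side: receives every control event and has no "current key".
    void processBroadcastElement(int newThreshold) {
        threshold = newThreshold;
    }

    // Keyed side: updates the state of the element's key and reads the broadcast state.
    // Returns true once the key's count has reached the broadcast threshold.
    boolean processElement(String key) {
        int count = counts.merge(key, 1, Integer::sum);
        return count >= threshold;
    }
}
```

In the real API, the context passed to processBroadcastElement also offers applyToKeyedState, which applies a function to the keyed state of all keys on that instance, so the broadcast side is not entirely cut off from keyed state.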

Best Regards!