You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Noel OConnor <no...@gmail.com> on 2022/09/17 16:29:22 UTC

Enrichment of stream from another stream

Hi,
I'm trying to determine the best way to enrich the event payload of a
fast moving incoming stream with values in another stream which is far
more slow moving.
I'm converting the second stream into a table for continuous query
functionality and I wonder what is the best way to take the values of
that query and enrich the fast moving stream.

Is it best to store the output of the continuous query in a value
state and access this in a process function being applied to the fast
moving stream?
Or do I execute the query on the table created by the slow moving
stream as part of a map function on the fast moving stream.

I suspect there's multiple ways to do this but I want to use the more
appropriate method for flink.


cheers
Noel

Re: Enrichment of stream from another stream

Posted by Noel OConnor <no...@gmail.com>.
Thanks Marco,
I'll give it a try.

cheers
Noel

On Sat, Sep 17, 2022 at 7:14 PM Marco Villalobos
<mv...@kineteque.com> wrote:
>
> I might need more details, but conceptually, streams can be thought of as never ending tables
> and our code as functions applied to them.
>
> JOIN is a concept supported in the SQL API and DataStream API.
>
> However, the SQL API is more succinct (unlike my writing ;).
>
> So, how about the "fast stream" mapped to an SQL API Table and
> the "slow" table mapped to SQL API versioned table that is joined with a "temporal join."
>
> I'd try to use the SQL for the first part of the job to make this join, and then if I need the DataStream API convert it.
>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/joins/#temporal-joins
>
>
> On Sep 17, 2022, at 9:29 AM, Noel OConnor <no...@gmail.com> wrote:
>
> Hi,
> I'm trying to determine the best way to enrich the event payload of a
> fast moving incoming stream with values in another stream which is far
> more slow moving.
> I'm converting the second stream into a table for continuous query
> functionality and I wonder what is the best way to take the values of
> that query and enrich the fast moving stream.
>
> Is it best to store the output of the continuous query in a value
> state and access this in a process function being applied to the fast
> moving stream?
> Or do I execute the query on the table created by the slow moving
> stream as part of a map function on the fast moving stream.
>
> I suspect there's multiple ways to do this but I want to use the more
> appropriate method for flink.
>
>
> cheers
> Noel
>
>

Re: Enrichment of stream from another stream

Posted by Marco Villalobos <mv...@kineteque.com>.
I might need more details, but conceptually, streams can be thought of as never ending tables
and our code as functions applied to them.

JOIN is a concept supported in the SQL API and DataStream API.

However, the SQL API is more succinct (unlike my writing ;).

So, how about the "fast stream" mapped to an SQL API Table and 
the "slow" table mapped to SQL API versioned table that is joined with a "temporal join."

I'd try to use the SQL for the first part of the job to make this join, and then if I need the DataStream API convert it.

https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/joins/#temporal-joins <https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/joins/#temporal-joins>


> On Sep 17, 2022, at 9:29 AM, Noel OConnor <no...@gmail.com> wrote:
> 
> Hi,
> I'm trying to determine the best way to enrich the event payload of a
> fast moving incoming stream with values in another stream which is far
> more slow moving.
> I'm converting the second stream into a table for continuous query
> functionality and I wonder what is the best way to take the values of
> that query and enrich the fast moving stream.
> 
> Is it best to store the output of the continuous query in a value
> state and access this in a process function being applied to the fast
> moving stream?
> Or do I execute the query on the table created by the slow moving
> stream as part of a map function on the fast moving stream.
> 
> I suspect there's multiple ways to do this but I want to use the more
> appropriate method for flink.
> 
> 
> cheers
> Noel