Posted to user@beam.apache.org by Sigalit Eliazov <e....@gmail.com> on 2024/02/14 13:30:01 UTC

Beam SQL JOIN with Side Inputs

Hi,

We are currently working on a use case that involves streaming usage data
arriving every 5 minutes, and we have a few dimension tables that undergo
changes once a day.

Our attempt to implement a join between these tables using Beam SQL
encountered a limitation. Specifically, Beam SQL requires both unbounded
streams to share the same windows without allowed lateness.

Since our "side input" collection updates daily, we are unable to set both
collections to the same non-global windows.

Our question is whether it is possible to achieve this using Beam SQL, or
whether we need to implement a custom solution, meaning a DoFn with side
inputs.
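
To make the custom option concrete, here is a minimal sketch of the
lookup-style join that a DoFn with a map-shaped side input would perform per
element. It is plain Java with no Beam dependency, so it only illustrates the
join logic. The class, field, and method names (Usage, Enriched, enrich,
deviceId, customer) are hypothetical and not taken from our pipeline. In a
real pipeline the map would presumably come from View.asMap() on the
dimension collection and be read with c.sideInput(...) inside a ParDo
configured via withSideInputs(...).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SideInputJoinSketch {

    // Hypothetical streaming usage event; field names are illustrative only.
    static final class Usage {
        final String deviceId;
        final long bytes;

        Usage(String deviceId, long bytes) {
            this.deviceId = deviceId;
            this.bytes = bytes;
        }
    }

    // Hypothetical joined output row.
    static final class Enriched {
        final String deviceId;
        final long bytes;
        final String customer;

        Enriched(String deviceId, long bytes, String customer) {
            this.deviceId = deviceId;
            this.bytes = bytes;
            this.customer = customer;
        }

        @Override
        public String toString() {
            return deviceId + "," + bytes + "," + customer;
        }
    }

    // Stands in for the body of a DoFn's processElement: one lookup per
    // streaming element against the current snapshot of the dimension table.
    static List<Enriched> enrich(List<Usage> batch, Map<String, String> dimension) {
        List<Enriched> out = new ArrayList<>();
        for (Usage u : batch) {
            // Left-join semantics (an assumption): keep the event even when
            // the dimension snapshot has no matching key.
            String customer = dimension.getOrDefault(u.deviceId, "UNKNOWN");
            out.add(new Enriched(u.deviceId, u.bytes, customer));
        }
        return out;
    }

    public static void main(String[] args) {
        // Daily dimension snapshot, keyed by the join column.
        Map<String, String> snapshot = new HashMap<>();
        snapshot.put("dev-1", "acme");

        // One 5-minute batch of usage events.
        List<Usage> batch = List.of(new Usage("dev-1", 100L), new Usage("dev-2", 50L));

        for (Enriched e : enrich(batch, snapshot)) {
            System.out.println(e);
        }
    }
}
```

With ten dimension tables, this shape would mean ten such maps, each of which
has to fit in worker memory.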

In addition, are there any limitations, beyond performance considerations,
when it comes to joining 10 different streams?


Thanks, Sigalit