Posted to user@beam.apache.org by Alexey Romanenko <ar...@gmail.com> on 2022/10/26 16:08:56 UTC

Re: Request to suggest alternative approaches for side input use cases in apache beam

Well, it depends on how you use the Redis cache and how often its contents change.

For example, if you need to query the cache for a group of input records, you can group those records into batches and make only one remote call to the cache before processing each batch, as explained in [1].
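
To illustrate the batching idea, here is a minimal sketch in plain Python, not a Beam pipeline. FakeCache, enrich and batched are hypothetical names used only for illustration; mget is modelled on the Redis MGET command, and the records and cached values are made up. In a real pipeline the batching step would be done by a transform such as GroupIntoBatches, as described in [1].

```python
from itertools import islice


class FakeCache:
    """Hypothetical stand-in for a Redis client; counts remote round trips."""

    def __init__(self, data):
        self._data = data
        self.calls = 0

    def mget(self, keys):
        # One call returns the values for all requested keys (None if missing).
        self.calls += 1
        return [self._data.get(k) for k in keys]


def batched(records, batch_size):
    """Yield lists of at most batch_size records."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def enrich(records, cache, batch_size=3):
    """Attach the cached value to each (key, payload) record, one lookup per batch."""
    out = []
    for batch in batched(records, batch_size):
        # De-duplicate keys so each key is requested once per batch.
        keys = sorted({key for key, _ in batch})
        values = dict(zip(keys, cache.mget(keys)))
        out.extend((key, payload, values[key]) for key, payload in batch)
    return out


# Made-up sample data: four records, batch size three, so two cache calls.
cache = FakeCache({"u1": "gold", "u2": "silver"})
records = [("u1", "click"), ("u2", "scroll"), ("u1", "view"), ("u3", "click")]
result = enrich(records, cache, batch_size=3)
```

With this layout the number of remote calls grows with the number of batches rather than the number of records, at the cost of the cached values being only as fresh as the most recent batch lookup.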

In any case, more details about your use case, and about why the side input approach doesn’t work well for you, would be helpful.

[1] https://beam.apache.org/documentation/patterns/grouping-elements-for-efficient-external-service-calls/

—
Alexey

> On 29 Sep 2022, at 15:10, Chinni, Madhavi via user <us...@beam.apache.org> wrote:
> 
> Hi,
>  
> We have a stream processing pipeline which processes customer UI interaction data.
> As part of the pipeline, we read information from an AWS Redis cache and store it in a PCollectionView. The PCollectionView is then accessed as a side input in the subsequent CombineFnWithContext accumulators and transform functions in the pipeline.
> Could you please suggest an alternative approach that avoids using a side input to access the Redis cache information in the subsequent functions in the pipeline?
>  
> Thanks,
> Madhavi