You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@beam.apache.org by Sofia’s World <mm...@gmail.com> on 2021/09/24 09:28:54 UTC

Limit the concurrency of a Beam Step (or all the steps)

Hello
 i was wondering if it's somehow possible to limit the concurrency of a
beam Step?
i have a workflow which involves a Webclient that uses an API for which  my
account has a max of 300/requests per minute...
Alternatively, will i have to go through a combine and custom ParDo ?

Has anyone came across this Usecase>?

kind regards
Marco

Re: Limit the concurrency of a Beam Step (or all the steps)

Posted by Cristian Constantinescu <ze...@gmail.com>.
Hey Marco,

Other more senior people can correct me here. About limiting the
concurrency aspect of things. Beam/Runners split PCollections of <Key,
Value> by Key. So as long as all your items have the same key, I think it
will create only one executor for that ParDo. So that's what I did recently:

1. create a PTransform (to hide this logic from the rest of the pipeline)
2. In the PTransform apply: input.apply(WithKeys.of("")) (basically making
all the items go to the same executor)
3. In the PTransform create a private DoFn
4. the DoFn has a Timer and a State as Kenn's blog linked by Brian.
5. whenever the timer expires (a minute in your case), grab the first 300
items from your state and make service calls.

Hope it helps,
Cristian

On Fri, Sep 24, 2021 at 12:22 PM Brian Hulette <bh...@google.com> wrote:

> Kenn's Timely Processing blog [1] post discusses this use case.
>
> Brian
>
> [1] https://beam.apache.org/blog/timely-processing/
>
> On Fri, Sep 24, 2021 at 4:12 AM Evan Galpin <ev...@gmail.com> wrote:
>
>> This has been mentioned a few times and seems to me to be a fairly common
>> requirement.
>>
>> I think that a rate limit could be accomplished through stateful
>> processing, using a combination of bagState and Timers.
>> GroupIntoBatches.java would be a good example.
>>
>> I wonder if this would be a good built-in transform given the number of
>> times it’s come up 🙂
>>
>> Thanks,
>> Evan
>>
>> On Fri, Sep 24, 2021 at 05:29 Sofia’s World <mm...@gmail.com> wrote:
>>
>>> Hello
>>>  i was wondering if it's somehow possible to limit the concurrency of a
>>> beam Step?
>>> i have a workflow which involves a Webclient that uses an API for which
>>> my account has a max of 300/requests per minute...
>>> Alternatively, will i have to go through a combine and custom ParDo ?
>>>
>>> Has anyone came across this Usecase>?
>>>
>>> kind regards
>>> Marco
>>>
>>

Re: Limit the concurrency of a Beam Step (or all the steps)

Posted by Brian Hulette <bh...@google.com>.
Kenn's Timely Processing blog [1] post discusses this use case.

Brian

[1] https://beam.apache.org/blog/timely-processing/

On Fri, Sep 24, 2021 at 4:12 AM Evan Galpin <ev...@gmail.com> wrote:

> This has been mentioned a few times and seems to me to be a fairly common
> requirement.
>
> I think that a rate limit could be accomplished through stateful
> processing, using a combination of bagState and Timers.
> GroupIntoBatches.java would be a good example.
>
> I wonder if this would be a good built-in transform given the number of
> times it’s come up 🙂
>
> Thanks,
> Evan
>
> On Fri, Sep 24, 2021 at 05:29 Sofia’s World <mm...@gmail.com> wrote:
>
>> Hello
>>  i was wondering if it's somehow possible to limit the concurrency of a
>> beam Step?
>> i have a workflow which involves a Webclient that uses an API for which
>> my account has a max of 300/requests per minute...
>> Alternatively, will i have to go through a combine and custom ParDo ?
>>
>> Has anyone came across this Usecase>?
>>
>> kind regards
>> Marco
>>
>

Re: Limit the concurrency of a Beam Step (or all the steps)

Posted by Evan Galpin <ev...@gmail.com>.
This has been mentioned a few times and seems to me to be a fairly common
requirement.

I think that a rate limit could be accomplished through stateful
processing, using a combination of bagState and Timers.
GroupIntoBatches.java would be a good example.

I wonder if this would be a good built-in transform given the number of
times it’s come up 🙂

Thanks,
Evan

On Fri, Sep 24, 2021 at 05:29 Sofia’s World <mm...@gmail.com> wrote:

> Hello
>  i was wondering if it's somehow possible to limit the concurrency of a
> beam Step?
> i have a workflow which involves a Webclient that uses an API for which
> my account has a max of 300/requests per minute...
> Alternatively, will i have to go through a combine and custom ParDo ?
>
> Has anyone came across this Usecase>?
>
> kind regards
> Marco
>