Posted to user@beam.apache.org by Daniel Mateus Pires <dm...@gmail.com> on 2022/11/15 10:52:50 UTC

BigQuery Storage API read hit Quota

Hey, we have a big job that fails after it hits the BigQuery Storage API Read
quota limits. We've increased the quota once already, but it keeps happening.
When a job fails we see a spike of HTTP 429 errors on the API Monitoring page
for "ReadRows", and eventually Dataflow reports "The worker lost contact with
the service...".
Shouldn't this PR [1] on the Beam project have prevented that issue from
happening?

Is there a way for us to throttle our requests on the Beam side and keep that
from happening?

[1] https://github.com/apache/beam/pull/15445
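
(For context on the failure mode: a 429 from the Storage Read API means the
quota was exceeded and the call should be retried after waiting. Outside of
whatever Beam does internally, the standard client-side mitigation is
exponential backoff with jitter around any call that can be throttled. A
minimal sketch follows — `call_with_backoff` and its parameters are purely
illustrative, not a Beam, SCIO, or Google client API.)

```python
import random
import time


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      is_throttled=lambda e: "429" in str(e), sleep=time.sleep):
    """Call fn(), retrying with exponential backoff + full jitter on throttling.

    Illustrative sketch: detects a 429 by inspecting the exception message,
    which a real client would replace with a proper status-code check.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as e:
            # Re-raise non-throttling errors, and give up on the last attempt.
            if not is_throttled(e) or attempt == max_attempts - 1:
                raise
            # Full jitter: sleep a random amount up to the capped exponential delay.
            delay = random.uniform(0.0, min(max_delay, base_delay * 2 ** attempt))
            sleep(delay)
```

The jitter matters here: when many Dataflow workers are throttled at once,
deterministic backoff would have them all retry in lockstep and spike the
quota again.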

Re: BigQuery Storage API read hit Quota

Posted by Daniel Mateus Pires <dm...@gmail.com>.
Hi Kerry,

Thanks for the follow-up. Michel Davit from the SCIO Slack workspace was
helpful in pointing out that this PR is actually not included in the code I
was running, since it is only available in v2.43.0-RC2 right now.

I might try to run the release candidate if I get to it, but my code runs
through the SCIO library, so it might not be trivial!

Once there is a release available I'll try it out and update this email
thread.


Re: BigQuery Storage API read hit Quota

Posted by Kerry Donny-Clark via user <us...@beam.apache.org>.
Damon (damondouglas) is working on a global throttling solution for Beam.
There should be a design doc shared here in a week or two.
However, you're right that the PR referenced should have addressed this. It
may be due for an update. Do you have time to take a look at the code and
see if it needs something added?
Kerry
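
(The global throttling Kerry mentions is commonly built on a token bucket:
each worker must take a permit before issuing a ReadRows call, so the
aggregate request rate stays under the quota. A minimal single-process
sketch of the data structure — illustrative only, and not the design Damon
is working on:)

```python
import time


class TokenBucket:
    """Token-bucket rate limiter: sustains `rate` permits/second,
    allowing bursts of up to `capacity` permits.

    Illustrative sketch; `clock` is injectable to make it testable.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start full: an initial burst is allowed
        self.clock = clock
        self.last = clock()

    def try_acquire(self, n=1):
        """Take n permits if available; return False (don't block) otherwise."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

A per-worker bucket like this only bounds each worker's rate; making it
*global* across autoscaled workers (so the sum stays under the project-wide
quota) is the hard part the design doc would need to address.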
