Posted to github@beam.apache.org by "ahmedabu98 (via GitHub)" <gi...@apache.org> on 2023/04/17 19:34:55 UTC

[GitHub] [beam] ahmedabu98 commented on a diff in pull request #26306: update the BQ limitations

ahmedabu98 commented on code in PR #26306:
URL: https://github.com/apache/beam/pull/26306#discussion_r1169192670


##########
website/www/site/content/en/documentation/io/built-in/google-bigquery.md:
##########
@@ -981,6 +981,11 @@ BigQueryIO currently has the following limitations.
    multiple BigQuery tables. The Beam SDK for Java does not have this limitation
    as it partitions your dataset for you.
 
+3. When you [load data](https://cloud.google.com/bigquery/docs/loading-data) into BigQuery, [these limits](https://cloud.google.com/bigquery/quotas#load_jobs) are applied.
+Particularly, a load job fails if it executes for longer than six hours.
+This might be caused if your BigQuery job uses a shared pool of slots.
+It is highly recommended to use [BigQuery reservations](https://cloud.google.com/bigquery/docs/reservations-intro#benefits_of_reservations).

Review Comment:
   Maybe we should give some more context: by default, we load data using a free shared pool of slots. On this shared pool, BigQuery makes no guarantees about available capacity, so the load may wait in a queued state until a slot is freed up. Per the limits, the load fails after 6 hours.

   Then we can recommend using BQ reservations if reliability and predictability are important for the use case (i.e., production jobs).


